AI-driven Kernel Generator (AIKG)
📋 Table of Contents
- 1. Project Overview
- 2. Changelog
- 3. Installation & Deployment Guide
- 4. Configuration
- 5. Tutorial Examples
- 6. Design Documentation
📘 1. Project Overview
AIKG is an AI-driven kernel generator that leverages the code-generation capabilities of Large Language Models (LLMs). Through LLM-based planning and control of multiple collaborating agents, AIKG performs multi-backend, multi-type AI kernel generation and automatic optimization. It also provides a rich set of submodules for kernel agents, enabling users to build custom agent tasks.
🗓️ 2. Changelog
- 2025-10-14: Added TileLang_CUDA DSL support. See Benchmark Results for KernelBench Level1 success rates.
- 2025-09-26: Added CUDA C and C++ DSL support. See Benchmark Results for KernelBench Level1 success rates.
- 2025-09-14: Updated the KernelBench Level1 kernel generation success rate; see Benchmark Results.
- 2025-08-12: Introduced Doc-Driven Integration; by following a unified documentation specification, you can quickly and flexibly integrate new DSLs/frontends/backends (see Doc-Driven Integration Guide).
- 2025-06-27: Initial AIKG release with code generation support for Triton and SWFT backends.
🛠️ 3. Installation & Deployment Guide
```bash
# 1. Environment setup
# 1.1 Create a conda environment (optional; Python 3.9/3.10/3.11 recommended)
conda create -n aikg python=3.11
conda activate aikg

# 1.2 Or create a virtual environment (optional)
python -m venv .venv
source .venv/bin/activate

# 2. Install dependencies via pip
pip install -r requirements.txt

# 3. Wheel installation / environment setup
# 3.1 Install from the built wheel
bash build.sh
pip install output/ai_kernel_generator-*-py3-none-any.whl

# 3.2 Or set up environment variables instead
cd aikg
source env.sh
```
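A quick import check confirms the freshly installed wheel is visible to the interpreter. This is a minimal sketch that assumes the package's import name matches the wheel name (ai_kernel_generator); adjust it if the actual module name differs.

```python
# Minimal post-install sanity check; assumes the import name is
# ai_kernel_generator (matching the wheel name) - adjust if it differs.
import importlib

mod = importlib.import_module("ai_kernel_generator")
print("ai_kernel_generator imported from:", mod.__file__)
```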
⚙️ 4. Configuration
Configuration Quick Guide
Step 1: Basic Environment Configuration
API and Model Configuration
AIKG uses environment variables to set the API keys for various Large Language Model (LLM) services. Please configure the appropriate environment variables based on the service you are using:
```bash
# vLLM (https://github.com/vllm-project/vllm)
export AIKG_VLLM_API_BASE=http://localhost:8000/v1

# Other API interfaces; for the detailed supported list, see docs/API.md
export AIKG_XXX_API_KEY=xxx

# Ollama (https://ollama.com/)
export AIKG_OLLAMA_API_BASE=http://localhost:11434
```
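Before launching a task, it can help to confirm that the expected variables are actually set in the current environment. Below is a minimal check that only relies on the naming patterns shown above (AIKG_*_API_BASE and AIKG_*_API_KEY); it is not part of AIKG itself.

```python
# List the AIKG_* endpoint/key variables visible to this process.
# Only relies on the variable-name patterns shown above.
import os

aikg_vars = {k: v for k, v in os.environ.items() if k.startswith("AIKG_")}
if not aikg_vars:
    print("No AIKG_* variables set; export an API base or key first.")
for name, value in sorted(aikg_vars.items()):
    shown = value[:4] + "..." if "API_KEY" in name else value  # avoid printing full keys
    print(f"{name} = {shown}")
```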
Additional configuration options:
- Task Orchestration Plan Configuration: declares a task's complete runtime scheme (including `agent_model_config`, `workflow_config_path`, `docs_dir`, etc.). Common plan files: `default_triton_cuda_config.yaml`, `default_triton_ascend_config.yaml`, `vllm_triton_cuda_coderonly_config.yaml`, `vllm_triton_ascend_coderonly_config.yaml`. See Task Orchestration Plan Configuration and the loading sketch after this list.
- Model Configuration: `llm_config.yaml` contains preset configurations for various LLM providers (DeepSeek, Qwen, Moonshot, etc.). The `agent_model_config` entry in the plan references presets from this file.
- Workflow Definition: specify the workflow YAML via `workflow_config_path` to define agent execution order and constraints (e.g., `default_workflow.yaml`, `coder_only_workflow.yaml`). See the Workflow System Design Document.
- Doc-Driven Integration: provide reference docs for agents via the plan's `docs_dir`. See the Doc-Driven Integration Guide.
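As a rough illustration of how a plan file ties these pieces together, the sketch below reads one of the plan YAMLs named above and prints the keys it is expected to declare. It assumes the plan files are plain YAML readable with PyYAML; the path is a placeholder, so point it at the actual location of the plan file in your checkout.

```python
# Illustrative only: inspect a task-orchestration plan with PyYAML.
# The path is a placeholder - use the real location of the plan YAML.
import yaml

plan_path = "default_triton_cuda_config.yaml"  # placeholder path
with open(plan_path, "r", encoding="utf-8") as f:
    plan = yaml.safe_load(f)

for key in ("agent_model_config", "workflow_config_path", "docs_dir"):
    print(f"{key}: {plan.get(key, '<not set in this plan>')}")
```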
For detailed configuration instructions, please refer to API Configuration Documentation.
Third-party Dependencies
This project uses git submodules to manage certain third-party dependencies.
After initial cloning or pulling updates, please use the following command to initialize and download aikg-related dependencies:
```bash
# Initialize and pull aikg-related submodules
git submodule update --init "aikg/thirdparty/*"
```
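If later steps complain about missing third-party sources, the submodules were probably never populated. The small check below assumes it is run from the repository root and that the submodules live under aikg/thirdparty, as in the command above.

```python
# Report whether each aikg/thirdparty submodule directory is populated;
# an empty directory usually means the submodule update step was skipped.
from pathlib import Path

third_party = Path("aikg/thirdparty")
for sub in sorted(p for p in third_party.iterdir() if p.is_dir()):
    state = "populated" if any(sub.iterdir()) else "EMPTY (run the command above)"
    print(f"{sub.name}: {state}")
```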
Step 2: Frontend Dependencies Configuration
MindSpore 2.7 Frontend Dependencies (Optional)
Supported Python versions: 3.9, 3.10, 3.11. Supported system architectures: aarch64, x86_64. Prefer the official installation guide to choose your environment and installation method: MindSpore 2.7 Installation Guide.
Step 3: Backend Dependencies Configuration
Choose the appropriate backend based on your hardware platform:
| Platform | Backend | Reference Link |
|---|---|---|
| Huawei Atlas A2 Training Series | Triton | https://gitee.com/ascend/triton-ascend |
| NVIDIA GPU | Triton | https://github.com/triton-lang/triton |
| Huawei Atlas Inference Series | SWFT | https://gitee.com/mindspore/akg/tree/br_aikg/swft |
Step 4: Optional Tools Configuration
Similarity Detection Dependencies
Text similarity detection uses the text2vec-large-chinese model. If the model cannot be loaded automatically, download it manually to the thirdparty directory. After downloading, add its local path to the corresponding YAML configuration in the database. For detailed configuration instructions, please refer to the DataBase documentation.
```bash
bash download.sh --with_local_model
```
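To confirm a manually downloaded copy is usable, it can be loaded directly from its local directory with Hugging Face transformers. The path below is an assumption; point it at wherever you placed the model under thirdparty.

```python
# Verify that a locally downloaded text2vec-large-chinese model loads.
# local_path is an assumption - match it to your thirdparty location.
from transformers import AutoModel, AutoTokenizer

local_path = "aikg/thirdparty/text2vec-large-chinese"
tokenizer = AutoTokenizer.from_pretrained(local_path)
model = AutoModel.from_pretrained(local_path)
print("Loaded:", type(model).__name__)
```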
💡 Configuration Tips:
- For detailed API configuration, please refer to API Documentation
- For database configuration, please refer to DataBase Documentation
- For more configuration options, please refer to the dedicated documentation for each component
▶️ 5. Tutorial Examples
Below are common examples in the examples/ directory:
| Example | Description |
|---|---|
| `run_mindspore_triton_single.py` | Single operator example (MindSpore + Triton, Ascend 910B4). |
| `run_mindspore_triton_parallel.py` | Parallel multi-operator example (MindSpore + Triton, Ascend 910B4). |
| `run_numpy_swft_relu.py` | SWFT ReLU example (Ascend 310P3). |
| `run_numpy_swft_swiglu.py` | SWFT SwiGLU example (Ascend 310P3). |
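Each example is a standalone script. Running one, assuming the matching backend and hardware from the table are available, is simply a matter of invoking it with Python from the repository root; the snippet below launches the SWFT ReLU example as a subprocess.

```python
# Launch a shipped example as a subprocess; equivalent to running
# `python examples/run_numpy_swft_relu.py` from the repository root.
# Requires the SWFT backend and an Ascend 310P3 device (see the table above).
import subprocess
import sys

subprocess.run([sys.executable, "examples/run_numpy_swft_relu.py"], check=True)
```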
For more getting started steps and parameter notes, please refer to the Tutorial.
📐 6. Design Documentation
We recommend reading the Task Orchestration Plan Configuration first for the overall task plan and entry points; workflow details are covered in the Workflow document, and documentation specifications in the Doc-Driven Integration guide.
Core Framework
- Task - Task management module
- Trace - Execution tracking module
- TaskPool - Task pool management
- DevicePool - Device pool management
- DataBase - Database module
Core Components
Backend Support
- SWFT Backend - Huawei Atlas inference series backend
- Triton Backend - Triton compute backend
