# FlagTree
FlagTree is a unified compiler for custom deep learning operations that supports multiple AI chip backends. It is forked from triton-lang/triton.
<img width="2182" height="602" alt="github+banner-20260130" src=".github/assets/banner-20260130.png" />

[中文版 | English]
<div align="right">
  <a href="https://www.linkedin.com/company/flagos-community" target="_blank"> <img src="./docs/assets/Linkedin.png" alt="LinkedIn" width="32" height="32" /> </a>
  <a href="https://www.youtube.com/@FlagOS_Official" target="_blank"> <img src="./docs/assets/youtube.png" alt="YouTube" width="32" height="32" /> </a>
  <a href="https://x.com/FlagOS_Official" target="_blank"> <img src="./docs/assets/x.png" alt="X" width="32" height="32" /> </a>
  <a href="https://www.facebook.com/flagosglobalcommunity" target="_blank"> <img src="./docs/assets/Facebook.png" alt="Facebook" width="32" height="32" /> </a>
  <a href="https://discord.com/invite/ubqGuFMTNE" target="_blank"> <img src="./docs/assets/discord.png" alt="Discord" width="32" height="32" /> </a>
</div>

FlagTree is part of FlagOS, a fully open-source system software stack designed to unify the model–system–chip layers and foster an open and collaborative ecosystem. It enables a "develop once, run anywhere" workflow across diverse AI accelerators, unlocking hardware performance, eliminating fragmentation among AI chipset-specific software stacks, and substantially lowering the cost of porting and maintaining AI workloads.
FlagTree is an open-source project building a unified compiler for multiple AI chips, dedicated to developing a diverse ecosystem of AI chip compilers and related tooling platforms, thereby strengthening the Triton ecosystem both upstream and downstream. Currently in its initial phase, the project aims to stay compatible with existing adaptation solutions while unifying the codebase to rapidly deliver single-repository, multi-backend support. For upstream model users, it provides unified compilation across multiple backends; for downstream chip manufacturers, it offers reference examples of Triton ecosystem integration.
Each backend is based on a different version of Triton and therefore resides in its own protected branch. All protected branches have equal status, and CI/CD runners are provisioned for every backend listed in the table below.
|Branch|Vendor|Backend|Triton<br>version|Build<br>from source|Source-free<br>installation|
|:-----|:-----|:------|:----------------|:-------------------|:--------------------------|
|main|NVIDIA<br>AMD<br>x86_64 cpu<br>ILUVATAR(天数智芯)<br>Moore Threads(摩尔线程)<br>KLX<br>MetaX(沐曦股份)<br>HYGON(海光信息)|nvidia<br>amd<br>triton-shared<br>iluvatar<br>mthreads<br>xpu<br>metax<br>hcu|3.1<br>3.1<br>3.1<br>3.1<br>3.1<br>3.0<br>3.1<br>3.0|nvidia<br>amd<br>-<br>iluvatar<br>mthreads<br>xpu<br>-<br>hcu|Installation|
|triton_v3.2.x|NVIDIA<br>AMD<br>Huawei Ascend(华为昇腾)<br>Cambricon(寒武纪)|nvidia<br>amd<br>ascend<br>cambricon|3.2|nvidia<br>amd<br>ascend<br>-|Installation|
|triton_v3.3.x|NVIDIA<br>AMD<br>x86_64 cpu<br>ARM China(安谋科技)<br>Tsingmicro(清微智能)<br>Enflame(燧原)|nvidia<br>amd<br>triton-shared<br>aipu<br>tsingmicro<br>enflame|3.3|nvidia<br>amd<br>-<br>aipu<br>tsingmicro<br>enflame|Installation|
|triton_v3.4.x|NVIDIA<br>AMD<br>Sunrise(曦望芯科)|nvidia<br>amd<br>sunrise|3.4|nvidia<br>amd<br>sunrise|Installation|
|triton_v3.5.x|NVIDIA<br>AMD<br>Enflame(燧原)|nvidia<br>amd<br>enflame|3.5|nvidia<br>amd<br>enflame|Installation|
|triton_v3.6.x|NVIDIA<br>AMD|nvidia<br>amd|3.6|nvidia<br>amd|Installation|
FlagTree's extension components are currently available on the following backends:
|Branch|Backend|Triton version|Extension components|
|:-----|:------|:-------------|:-------------------|
|triton_v3.6.x|nvidia|3.6|TLE-Lite<br>TLE-Struct GPU<br>TLE-Raw<br>HINTS|
|triton_v3.2.x|ascend|3.2|TLE-Struct DSA<br>FLIR<br>HINTS|
|triton_v3.3.x|tsingmicro|3.3|TLE-Lite<br>TLE-Struct DSA<br>FLIR|
|triton_v3.3.x|aipu|3.3|FLIR<br>HINTS|
## TLE (Triton Language Extensions)
Triton provides strong productivity for kernel development, but heterogeneous AI chips and deeper performance tuning scenarios need more explicit control over distributed execution, memory access patterns, and hardware-specific primitives. TLE extends Triton in a layered way to bridge this gap while keeping compatibility with existing Triton workflows.
Key advantages of TLE:
- Progressive abstraction from portable usage to hardware-oriented tuning (Lite/Struct/Raw).
- Better coverage for multi-device, architecture-specific, and backend lowering scenarios.
- Lower migration cost from existing Triton kernels while preserving optimization headroom.
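TLE's own APIs vary by layer and backend; for context on the programming model it extends, the sketch below is a plain-Python emulation of the blocked, masked execution pattern used by Triton kernels (one "program" per tile, a mask guarding the tail). The `vector_add` function, `block_size` parameter, and loop structure are illustrative assumptions for this README, not TLE or Triton API.

```python
# Plain-Python emulation of Triton's blocked, masked execution model.
# Each "program" (cf. tl.program_id) handles one BLOCK_SIZE tile of the
# input; the mask disables out-of-bounds lanes in the final partial tile.

def vector_add(x, y, block_size=4):
    n = len(x)
    out = [0.0] * n
    # Grid size: ceil(n / block_size), analogous to triton.cdiv.
    num_programs = (n + block_size - 1) // block_size
    for pid in range(num_programs):
        # Per-program element offsets, analogous to tl.arange + pid * BLOCK_SIZE.
        offsets = [pid * block_size + i for i in range(block_size)]
        # Boundary mask, analogous to the mask argument of tl.load / tl.store.
        mask = [o < n for o in offsets]
        for o, valid in zip(offsets, mask):
            if valid:
                out[o] = x[o] + y[o]
    return out
```

In real Triton each tile is processed as a vectorized block on the accelerator rather than a scalar loop; TLE's layers then expose progressively more control over how such tiles map onto distributed execution and memory.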
For detailed design, APIs, and examples, please refer to the TLE Wiki and TLE-Raw Wiki.
## Latest News
- 2026/03/13 Added enflame GCU400 backend integration (based on Triton 3.5), and added CI/CD.
- 2026/01/23 Added sunrise backend integration (based on Triton 3.4), and added CI/CD.
- 2026/01/08 Added wiki pages for the new features HINTS, TLE, and TLE-Raw.
- 2025/12/24 Added support for pulling and installing prebuilt wheels.
- 2025/12/08 Added enflame GCU300 backend integration (based on Triton 3.3), and added CI/CD.