# LLaMA Factory

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Used by Amazon, NVIDIA, Aliyun, etc.
<div align="center" markdown="1">

**Supporters ❤️**

| <div style="text-align: center;"><a href="https://warp.dev/llama-factory"><img alt="Warp sponsorship" width="400" src="assets/sponsors/warp.jpg"></a><br><a href="https://warp.dev/llama-factory" style="font-size:larger;">Warp, the agentic terminal for developers</a><br><a href="https://warp.dev/llama-factory">Available for macOS, Linux, & Windows</a></div> | <a href="https://serpapi.com"><img alt="SerpAPI sponsorship" width="250" src="assets/sponsors/serpapi.svg"></a> |
| ---- | ---- |

</div>
Easily fine-tune 100+ large language models with zero-code CLI and Web UI
👋 Join our WeChat group, NPU user group, Lab4AI group, or LLaMA Factory Online user group.
[ English | 中文 ]
Fine-tuning a large language model can be as easy as...
https://github.com/user-attachments/assets/3991a3a8-4276-4d30-9cab-4cb0c4b9b99e
Start local training:
- Please refer to the Getting Started section below.
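A local training run is driven by a YAML config passed to the CLI. Below is a minimal LoRA SFT sketch; the field names follow the example configs shipped in the repository's `examples/` directory, while the model, dataset, and output paths here are placeholders you should replace with your own:

```yaml
### model
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct

### method
stage: sft
do_train: true
finetuning_type: lora
lora_target: all

### dataset
dataset: identity
template: llama3
cutoff_len: 1024

### train
output_dir: saves/llama3-8b/lora/sft
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0
```

Launch it with `llamafactory-cli train your_config.yaml`, or run `llamafactory-cli webui` to configure the same options through the zero-code Web UI.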
Start cloud training:
- Colab (free): https://colab.research.google.com/drive/1eRTPn37ltBbYsISy9Aw2NuI2Aq5CQrD9?usp=sharing
- PAI-DSW (free trial): https://gallery.pai-ml.com/#/preview/deepLearning/nlp/llama_factory
- LLaMA Factory Online: https://www.llamafactory.com.cn/?utm_source=LLaMA-Factory
- Alaya NeW (cloud GPU deal): https://docs.alayanew.com/docs/documents/useGuide/LLaMAFactory/mutiple/?utm_source=LLaMA-Factory
Read technical notes:
- Documentation (WIP): https://llamafactory.readthedocs.io/en/latest/
- Documentation (AMD GPU): https://rocm.docs.amd.com/projects/ai-developer-hub/en/latest/notebooks/fine_tune/llama_factory_llama3.html
- Official Blog: https://blog.llamafactory.net/en/
- Official Course: https://www.lab4ai.cn/course/detail?id=7c13e60f6137474eb40f6fd3983c0f46&utm_source=LLaMA-Factory
> [!NOTE]
> Except for the links above, all other websites are unauthorized third-party websites. Please use them with caution.
## Table of Contents
- Features
- Blogs
- Changelog
- Supported Models
- Supported Training Approaches
- Provided Datasets
- Requirement
- Getting Started
- Projects using LLaMA Factory
- License
- Citation
- Acknowledgement
## Features
- Various models: LLaMA, LLaVA, Mistral, Mixtral-MoE, Qwen3, Qwen3-VL, DeepSeek, Gemma, GLM, Phi, etc.
- Integrated methods: (Continuous) pre-training, (multimodal) supervised fine-tuning, reward modeling, PPO, DPO, KTO, ORPO, etc.
- Scalable resources: 16-bit full-tuning, freeze-tuning, LoRA and 2/3/4/5/6/8-bit QLoRA via AQLM/AWQ/GPTQ/LLM.int8/HQQ/EETQ.
- Advanced algorithms: GaLore, BAdam, APOLLO, Adam-mini, Muon, OFT, DoRA, LongLoRA, LLaMA Pro, Mixture-of-Depths, LoRA+, LoftQ and PiSSA.
- Practical tricks: FlashAttention-2, Unsloth, Liger Kernel, KTransformers, RoPE scaling, NEFTune and rsLoRA.
- Wide tasks: Multi-turn dialogue, tool use, image understanding, visual grounding, video recognition, audio understanding, etc.
- Experiment monitors: LlamaBoard, TensorBoard, Wandb, MLflow, SwanLab, etc.
- Faster inference: OpenAI-style API, Gradio UI and CLI with vLLM worker or SGLang worker.
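Because a fine-tuned model can be served behind an OpenAI-style API (the last bullet above), any OpenAI-compatible client can talk to it. A minimal sketch using only the Python standard library; the host, port, and model name are assumptions, and the server would be started separately (e.g. with `llamafactory-cli api`):

```python
import json
from urllib import request

# Chat-completion payload in the OpenAI style. The "model" value is
# whatever name your server exposes (hypothetical here).
payload = {
    "model": "llama3",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize LoRA in one sentence."},
    ],
    "temperature": 0.7,
}

def chat(host: str = "http://localhost:8000") -> dict:
    """POST the payload to an OpenAI-style /v1/chat/completions endpoint."""
    req = request.Request(
        f"{host}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())

# chat()  # uncomment once an API server is running locally
```

The response follows the standard chat-completion schema, so the assistant reply is typically under `response["choices"][0]["message"]["content"]`.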
### Day-N Support for Fine-Tuning Cutting-Edge Models
| Support Date | Model Name |
| ------------ | ---------- |
| Day 0 | Qwen3 / Qwen2.5-VL / Gemma 3 / GLM-4.1V / InternLM 3 / MiniCPM-o-2.6 |
| Day 1 | Llama 3 / GLM-4 / Mistral Small / PaliGemma2 / Llama 4 |
## Blogs
> [!TIP]
> Now we have a dedicated blog for LLaMA Factory!
>
> Website: https://blog.llamafactory.net/en/
- 💡 KTransformers Fine-Tuning × LLaMA Factory: Fine-tuning 1000B-parameter models with two RTX 4090 GPUs + CPU (English)
- 💡 Easy Dataset × LLaMA Factory: Enabling LLMs to Efficiently Learn Domain Knowledge (English)
- Fine-tune a mental health LLM using LLaMA-Factory (Chinese)
- Fine-tune GPT-OSS for Role-Playing using LLaMA-Factory (Chinese)
- A One-Stop Code-Free Model Reinforcement Learning and Deployment Platform based on LLaMA-Factory and EasyR1 (Chinese)
- How Apoidea Group enhances visual information extraction from banking documents with multimodal models using LLaMA-Factory on Amazon SageMaker HyperPod (English)
- Fine-tune Llama3.1-70B for Medical Diagnosis using LLaMA-Factory (Chinese)
- Fine-tune Qwen2.5-VL for Autonomous Driving using LLaMA-Factory (Chinese)
- [LLaMA Factory: Fine-tuning the DeepSeek-R1-Distill-Qwen-7B Model for News Classifier](https://gallery.pai-ml.com/#/preview/deepL
