LLamaTuner
Easy and efficient fine-tuning of LLMs (supports Llama, Llama 2, Llama 3, Qwen, Baichuan, GLM, Falcon). Efficient quantized training and deployment of large models.
👋🤗🤗👋 Join our WeChat.
Introduction
LLamaTuner is an efficient, flexible and full-featured toolkit for fine-tuning LLMs (Llama 3, Phi-3, Qwen, Mistral, ...).
Efficient
- Support LLM, VLM pre-training / fine-tuning on almost all GPUs. LLamaTuner is capable of fine-tuning 7B LLM on a single 8GB GPU, as well as multi-node fine-tuning of models exceeding 70B.
- Automatically dispatch high-performance operators such as FlashAttention and Triton kernels to increase training throughput.
- Compatible with DeepSpeed 🚀, easily utilizing a variety of ZeRO optimization techniques.
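The claim that a 7B model fits on a single 8 GB GPU follows from weight quantization. A back-of-envelope sketch (the helper name `est_weight_memory_gb` is made up here for illustration, not part of LLamaTuner):

```python
def est_weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """GB needed just to hold the model weights at a given precision."""
    return n_params * bits_per_param / 8 / 1e9

# A 7B model in fp16 needs ~14 GB for weights alone -- already past 8 GB.
fp16_gb = est_weight_memory_gb(7e9, 16)  # 14.0
# Quantized to 4 bits (as in QLoRA), the same weights fit in ~3.5 GB,
# leaving headroom for LoRA adapters, activations, and optimizer state.
int4_gb = est_weight_memory_gb(7e9, 4)   # 3.5
```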
Flexible
- Support various LLMs (Llama 3, Mixtral, Llama 2, ChatGLM, Qwen, Baichuan, ...).
- Support VLM (LLaVA).
- Well-designed data pipeline, accommodating datasets in any format, including but not limited to open-source and custom formats.
- Support various training algorithms (QLoRA, LoRA, full-parameter fine-tuning), allowing users to choose the most suitable solution for their requirements.
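The LoRA-family algorithms listed above keep the pretrained weight matrix frozen and train only a low-rank update. A minimal pure-Python sketch of the forward pass, for intuition only (not the toolkit's implementation):

```python
def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def lora_forward(W, A, B, x, alpha=16, r=2):
    """y = W x + (alpha / r) * B (A x).

    W is the frozen pretrained weight; only the low-rank factors
    A (r x d) and B (d x r) are trained, so the number of trainable
    parameters scales with the rank r instead of with d * d.
    """
    base = matvec(W, x)                 # frozen base projection
    delta = matvec(B, matvec(A, x))     # low-rank trained update
    scale = alpha / r
    return [b + scale * d for b, d in zip(base, delta)]
```

QLoRA uses the same update but stores W in 4-bit precision.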
Full-featured
- Support continuous pre-training, instruction fine-tuning, and agent fine-tuning.
- Support chatting with large models with pre-defined templates.
Supported Models
| Model | Model size | Default module | Template |
| ----- | ---------- | -------------- | -------- |
| Baichuan | 7B/13B | W_pack | baichuan |
| Baichuan2 | 7B/13B | W_pack | baichuan2 |
| BLOOM | 560M/1.1B/1.7B/3B/7.1B/176B | query_key_value | - |
| BLOOMZ | 560M/1.1B/1.7B/3B/7.1B/176B | query_key_value | - |
| ChatGLM3 | 6B | query_key_value | chatglm3 |
| Command-R | 35B/104B | q_proj,v_proj | cohere |
| DeepSeek (MoE) | 7B/16B/67B/236B | q_proj,v_proj | deepseek |
| Falcon | 7B/11B/40B/180B | query_key_value | falcon |
| Gemma/CodeGemma | 2B/7B | q_proj,v_proj | gemma |
| InternLM2 | 7B/20B | wqkv | intern2 |
| LLaMA | 7B/13B/33B/65B | q_proj,v_proj | - |
| LLaMA-2 | 7B/13B/70B | q_proj,v_proj | llama2 |
| LLaMA-3 | 8B/70B | q_proj,v_proj | llama3 |
| LLaVA-1.5 | 7B/13B | q_proj,v_proj | vicuna |
| Mistral/Mixtral | 7B/8x7B/8x22B | q_proj,v_proj | mistral |
| OLMo | 1B/7B | q_proj,v_proj | - |
| PaliGemma | 3B | q_proj,v_proj | gemma |
| Phi-1.5/2 | 1.3B/2.7B | q_proj,v_proj | - |
| Phi-3 | 3.8B | qkv_proj | phi |
| Qwen | 1.8B/7B/14B/72B | c_attn | qwen |
| Qwen1.5 (Code/MoE) | 0.5B/1.8B/4B/7B/14B/32B/72B/110B | q_proj,v_proj | qwen |
| StarCoder2 | 3B/7B/15B | q_proj,v_proj | - |
| XVERSE | 7B/13B/65B | q_proj,v_proj | xverse |
| Yi (1/1.5) | 6B/9B/34B | q_proj,v_proj | yi |
| Yi-VL | 6B/34B | q_proj,v_proj | yi_vl |
| Yuan | 2B/51B/102B | q_proj,v_proj | yuan |
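The "Default module" column names the attention sub-layers that LoRA adapters attach to by default for each model family. A hypothetical lookup helper (the dict and function below are made up for this sketch, not LLamaTuner's API) shows how such a default might be resolved:

```python
# Subset of the table above: model family -> attention modules that
# LoRA adapters target by default.
DEFAULT_LORA_MODULES = {
    "Baichuan": ["W_pack"],
    "ChatGLM3": ["query_key_value"],
    "Falcon": ["query_key_value"],
    "InternLM2": ["wqkv"],
    "Phi-3": ["qkv_proj"],
    "Qwen": ["c_attn"],
}

def lora_target_modules(model_family: str) -> list[str]:
    """Return default LoRA target modules for a model family.

    Most decoder-only models (LLaMA, Mistral, Yi, ...) use separate
    q/k/v projections, so q_proj/v_proj is the usual fallback.
    """
    return DEFAULT_LORA_MODULES.get(model_family, ["q_proj", "v_proj"])
```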
Supported Training Approaches
| Approach | Full-tuning | Freeze-tuning | LoRA | QLoRA |
| -------- | ----------- | ------------- | ---- | ----- |
| Pre-Training | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| Supervised Fine-Tuning | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| Reward Modeling | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| PPO Training | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| DPO Training | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| KTO Training | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| ORPO Training | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |
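For intuition on the preference-based rows, the core of the DPO objective is a logistic loss on the margin between policy and reference log-ratios. A standalone sketch (illustrative only, not LLamaTuner's trainer code):

```python
import math

def dpo_loss(pol_chosen, pol_rejected, ref_chosen, ref_rejected, beta=0.1):
    """-log sigmoid(beta * ((pol_c - ref_c) - (pol_r - ref_r))).

    Inputs are summed log-probabilities of the chosen and rejected
    responses under the policy being trained and a frozen reference
    model; beta controls how strongly the policy may deviate.
    """
    margin = beta * ((pol_chosen - ref_chosen) - (pol_rejected - ref_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# The loss decreases as the policy prefers the chosen response more
# strongly than the reference model does.
```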
Supported Datasets
As of now, we support the following datasets, most of which are available in the Hugging Face Datasets library.
<details><summary>Supervised fine-tuning datasets</summary>

- Stanford Alpaca
- Stanford Alpaca (Chinese)
- Hello-SimpleAI/HC3
- BELLE 2M (zh)
- BELLE 1M (zh)
- BELLE 0.5M (zh)
- BELLE Dialogue 0.4M (zh)
- BELLE School Math 0.25M (zh)
- BELLE Multiturn Chat 0.8M (zh)
- databricks-dolly-15k
- mosaicml/dolly_hhrlhf
- GPT-4 Generated Data
- Alpaca CoT
- UltraChat
- OpenAssistant/oasst1
- ShareGPT_Vicuna_unfiltered
- BIAI/OL-CC
- timdettmers/openassistant-guanaco
- Evol-Instruct
- OpenOrca
- Platypus
- OpenHermes
- DPO mixed (en&zh)
- Orca DPO Pairs (en)
- [HH-RLHF (en)](https://huggingface.co/datasets/A
