MFTCoder
High-accuracy and high-efficiency multi-task fine-tuning framework for Code LLMs. This work has been accepted by KDD 2024.
MFTCoder: High Accuracy and Efficiency Multi-task Fine-Tuning Framework
<p align="center"> <img src="./assets/github-codefuse-logo-update.jpg" width="50%" /> </p> <div align="center"> <p> <a href="https://github.com/codefuse-ai/MFTCoder"> <img alt="stars" src="https://img.shields.io/github/stars/codefuse-ai/MFTCoder?style=social" /> </a> <a href="https://github.com/codefuse-ai/MFTCoder"> <img alt="forks" src="https://img.shields.io/github/forks/codefuse-ai/MFTCoder?style=social" /> </a> <a href="https://github.com/codefuse-ai/MFTCoder/LICENCE"> <img alt="License: MIT" src="https://badgen.net/badge/license/apache2.0/blue" /> </a> <a href="https://github.com/codefuse-ai/MFTCoder/issues"> <img alt="Open Issues" src="https://img.shields.io/github/issues-raw/codefuse-ai/MFTCoder" /> </a> </p> <p> 🤗 <a href="https://huggingface.co/codefuse-ai" target="_blank">HuggingFace </a>• 🤖<a href="https://modelscope.cn/organization/codefuse-ai" target="_blank"> ModelScope </a> </p>[中文] [English]
</div>

Contents
News
🔥🔥🔥 [2024/10/31] We released MFTCoder v0.5, mainly for MFTCoder-accelerate. It now supports preference-alignment methods such as DPO/RPO/ORPO in the new xxpo module, adds full-parameter continued training in the new mpt module along with its offline_tokenization module, and upgrades the selfpaced method to the new Convergence Balancer (CoBa) method for MFT in the original pefts module.
🔥🔥🔥 [2024/10/31] Our paper CoBa: Convergence Balancer for Multitask Finetuning of Large Language Models, which achieves balanced convergence across various tasks, has been accepted by EMNLP 2024.
🔥🔥🔥 [2024/05/20] We released MFTCoder v0.4, mainly for MFTCoder-accelerate. It supports QLoRA + DeepSpeed ZeRO3 and QLoRA + FSDP as options, allowing you to train very large models. It also supports new models such as Qwen2, Qwen2-MoE, Starcoder2, and Gemma.
🔥🔥🔥 [2024/05/20] Our paper MFTCoder: Boosting Code LLMs with Multitask Fine-Tuning has been accepted by KDD2024.
🔥🔥🔥 [2024/05/20] CodeFuse-StarCoder2-15B has been released, achieving a pass@1 (greedy decoding) score of 73.2% on HumanEval.
🔥🔥 [2024/01/30] The model CodeFuse-DeepSeek-33B, fine-tuned with MFTCoder, ranks first on the Hugging Face Big Code Models Leaderboard
🔥🔥 [2024/01/17] We released MFTCoder v0.3.0, mainly for MFTCoder-accelerate. It now supports new models like Mixtral(MoE), DeepSeek-coder, chatglm3. It supports FSDP as an option. It also supports Self-paced Loss as a solution for convergence balance in Multitask Fine-tuning.
🔥🔥 [2024/01/17] CodeFuse-DeepSeek-33B has been released, achieving a pass@1 (greedy decoding) score of 78.7% on HumanEval. It ranks as the top LLM on the Big Code Models Leaderboard in terms of win rate; the official result will be published later.
🔥🔥 [2024/01/17] CodeFuse-Mixtral-8x7B has been released, achieving a pass@1 (greedy decoding) score of 56.1% on HumanEval.
🔥🔥 [2023/11/07] The MFTCoder paper has been released on arXiv; it discloses the technical details of multi-task fine-tuning.
🔥🔥 [2023/10/20] CodeFuse-QWen-14B has been released, achieving a pass@1 (greedy decoding) score of 48.8% on HumanEval, a 16% absolute improvement over the base model Qwen-14B.
🔥🔥 [2023/09/27] CodeFuse-StarCoder-15B has been released, achieving a pass@1 (greedy decoding) score of 54.9% on HumanEval.
🔥🔥 [2023/09/26] We are pleased to announce the release of the 4-bit quantized version of CodeFuse-CodeLlama-34B. Despite quantization, the model still achieves a remarkable 73.8% pass@1 (greedy decoding) on HumanEval.
🔥🔥 [2023/09/07] We released CodeFuse-CodeLlama-34B, which achieves 74.4% Python pass@1 (greedy decoding) and surpasses GPT-4 (2023/03/15) and ChatGPT-3.5 on the HumanEval benchmark.
🔥🔥 [2023/08/26] We released MFTCoder v0.1.0, which supports fine-tuning of Code Llama, Llama, Llama2, StarCoder, ChatGLM2, CodeGeeX2, Qwen, and GPT-NeoX models with LoRA/QLoRA.
HumanEval Performance
| Model                        | HumanEval (Pass@1) | Date    |
|:-----------------------------|:------------------:|:-------:|
| CodeFuse-DeepSeek-33B        | 78.7%              | 2024/01 |
| CodeFuse-CodeLlama-34B       | 74.4%              | 2023/09 |
| CodeFuse-CodeLlama-34B-4bits | 73.8%              | 2023/09 |
| CodeFuse-StarCoder2-15B      | 73.2%              | 2024/05 |
| WizardCoder-Python-34B-V1.0  | 73.2%              | 2023/08 |
| GPT-4 (zero-shot)            | 67.0%              | 2023/03 |
| PanGu-Coder2 15B             | 61.6%              | 2023/08 |
| CodeFuse-Mixtral-8x7B        | 56.1%              | 2024/01 |
| CodeFuse-StarCoder-15B       | 54.9%              | 2023/08 |
| CodeLlama-34b-Python         | 53.7%              | 2023/08 |
| CodeFuse-QWen-14B            | 48.8%              | 2023/10 |
| CodeLlama-34b                | 48.8%              | 2023/08 |
| GPT-3.5 (zero-shot)          | 48.1%              | 2022/11 |
| OctoCoder                    | 46.2%              | 2023/08 |
| StarCoder-15B                | 33.6%              | 2023/05 |
| QWen-14B                     | 32.3%              | 2023/10 |
Articles
MFTCoder: Boosting Code LLMs with Multitask Fine-Tuning (KDD2024)
Introduction
A high-accuracy and high-efficiency multi-task fine-tuning framework for Code LLMs.
MFTCoder is an open-source project of CodeFuse for accurate and efficient multi-task fine-tuning (MFT) of large language models (LLMs), especially Code LLMs (large language models for code tasks). Alongside the MFTCoder framework, we also open-source Code LLM models and code-related datasets.
In MFTCoder, we released two codebases for finetuning Large Language Models:
- MFTCoder-accelerate is a framework built on Accelerate and DeepSpeed/FSDP. All of its tech stacks are open-source and vibrant. We highly recommend trying this framework to make your fine-tuning accurate and efficient.
- MFTCoder-atorch is based on ATorch, a fast distributed training framework for LLMs.
The aim of this project is to foster collaboration and share advancements in large language models, particularly within the domain of code development.
Frameworks

Highlights
:white_check_mark: Multi-task: Train models on multiple tasks while maintaining a balance between them. The models can even generalize to new, previously unseen tasks.
:white_check_mark: Multi-model: It integrates state-of-the-art open-source models such as gpt-neox, llama, llama-2, baichuan, Qwen, chatglm2, and more. (These finetuned models will be released in the near future.)
:white_check_mark: Multi-framework: It provides support for both Accelerate (with DeepSpeed and FSDP) and ATorch.
:white_check_mark: Efficient fine-tuning: It supports LoRA, QLoRA as well as Full-parameters training, enabling fine-tuning of large models with minimal resources. The training speed meets the demands of almost all fine-tuning scenarios.
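The multi-task balancing idea above can be illustrated with a tiny sketch of per-task loss weighting. This is illustrative only, not MFTCoder's actual algorithm: the helper name and fixed weights are hypothetical, whereas MFTCoder's real balancing (e.g. CoBa) adapts task weights dynamically during training.

```python
# Hypothetical sketch: combine per-task losses into one training objective.
# Static weights shown here; CoBa-style methods adjust them on the fly.
def weighted_mtl_loss(task_losses, weights):
    """Return the weight-normalized sum of per-task losses."""
    total_weight = sum(weights)
    return sum(w * l for w, l in zip(weights, task_losses)) / total_weight

# Two tasks with equal weights reduce to a simple average of their losses.
loss = weighted_mtl_loss([1.0, 3.0], [0.5, 0.5])
assert abs(loss - 2.0) < 1e-9
```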
The main components of this project include:
- Support for both SFT (Supervised FineTuning) and MFT (Multi-task FineTuning). The current MFTCoder achieves data balance among multiple tasks, and future releases will achieve a balance between task difficulty and convergence speed during training.
- Support for QLoRA instruction fine-tuning, LoRA fine-tuning as well as Full-parameters fine-tuning.
- Support for most mainstream open-source large models, particularly those relevant to Code-LLMs, such as DeepSeek-coder, Mistral, Mixtral, Chatglm3, Code-LLaMA, Starcoder, Codegeex2, Qwen, GPT-Neox, and more.
- Support for weight merging between the LoRA adaptor and base models, simplifying the inference process.
- Release of 2 high-quality code-related instruction fine-tuning datasets: Evol-instruction-66k and CodeExercise-Python-27k.
- Release of many Code LLMs; please refer to the codefuse-ai organizations on Hugging Face and ModelScope.
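The LoRA weight-merging step listed above can be sketched mathematically: the merged weight is W' = W + (alpha/r) · B · A, after which the adapter can be discarded and inference uses a single dense matrix. A minimal NumPy illustration follows; the function name is hypothetical, and MFTCoder's own merge scripts operate on real checkpoints rather than raw arrays.

```python
import numpy as np

# Hypothetical sketch of LoRA adapter merging (not MFTCoder's merge script):
# W' = W + (alpha / r) * B @ A, with A (r x d) the down-projection and
# B (d x r) the up-projection.
def merge_lora(W, A, B, alpha, r):
    return W + (alpha / r) * (B @ A)

rng = np.random.default_rng(0)
d, r = 8, 2
W = rng.normal(size=(d, d))
A = rng.normal(size=(r, d))   # down-projection
B = np.zeros((d, r))          # up-projection, zero-initialized as in LoRA

# A zero-initialized B means the adapter contributes nothing yet,
# so merging leaves the base weight unchanged.
assert np.allclose(merge_lora(W, A, B, alpha=16, r=r), W)
```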
Requirements
To begin, ensure that you have successfully installed CUDA (version >= 11.4, preferably 12.1) along with the necessary drivers. Additionally, make sure you have installed torch (version >= 2.1.0).
Next, we have provided an init_env.sh script to simplify the installation of required packages. Execute the following command to run the script:
```bash
sh init_env.sh
```
We highly recommend training with Flash Attention (version >= 2.3.0); please refer to the following link for installation instructions: https://github.com/Dao-AILab/flash-a
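As a minimal illustration of the version constraints above (CUDA >= 11.4, torch >= 2.1.0, flash attention >= 2.3.0), a semantic comparison like the following is what such checks amount to. The helper is hypothetical and stdlib-only, not part of init_env.sh:

```python
# Hypothetical version-constraint check, comparing dotted version strings
# numerically instead of lexicographically ("2.10" > "2.9").
def at_least(installed: str, required: str) -> bool:
    """True if the installed version satisfies the required minimum."""
    parse = lambda v: tuple(int(x) for x in v.split("+")[0].split(".")[:3])
    return parse(installed) >= parse(required)

assert at_least("2.1.0", "2.1.0")          # torch minimum
assert at_least("12.1", "11.4")            # preferred CUDA vs. minimum
assert not at_least("2.0.1+cu118", "2.1.0")  # local-build tags are ignored
```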
