EasyLLM
Built on Megatron-DeepSpeed and the HuggingFace Trainer, EasyLLM reorganizes the code logic with a focus on usability while maintaining training efficiency.
Install
- Install python requirements
  pip install -r requirements.txt
  Other dependency:
  - flash-attn (dropout_layer_norm) (you may need to compile it yourself)
- Pull DeepSpeed and add it to PYTHONPATH
  export PYTHONPATH=/path/to/DeepSpeed:$PYTHONPATH
- Install the package in development mode
  pip install -e . -v
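After the steps above, a quick check that the extra dependencies are visible to the interpreter can save a failed launch later. The following is a minimal sketch, not part of EasyLLM: the helper name `find_missing` is hypothetical, and the import names `deepspeed`, `flash_attn`, and `dropout_layer_norm` are assumed to match the packages listed above.

```python
import importlib.util

# Import names assumed for the dependencies listed in the install steps.
REQUIRED_MODULES = ("deepspeed", "flash_attn", "dropout_layer_norm")


def find_missing(modules=REQUIRED_MODULES):
    """Return the module names that cannot be located on the current path."""
    missing = []
    for name in modules:
        try:
            # find_spec returns None when a top-level module is not found.
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing
```

Calling `find_missing()` in the training environment returns an empty list when everything is importable. If `deepspeed` is reported missing, the PYTHONPATH export is the first thing to recheck.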
Train
Infer and Eval
Supported Models
- qwen14b
- internlm7-20b
- baichuan1/2 (7b-13b)
- llama1-2 (7b/13b/70b)
Data
3D Parallel config setting
Speed Benchmark
Dynamic Checkpoint
To balance training time against memory use, EasyLLM supports Dynamic Checkpoint: based on the input token size, it enables checkpointing for only some of the layers. The configuration file settings are as follows:
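The idea of choosing how many layers to checkpoint from the input token count can be illustrated with a small conceptual sketch. This is not EasyLLM's configuration format or implementation: the function names, the `SCHEDULE` thresholds, and the layer counts are all hypothetical, chosen only to show the mapping from input size to a per-layer on/off decision.

```python
from bisect import bisect_right

# Hypothetical schedule: (minimum token count, number of layers to checkpoint).
# These values are illustrative only, not EasyLLM defaults.
SCHEDULE = [(0, 0), (2048, 8), (4096, 16), (8192, 32)]


def layers_to_checkpoint(num_tokens, num_layers, schedule=SCHEDULE):
    """Return how many layers should be checkpointed for this input size."""
    thresholds = [t for t, _ in schedule]
    # Select the last schedule entry whose threshold is <= num_tokens.
    idx = max(bisect_right(thresholds, num_tokens) - 1, 0)
    # Never checkpoint more layers than the model has.
    return min(schedule[idx][1], num_layers)


def checkpoint_mask(num_tokens, num_layers, schedule=SCHEDULE):
    """Return one boolean per layer; True means the layer is checkpointed."""
    k = layers_to_checkpoint(num_tokens, num_layers, schedule)
    return [i < k for i in range(num_layers)]
```

In this sketch, short inputs skip checkpointing entirely to avoid recomputation cost, while longer inputs checkpoint progressively more layers to keep memory bounded. Checkpointing the first `k` layers is an arbitrary choice made for simplicity.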
License
This repository is released under the Apache-2.0 license.
Acknowledgement
We learned a lot from the following projects when developing EasyLLM.
