# 🎵 NotaGen: Advancing Musicality in Symbolic Music Generation with Large Language Model Training Paradigms
<p align="center"> <!-- ArXiv --> <a href="https://arxiv.org/abs/2502.18008"> <img src="https://img.shields.io/badge/NotaGen_Paper-ArXiv-%23B31B1B?logo=arxiv&logoColor=white" alt="Paper"> </a> <!-- HuggingFace --> <a href="https://huggingface.co/ElectricAlexis/NotaGen"> <img src="https://img.shields.io/badge/NotaGen_Weights-HuggingFace-%23FFD21F?logo=huggingface&logoColor=white" alt="Weights"> </a> <!-- HuggingFace Space --> <a href="https://huggingface.co/spaces/ElectricAlexis/NotaGen"> <img src="https://img.shields.io/badge/NotaGen_Space-Huggingface-✨️?logo=huggingface&logoColor=white" alt="Space"> </a> <!-- Web Demo --> <a href="https://electricalexis.github.io/notagen-demo/"> <img src="https://img.shields.io/badge/NotaGen_Demo-Web-%23007ACC?logo=google-chrome&logoColor=white" alt="Demo"> </a> </p>

<p align="center"> <img src="notagen.png" alt="NotaGen" width="50%"> </p>

## 📖 Overview
NotaGen is a symbolic music generation model that explores the potential of producing high-quality classical sheet music. Inspired by the success of Large Language Models (LLMs), NotaGen adopts a three-stage training paradigm:
- 🧠 Pre-training on 1.6M musical pieces
- 🎯 Fine-tuning on ~9K classical compositions with `period-composer-instrumentation` prompts
- 🚀 Reinforcement Learning using our novel CLaMP-DPO method (no human annotations or pre-defined rewards required)
Check our demo page and enjoy music composed by NotaGen!
## ⚙️ Environment Setup
```shell
conda create --name notagen python=3.10
conda activate notagen
conda install pytorch==2.3.0 pytorch-cuda=11.8 -c pytorch -c nvidia
pip install accelerate
pip install optimum
pip install -r requirements.txt
```
## 🏋️ NotaGen Model Weights

### Pre-training
We provide pre-trained weights of different scales:

| Models | Parameters | Patch-level Decoder Layers | Character-level Decoder Layers | Hidden Size | Patch Length (Context Length) |
| ---- | ---- | ---- | ---- | ---- | ---- |
| NotaGen-small | 110M | 12 | 3 | 768 | 2048 |
| NotaGen-medium | 244M | 16 | 3 | 1024 | 2048 |
| NotaGen-large | 516M | 20 | 6 | 1280 | 1024 |
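As a rough sanity check on the table, the parameter counts can be approximated from the layer counts and hidden sizes using the standard ~12·h² parameters per transformer block. This is a back-of-the-envelope estimate that ignores embedding tables, layer norms, and output heads, so the exact architecture may differ:

```python
def approx_params(patch_layers, char_layers, hidden_size):
    # ~12 * h^2 parameters per transformer decoder block
    # (attention + feed-forward), ignoring embeddings and layer norms
    per_block = 12 * hidden_size ** 2
    return (patch_layers + char_layers) * per_block

for name, (pl, cl, h) in {
    "NotaGen-small":  (12, 3, 768),
    "NotaGen-medium": (16, 3, 1024),
    "NotaGen-large":  (20, 6, 1280),
}.items():
    print(f"{name}: ~{approx_params(pl, cl, h) / 1e6:.0f}M")
```

The estimates land close to the reported 110M / 244M / 516M figures, so the table's numbers are internally consistent.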
Notice: The pre-trained weights cannot be used for conditional generation based on `period-composer-instrumentation`.
### Fine-tuning
We fine-tuned NotaGen-large on a corpus of approximately 9k classical pieces. You can download the weights here.
### Reinforcement Learning
After pre-training and fine-tuning, we optimized NotaGen-large with 3 iterations of CLaMP-DPO. You can download the weights here.
## 🌟 NotaGen-X
Inspired by DeepSeek-R1, we further optimized the training procedures of NotaGen and released an improved version, NotaGen-X. Compared to the version in the paper, NotaGen-X incorporates the following improvements:
- We introduced a post-training stage between pre-training and fine-tuning, refining the model with a classical-style subset of the pre-training dataset.
- We removed the key augmentation in the fine-tuning stage, making the instrument ranges of the generated compositions more reasonable.
- After RL, we utilized the resulting checkpoint to gather a new set of post-training data. Starting from the pre-trained checkpoint, we conducted another round of post-training, fine-tuning, and reinforcement learning.
If you want to add a new composer style to NotaGen-X, please refer to issue #18 for more instructions :D
## 🎹 Demo

### Online Gradio Demo
We developed an online Gradio demo on Hugging Face Space for NotaGen-X. You can input "Period-Composer-Instrumentation" as the prompt to have NotaGen generate music, preview the audio and PDF scores, and download them :D
<p align="center"> <img src="gradio/illustration_online.png" alt="NotaGen Gradio Demo"> </p>

### Local Gradio Demo
We developed a local Gradio demo for NotaGen-X. You can input "Period-Composer-Instrumentation" as the prompt to have NotaGen generate music!
<p align="center"> <img src="gradio/illustration.png" alt="NotaGen Gradio Demo"> </p>

Deploying NotaGen-X inference locally may require 8GB of GPU memory. For implementation details, please view gradio/README.md.
### Online Colab Notebook
Thanks to @deeplearn-art for contributing a Google Colab notebook for NotaGen! You can run it and open the Gradio public link to play with the demo. 🤩
### ComfyUI
Thanks to @billwuhao for contributing a ComfyUI node for NotaGen! It can automatically convert the generated .abc files to .xml, .mp3, and .png formats, so you can listen to the generated music and view the sheet music too! Please visit the repository page for more information. 🤩
<p align="center"> <img src="https://github.com/billwuhao/ComfyUI_NotaGen/blob/master/images/2025-03-10_06-24-03.png" alt="NotaGen ComfyUI"> </p>

## 🛠️ Data Pre-processing & Post-processing
For converting ABC notation files to and from MusicXML files, please view data/README.md for instructions.
To illustrate the specific data format, we provide a small dataset of Schubert's lieder compositions from the OpenScore Lieder, which includes:
- 🗂️ Interleaved ABC folders
- 🗂️ Augmented ABC folders
- 📄 Data index files for training and evaluation
You can download it here and put it under data/.
In the Fine-tuning and Reinforcement Learning instructions below, we use this dataset as a running example of our implementation. It does not include the "period-composer-instrumentation" conditioning; it simply shows how to adapt the pre-trained NotaGen to a specific musical style.
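The data index files use the JSONL format (one JSON record per line). A minimal sketch for sanity-checking an index before training; the assumption that each non-empty line is a standalone JSON object follows from the `.jsonl` extension:

```python
import json

def check_index(index_path):
    """Parse a JSONL data index and return the number of valid records."""
    count = 0
    with open(index_path) as f:
        for line in f:
            if line.strip():
                json.loads(line)  # raises ValueError if a record is malformed
                count += 1
    return count

# e.g. check_index("data/schubert_augmented_train.jsonl")
```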
## 🧠 Pre-train
If you want to pre-train a NotaGen model from scratch on your own data, please:
- Preprocess the data and generate the data index files following the instructions in data/README.md
- Modify the parameters in `pretrain/config.py`
Use this command for pre-training:
```shell
cd pretrain/
accelerate launch --multi_gpu --mixed_precision fp16 train-gen.py
```
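The pre-trained checkpoint filename (`...p_size_16_p_length_1024_p_layers_20_c_layers_6_h_size_1280_lr_0.0001_batch_4...`) suggests the main knobs exposed in `pretrain/config.py`. The fragment below is a hypothetical sketch decoded from that filename; the actual variable names may differ, so verify them against the file itself:

```python
# Hypothetical config fragment; verify names against pretrain/config.py.
# Values shown match the NotaGen-large checkpoint filename.
PATCH_SIZE = 16        # characters per patch (p_size)
PATCH_LENGTH = 1024    # patches per sequence, i.e. context length (p_length)
PATCH_NUM_LAYERS = 20  # patch-level decoder layers (p_layers)
CHAR_NUM_LAYERS = 6    # character-level decoder layers (c_layers)
HIDDEN_SIZE = 1280     # hidden size (h_size)
LEARNING_RATE = 1e-4   # lr
BATCH_SIZE = 4         # batch size per device (batch)
```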
## 🎯 Fine-tune
Here we give an example of fine-tuning NotaGen-large on the Schubert lieder data mentioned above.

Notice: NotaGen-large requires at least 24GB of GPU memory for training and inference. Alternatively, you may use NotaGen-small or NotaGen-medium and change the model configuration in `finetune/config.py`.
### Configuration

In `finetune/config.py`:

- Modify `DATA_TRAIN_INDEX_PATH` and `DATA_EVAL_INDEX_PATH`:
  ```python
  # Configuration for the data
  DATA_TRAIN_INDEX_PATH = "../data/schubert_augmented_train.jsonl"
  DATA_EVAL_INDEX_PATH = "../data/schubert_augmented_eval.jsonl"
  ```
- Download the pre-trained NotaGen weights, and modify `PRETRAINED_PATH`:
  ```python
  PRETRAINED_PATH = "../pretrain/weights_notagen_pretrain_p_size_16_p_length_1024_p_layers_20_c_layers_6_h_size_1280_lr_0.0001_batch_4.pth"  # Use NotaGen-large
  ```
- `EXP_TAG` is for differentiating the models; it will be integrated into the checkpoint's name. Here we set it to `schubert`.
- You can also modify other parameters like the learning rate.
### Execution
Use this command for fine-tuning:
```shell
cd finetune/
CUDA_VISIBLE_DEVICES=0 python train-gen.py
```
## 🚀 Reinforcement Learning (CLaMP-DPO)
Here we give an example of how to use CLaMP-DPO to enhance the model fine-tuned on the Schubert lieder data.
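CLaMP-DPO follows the standard Direct Preference Optimization objective; the twist is that the chosen/rejected pairs are ranked automatically by CLaMP 2 similarity scores rather than by human annotators or a learned reward model. A minimal sketch of the DPO loss itself (the standard formulation for a single pair, not code lifted from this repo):

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss over summed sequence log-probabilities for one pair.

    The 'chosen' piece is the one CLaMP 2 scores as closer to the target
    style; 'rejected' is the lower-scored one. The reference model is the
    frozen fine-tuned checkpoint.
    """
    logits = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(logits)): minimized by raising the chosen piece's
    # likelihood relative to the rejected one
    return -math.log(1.0 / (1.0 + math.exp(-logits)))
```

When the policy matches the reference, the loss is log 2; increasing the chosen piece's log-probability relative to the rejected one drives it toward zero.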
### ⚙️ CLaMP 2 Setup
Download the model weights and put them under the `clamp2/` folder:
### 🔍 Extract Ground Truth Features
Modify ```input_
