# 🎵 NotaGen: Advancing Musicality in Symbolic Music Generation with Large Language Model Training Paradigms
<p align="center"> <!-- ArXiv --> <a href="https://arxiv.org/abs/2502.18008"> <img src="https://img.shields.io/badge/NotaGen_Paper-ArXiv-%23B31B1B?logo=arxiv&logoColor=white" alt="Paper"> </a> <!-- HuggingFace --> <a href="https://huggingface.co/ElectricAlexis/NotaGen"> <img src="https://img.shields.io/badge/NotaGen_Weights-HuggingFace-%23FFD21F?logo=huggingface&logoColor=white" alt="Weights"> </a> <!-- HuggingFace Space --> <a href="https://huggingface.co/spaces/ElectricAlexis/NotaGen"> <img src="https://img.shields.io/badge/NotaGen_Space-Huggingface-✨️?logo=huggingface&logoColor=white" alt="Space"> </a> <!-- Web Demo --> <a href="https://electricalexis.github.io/notagen-demo/"> <img src="https://img.shields.io/badge/NotaGen_Demo-Web-%23007ACC?logo=google-chrome&logoColor=white" alt="Demo"> </a> </p>

<p align="center"> <img src="notagen.png" alt="NotaGen" width="50%"> </p>

## 📖 Overview
NotaGen is a symbolic music generation model that explores the potential of producing high-quality classical sheet music. Inspired by the success of Large Language Models (LLMs), NotaGen adopts a three-stage training paradigm:
- 🧠 Pre-training on 1.6M musical pieces
- 🎯 Fine-tuning on ~9K classical compositions with `period-composer-instrumentation` prompts
- 🚀 Reinforcement Learning using our novel CLaMP-DPO method (no human annotations or pre-defined rewards required)
Check our demo page and enjoy music composed by NotaGen!
## ⚙️ Environment Setup
```shell
conda create --name notagen python=3.10
conda activate notagen
conda install pytorch==2.3.0 pytorch-cuda=11.8 -c pytorch -c nvidia
pip install accelerate
pip install optimum
pip install -r requirements.txt
```
## 🏋️ NotaGen Model Weights

### Pre-training
We provide pre-trained weights of different scales:

| Models | Parameters | Patch-level Decoder Layers | Character-level Decoder Layers | Hidden Size | Patch Length (Context Length) |
| ---- | ---- | ---- | ---- | ---- | ---- |
| NotaGen-small | 110M | 12 | 3 | 768 | 2048 |
| NotaGen-medium | 244M | 16 | 3 | 1024 | 2048 |
| NotaGen-large | 516M | 20 | 6 | 1280 | 1024 |
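As a rough sanity check on the table, the parameter counts can be approximated from the layer counts and hidden sizes using the standard ~12·h² parameters per transformer block. This is a back-of-the-envelope estimate that ignores embedding tables, layer norms, and output heads, so the exact architecture may differ:

```python
def approx_params(patch_layers, char_layers, hidden_size):
    # ~12 * h^2 parameters per transformer decoder block
    # (attention + feed-forward), ignoring embeddings and layer norms
    per_block = 12 * hidden_size ** 2
    return (patch_layers + char_layers) * per_block

for name, (pl, cl, h) in {
    "NotaGen-small":  (12, 3, 768),
    "NotaGen-medium": (16, 3, 1024),
    "NotaGen-large":  (20, 6, 1280),
}.items():
    print(f"{name}: ~{approx_params(pl, cl, h) / 1e6:.0f}M")
```

The estimates land close to the reported 110M / 244M / 516M figures, so the table's numbers are internally consistent.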
Notice: The pre-trained weights cannot be used for conditional generation based on `period-composer-instrumentation`.
### Fine-tuning
We fine-tuned NotaGen-large on a corpus of approximately 9k classical pieces. You can download the weights here.
### Reinforcement Learning
After pre-training and fine-tuning, we optimized NotaGen-large with 3 iterations of CLaMP-DPO. You can download the weights here.
## 🌟 NotaGen-X
Inspired by DeepSeek-R1, we further optimized the training procedures of NotaGen and released an improved version, NotaGen-X. Compared to the version in the paper, NotaGen-X incorporates the following improvements:
- We introduced a post-training stage between pre-training and fine-tuning, refining the model with a classical-style subset of the pre-training dataset.
- We removed the key augmentation in the fine-tuning stage, making the instrument ranges of the generated compositions more reasonable.
- After RL, we utilized the resulting checkpoint to gather a new set of post-training data. Starting from the pre-trained checkpoint, we conducted another round of post-training, fine-tuning, and reinforcement learning.
If you want to add a new composer style to NotaGen-X, please refer to issue #18 for more instructions :D
## 🎹 Demo

### Online Gradio Demo
We developed an online Gradio demo on Hugging Face Space for NotaGen-X. You can input "Period-Composer-Instrumentation" as the prompt to have NotaGen generate music, preview the audio and PDF scores, and download them :D
<p align="center"> <img src="gradio/illustration_online.png" alt="NotaGen Gradio Demo"> </p>

### Local Gradio Demo
We developed a local Gradio demo for NotaGen-X. You can input "Period-Composer-Instrumentation" as the prompt to have NotaGen generate music!
<p align="center"> <img src="gradio/illustration.png" alt="NotaGen Gradio Demo"> </p>

Deploying NotaGen-X inference locally may require 8GB of GPU memory. For implementation details, please view gradio/README.md.
### Online Colab Notebook
Thanks to @deeplearn-art for contributing a Google Colab notebook for NotaGen! You can run it and open the Gradio public link to play with the demo. 🤩
### ComfyUI
Thanks to @billwuhao for contributing a ComfyUI node for NotaGen! It can automatically convert the generated .abc files to .xml, .mp3, and .png formats, so you can listen to the generated music and view the sheet music too! Please visit the repository page for more information. 🤩
<p align="center"> <img src="https://github.com/billwuhao/ComfyUI_NotaGen/blob/master/images/2025-03-10_06-24-03.png" alt="NotaGen ComfyUI"> </p>

## 🛠️ Data Pre-processing & Post-processing
For converting ABC notation files to and from MusicXML files, please view data/README.md for instructions.
To illustrate the specific data format, we provide a small dataset of Schubert's lieder compositions from the OpenScore Lieder, which includes:
- 🗂️ Interleaved ABC folders
- 🗂️ Augmented ABC folders
- 📄 Data index files for training and evaluation
You can download it here and put it under data/.
In the Fine-tuning and Reinforcement Learning instructions below, we use this dataset as a running example of our implementation. It does not include the "period-composer-instrumentation" conditioning; it simply shows how to adapt the pre-trained NotaGen to a specific musical style.
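The data index files use the JSONL format (one JSON record per line). A minimal sketch for sanity-checking an index before training; the assumption that each non-empty line is a standalone JSON object follows from the `.jsonl` extension:

```python
import json

def check_index(index_path):
    """Parse a JSONL data index and return the number of valid records."""
    count = 0
    with open(index_path) as f:
        for line in f:
            if line.strip():
                json.loads(line)  # raises ValueError if a record is malformed
                count += 1
    return count

# e.g. check_index("data/schubert_augmented_train.jsonl")
```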
## 🧠 Pre-train
If you want to pre-train a NotaGen model from scratch on your own data, please:
- Preprocess the data and generate the data index files following the instructions in data/README.md
- Modify the parameters in `pretrain/config.py`
Use this command for pre-training:
```shell
cd pretrain/
accelerate launch --multi_gpu --mixed_precision fp16 train-gen.py
```
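The pre-trained checkpoint filename (`...p_size_16_p_length_1024_p_layers_20_c_layers_6_h_size_1280_lr_0.0001_batch_4...`) suggests the main knobs exposed in `pretrain/config.py`. The fragment below is a hypothetical sketch decoded from that filename; the actual variable names may differ, so verify them against the file itself:

```python
# Hypothetical config fragment; verify names against pretrain/config.py.
# Values shown match the NotaGen-large checkpoint filename.
PATCH_SIZE = 16        # characters per patch (p_size)
PATCH_LENGTH = 1024    # patches per sequence, i.e. context length (p_length)
PATCH_NUM_LAYERS = 20  # patch-level decoder layers (p_layers)
CHAR_NUM_LAYERS = 6    # character-level decoder layers (c_layers)
HIDDEN_SIZE = 1280     # hidden size (h_size)
LEARNING_RATE = 1e-4   # lr
BATCH_SIZE = 4         # batch size per device (batch)
```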
## 🎯 Fine-tune
Here we give an example of fine-tuning NotaGen-large on the Schubert lieder data mentioned above.

Notice: NotaGen-large requires at least 24GB of GPU memory for training and inference. Alternatively, you may use NotaGen-small or NotaGen-medium and change the model configuration in `finetune/config.py`.
### Configuration

In `finetune/config.py`:

- Modify `DATA_TRAIN_INDEX_PATH` and `DATA_EVAL_INDEX_PATH`:
  ```python
  # Configuration for the data
  DATA_TRAIN_INDEX_PATH = "../data/schubert_augmented_train.jsonl"
  DATA_EVAL_INDEX_PATH = "../data/schubert_augmented_eval.jsonl"
  ```
- Download the pre-trained NotaGen weights, and modify `PRETRAINED_PATH`:
  ```python
  PRETRAINED_PATH = "../pretrain/weights_notagen_pretrain_p_size_16_p_length_1024_p_layers_20_c_layers_6_h_size_1280_lr_0.0001_batch_4.pth"  # Use NotaGen-large
  ```
- `EXP_TAG` is for differentiating the models; it will be integrated into the checkpoint's name. Here we set it to `schubert`.
- You can also modify other parameters like the learning rate.
### Execution
Use this command for fine-tuning:
```shell
cd finetune/
CUDA_VISIBLE_DEVICES=0 python train-gen.py
```
## 🚀 Reinforcement Learning (CLaMP-DPO)
Here we give an example of how to use CLaMP-DPO to enhance the model fine-tuned on the Schubert lieder data.
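CLaMP-DPO follows the standard Direct Preference Optimization objective; the twist is that the chosen/rejected pairs are ranked automatically by CLaMP 2 similarity scores rather than by human annotators or a learned reward model. A minimal sketch of the DPO loss itself (the standard formulation for a single pair, not code lifted from this repo):

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss over summed sequence log-probabilities for one pair.

    The 'chosen' piece is the one CLaMP 2 scores as closer to the target
    style; 'rejected' is the lower-scored one. The reference model is the
    frozen fine-tuned checkpoint.
    """
    logits = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(logits)): minimized by raising the chosen piece's
    # likelihood relative to the rejected one
    return -math.log(1.0 / (1.0 + math.exp(-logits)))
```

When the policy matches the reference, the loss is log 2; increasing the chosen piece's log-probability relative to the rejected one drives it toward zero.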
### ⚙️ CLaMP 2 Setup
Download the model weights and put them under the `clamp2/` folder:
### 🔍 Extract Ground Truth Features
Modify ```input_
