MuseDiffusion

YAI 11 x @POZAlabs : Music generation & modification from Unclear midi SEquence with Diffusion model


<!-- HEADER START --> <!-- src: https://github.com/kyechan99/capsule-render --> <p align="center"><a href="#"> <img width="100%" height="100%" src="https://capsule-render.vercel.app/api?type=waving&color=0:B993D6,100:8CA6DB&height=220&section=header&fontSize=40&fontColor=ffffff&animation=fadeIn&fontAlignY=40&text=%E2%97%A6%20%CB%9A%20%20%EF%BC%AD%20%EF%BD%95%EF%BD%93%EF%BD%85%20%EF%BC%A4%EF%BD%89%EF%BD%86%EF%BD%86%EF%BD%95%EF%BD%93%EF%BD%89%EF%BD%8F%EF%BD%8E%20%CB%9A%20%E2%97%A6" alt="header" /> </a></p> <h3 align="center">Music generation from Unclear midi SEquence with Diffusion model</h3> <p align="center"><a href="https://github.com/YAIxPOZAlabs"><img src="assets/figure00_logo.png" width=50% height=50% alt="logo"></a></p> <p align="center">This project was carried out by <b><a href="https://github.com/yonsei-YAI">YAI 11th</a></b>, in cooperation with <b><a href="https://github.com/POZAlabs">POZAlabs</a></b>.</p> <p align="center"> <br> <a href="mailto:dhakim@yonsei.ac.kr"> <img src="https://img.shields.io/badge/-Gmail-D14836?style=flat-square&logo=gmail&logoColor=white" alt="Gmail"/> </a> <a href="https://dhakim.notion.site/1e7dc19fd1064e698a389f75404883c7"> <img src="https://img.shields.io/badge/-Project%20Page-000000?style=flat-square&logo=notion&logoColor=white" alt="NOTION"/> </a> <a href="./README.pdf"> <img src="https://img.shields.io/badge/-Full%20Report-dddddd?style=flat-square&logo=latex&logoColor=black" alt="REPORT"/> </a> </p> <p align="center"> <br> <a href="./README.pdf"> 🔎 For the details, please refer to <b>Project Full Report</b>. 
</a> </p> <br> <hr> <!-- HEADER END --> <h3 align="center"><br>✨&nbsp; Contributors&nbsp; ✨<br><br></h3> <p align="center"> <b>🛠️ <a href="https://github.com/kdha0727">KIM DONGHA</a></b>&nbsp; :&nbsp; YAI 8th&nbsp; /&nbsp; AI Dev Lead &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<br> <b>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;🚀 <a href="https://github.com/ta3h30nk1m">KIM TAEHEON</a></b>&nbsp; :&nbsp; YAI 10th&nbsp; /&nbsp; AI Research & Dev <br> <b>👑 <a href="https://github.com/san9min">LEE SANGMIN</a></b>&nbsp; :&nbsp; YAI 9th&nbsp; /&nbsp; Team Leader&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <br> &nbsp;<b>🐋 <a href="https://github.com/Tim3s">LEE SEUNGJAE</a></b>&nbsp; :&nbsp; YAI 9th&nbsp; /&nbsp; AI Research Lead <br> <b>🌈 <a href="https://github.com/jeongwoo1213">CHOI JEONGWOO</a></b>&nbsp; :&nbsp; YAI 10th&nbsp; /&nbsp; AI Research & Dev <br> <b>🌟 <a href="https://github.com/starwh03">CHOI WOOHYEON</a></b>&nbsp; :&nbsp; YAI 10th&nbsp; /&nbsp; AI Research & Dev <br> <br><br> <hr> <h3 align="center"><br>🎼 Generated Samples 🎵<br><br></h3> <div align="center">

https://user-images.githubusercontent.com/61076953/223916377-c4c317b5-66dc-49a0-b42a-20132f638128.mp4

</div>

<br><hr>

<h2>How to run</h2> <h3>0. Clone the repository and cd into it</h3>
git clone https://github.com/YAIxPOZAlabs/MuseDiffusion.git
cd MuseDiffusion
<br> <h3>1. Prepare environment and data</h3> <h4>Set environment with python 3.8 and install pytorch</h4>
python3 -m pip install virtualenv && \
python3 -m virtualenv venv --python=python3.8 && \
source venv/bin/activate && \
pip3 install -r requirements.txt
<details> <summary>(Optional) If needed, install Python 3.8 for venv usage.</summary> &nbsp;
sudo apt update && \
sudo apt install -y software-properties-common && \
sudo add-apt-repository -y ppa:deadsnakes/ppa && \
sudo apt install -y python3.8 python3.8-distutils
</details> <details> <summary>(Optional) If Anaconda is available, you can set up the environment with Anaconda instead of the commands above.</summary> &nbsp;
conda create -n MuseDiffusion python=3.8 pip wheel
conda activate MuseDiffusion
pip3 install -r requirements.txt
</details> <details> <summary>(Recommended) <b>If Docker is available, use the Dockerfile instead</b>.</summary> &nbsp;
docker build -f Dockerfile -t musediffusion:v1 .
</details> <br> <h3>2. Download and Preprocess dataset</h3>
python3 -m MuseDiffusion dataprep
  • If you want to use a custom ComMU-like dataset, convert it into .npy files (refer to the corresponding issue in the repository) and preprocess it with the following command.
python3 -m MuseDiffusion dataprep --data_dir path/to/dataset
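As a rough illustration of "convert it into .npy files" — the authoritative layout is defined in the repository issue mentioned above, and the file name and token values below are purely hypothetical:

```python
import numpy as np

# Hypothetical sketch only: the actual ComMU-like .npy layout is specified
# in the repository issue, not here. This just shows how variable-length
# token sequences can be stored in a single object-dtype .npy file.
sequences = [[1, 5, 9, 2], [3, 3, 7]]  # made-up token ids

arr = np.empty(len(sequences), dtype=object)
for i, seq in enumerate(sequences):
    arr[i] = np.array(seq, dtype=np.int64)

np.save("custom_dataset.npy", arr)  # hypothetical file name

# Object arrays require allow_pickle=True when loading back
loaded = np.load("custom_dataset.npy", allow_pickle=True)
print(len(loaded), loaded[0].tolist())  # -> 2 [1, 5, 9, 2]
```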
<details> <summary>After this step, your directory structure would be like:</summary> &nbsp;
MuseDiffusion
├── MuseDiffusion
│   ├── __init__.py
│   ├── config
│   │   ├── __init__.py
│   │   ├── __main__.py
│   │   └── base.py
│   ├── data
│   │   ├── __init__.py
│   │   ├── __main__.py
│   │   ├── corruption.py
│   │   └── ...
│   ├── models
│   │   ├── __init__.py
│   │   ├── denoising_model.py
│   │   ├── gaussian_diffusion.py
│   │   ├── nn.py
│   │   └── ...
│   ├── run
│   │   ├── __init__.py
│   │   ├── sample_generation.py
│   │   ├── sample_seq2seq.py
│   │   └── train.py
│   └── utils
│       ├── __init__.py
│       ├── decode_util.py
│       ├── dist_util.py
│       ├── train_util.py
│       └── ...
├── assets
│   └── (files for readme...)
├── commu
│   └── (same code as https://github.com/POZAlabs/ComMU-code/blob/master/commu/)
├── datasets
│   └── ComMU-processed
│       └── (preprocessed commu dataset files...)
├── scripts
│   ├── run_train.sh
│   ├── sample_seq2seq.sh
│   └── sample_generation.sh
├── README.md
└── requirements.txt
</details> <br> <h3>3. Prepare model weights and configuration</h3> <h4>By downloading pretrained weights</h4>
mkdir diffusion_models
mkdir diffusion_models/pretrained_weights
cd diffusion_models/pretrained_weights
wget https://github.com/YAIxPOZAlabs/MuseDiffusion/releases/download/1.0.0/pretrained_weights.zip
unzip pretrained_weights.zip && rm pretrained_weights.zip
cd ../..
<h4>By training manually</h4>
python3 -m MuseDiffusion train --distributed
<details> <summary>How to customize arguments</summary> <h5>&nbsp; Method 1: Using JSON Config File</h5>
  • With --config_json train_cfg.json, the required arguments above will be loaded automatically.
# Dump the default config to the root directory
python3 -c "from MuseDiffusion.config import TrainSettings as T; print(T().json(indent=2))" \
> train_cfg.json

# Customize config on your own
vi train_cfg.json

# Run training script
python3 -m MuseDiffusion train --distributed --config_json train_cfg.json
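The JSON workflow can also be scripted instead of edited in vi. A minimal sketch, assuming field names such as lr and batch_size match those printed by TrainSettings (only a hypothetical subset of fields is shown):

```python
import json

# Hypothetical subset of TrainSettings fields; the real defaults come from
# python3 -c "from MuseDiffusion.config import TrainSettings as T; print(T().json(indent=2))"
cfg = {"lr": 0.0001, "batch_size": 2048, "diffusion_steps": 2000}

cfg["lr"] = 5e-05  # customize a field programmatically

# Write the file consumed by --config_json train_cfg.json
with open("train_cfg.json", "w") as f:
    json.dump(cfg, f, indent=2)

with open("train_cfg.json") as f:
    print(json.load(f)["lr"])  # -> 5e-05
```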
<h5>&nbsp; Method 2: Using Arguments</h5>
  • Add your own arguments; refer to python3 -m MuseDiffusion train --help.

Refer to example below:

python3 -m MuseDiffusion train --distributed \
--lr 0.0001 \
--batch_size 2048 \
--microbatch 64 \
--learning_steps 320000 \
--log_interval 20 \
--save_interval 1000 \
--eval_interval 500 \
--ema_rate 0.5,0.9,0.99 \
--seed 102 \
--diffusion_steps 2000 \
--schedule_sampler lossaware \
--noise_schedule sqrt \
--seq_len 2096 \
--pretrained_denoiser diffuseq.pt \
--pretrained_embedding pozalabs_embedding.pt \
--freeze_embedding false \
--use_bucketing true \
--dataset ComMU \
--data_dir datasets/ComMU-processed \
--data_loader_workers 4 \
--use_corruption true \
--corr_available mt,mn,rn,rr \
--corr_max 4 \
--corr_p 0.5 \
--corr_kwargs "{'p':0.4}" \
--hidden_t_dim 128 \
--hidden_dim 128 \
--dropout 0.4 \
--weight_decay 0.1 \
--gradient_clipping -1.0
</details> <details> <summary>With regard to <b><u>--distributed</u></b> argument (torch.distributed runner)</summary> <h5>&nbsp; Arguments related with torch.distributed:</h5>
  • The --distributed argument runs python3 -m MuseDiffusion train with the torch.distributed runner.
    • You can customize its options or environment variables.
  • Command-line option --nproc_per_node: number of training nodes (GPUs) to use.
    • Default: the number of GPUs listed in the CUDA_VISIBLE_DEVICES environment variable.
  • Command-line option --master_port: master port for distributed learning.
    • Default: found automatically if available, otherwise 12233.
  • Environment variable CUDA_VISIBLE_DEVICES: specific GPU indices, e.g. CUDA_VISIBLE_DEVICES=4,5,6,7.
    • Default: not set; in this case the trainer uses all available GPUs.
  • Environment variable OMP_NUM_THREADS: number of threads for each node.
    • Default: automatically set to $CPU_CORE / $TOTAL_GPU.
  • On Windows, torch.distributed is disabled by default. To enable it, edit the USE_DIST_IN_WINDOWS flag in MuseDiffusion/utils/dist_util.py.

Refer to example below:

CUDA_VISIBLE_DEVICES=4,5,6,7 python3 -m MuseDiffusion train --distributed --master_port 12233
</details>
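The per-node thread default described above (CPU cores divided by the number of training nodes) can be sketched as follows; the helper name is ours, not the repository's:

```python
import os

def threads_per_node(total_gpus: int) -> int:
    # Hypothetical helper mirroring the default described above:
    # CPU cores divided by the number of training nodes, floored to at least 1.
    cpu_cores = os.cpu_count() or 1
    return max(1, cpu_cores // max(1, total_gpus))

# e.g. CUDA_VISIBLE_DEVICES=4,5,6,7 -> 4 training nodes
print(threads_per_node(len("4,5,6,7".split(","))))
```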

After training, weights and configs will be saved into ./diffusion_models/{name-of-model-folder}/.

<br> <h3>4. Sample with model - Modify or Generate Midi!</h3> <h4>From corrupted samples</h4>
python3 -m MuseDiffusion modification --distributed \
--use_corruption True \
--corr_available rn,rr \
--corr_max 2 \
--corr_p 0.5 \
--step 500 \
--strength 0.75 \
--model_path ./diffusion_models/{name-of-model-folder}/{weight-file}
  • You can use the same torch.distributed arguments as in the training script.
  • Type python3 -m MuseDiffusion modification --help for detailed usage.
  • You can omit the --model_path argument if you want to use the pretrained weights.
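The corruption flags interact roughly as follows. This is an illustrative sketch of their meaning only, not the repository's implementation; the function name and sampling details are ours:

```python
import random

def pick_corruptions(corr_available, corr_max, corr_p, rng):
    # Illustrative only: a sample is corrupted with probability corr_p,
    # using between 1 and corr_max kinds drawn from corr_available.
    if rng.random() >= corr_p:
        return []  # sample left uncorrupted
    k = rng.randint(1, corr_max)
    return rng.sample(corr_available, min(k, len(corr_available)))

chosen = pick_corruptions(["rn", "rr"], corr_max=2, corr_p=1.0, rng=random.Random(0))
print(chosen)
```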
<h4>From metadata</h4>
python3 -m MuseDiffusion generation --distributed \
--bpm {BPM} \
--audio_key {AUDIO_KEY} \
--time_signature {TIME_SIGNATURE} \
--pitch_range {PITCH_RANGE} \
--num_measures {NUM_MEASURES} \
--inst {INST} \
--genre {GENRE} \
--min_velocity {MIN_VELOCITY} \
--max_velocity {MAX_VELOCITY} \
--track_role {TRACK_ROLE} \
--rhythm {RHYTHM} \
--chord_progression {CHORD_PROGRESSION} \
--num_samples 1000 \
--step 500 \
--model_path diffusion_models/{name-of-model-folder}/{weight-file}
  • In generation, the MidiMeta arguments (bpm, audio_key, ..., chord_progression) are required.
  • You can use the same torch.distributed arguments as in the training script.
  • Type python3 -m MuseDiffusion generation --help for detailed usage.
  • You can omit the --model_path argument if you want to use the pretrained weights.
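The metadata flags above can also be assembled programmatically. A sketch in which every metadata value is a placeholder example (the valid value sets are defined by the ComMU dataset, not here):

```python
# Sketch only: field names mirror the flags above; values are placeholders.
meta = {
    "bpm": 120,
    "audio_key": "cmajor",
    "time_signature": "4/4",
    "pitch_range": "mid",
    "num_measures": 8,
    "inst": "acoustic_piano",
    "genre": "cinematic",
    "min_velocity": 40,
    "max_velocity": 80,
    "track_role": "main_melody",
    "rhythm": "standard",
    "chord_progression": "C-Am-F-G",
}

# Build the command line for python3 -m MuseDiffusion generation
cmd = ["python3", "-m", "MuseDiffusion", "generation", "--distributed"]
for key, value in meta.items():
    cmd += [f"--{key}", str(value)]

print(" ".join(cmd))
```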
