MuseDiffusion
YAI 11 x @POZAlabs : Music generation & modification from Unclear midi SEquence with Diffusion model
https://user-images.githubusercontent.com/61076953/223916377-c4c317b5-66dc-49a0-b42a-20132f638128.mp4
<br><hr>
<h2>How to run</h2>

<h3>0. Clone repository and cd</h3>

```sh
git clone https://github.com/YAIxPOZAlabs/MuseDiffusion.git
cd MuseDiffusion
```
<br>
<h3>1. Prepare environment and data</h3>
<h4>Set environment with python 3.8 and install pytorch</h4>
```sh
python3 -m pip install virtualenv && \
python3 -m virtualenv venv --python=python3.8 && \
source venv/bin/activate && \
pip3 install -r requirements.txt
```
<details>
<summary>(Optional) If required, install python 3.8 for venv usage.</summary>
```sh
sudo apt update && \
sudo apt install -y software-properties-common && \
sudo add-apt-repository -y ppa:deadsnakes/ppa && \
sudo apt install -y python3.8 python3.8-distutils
```
</details>
<details>
<summary>(Optional) If anaconda is available, you can set up the environment with anaconda instead of the code above.</summary>
```sh
conda create -n MuseDiffusion python=3.8 pip wheel
conda activate MuseDiffusion
pip3 install -r requirements.txt
```
</details>
<details>
<summary>(Recommended) <b>If docker is available, use Dockerfile instead</b>.</summary>
```sh
docker build -f Dockerfile -t musediffusion:v1 .
```
</details>
<br>
<h3>2. Download and Preprocess dataset</h3>
```sh
python3 -m MuseDiffusion dataprep
```
- If you want to use a custom ComMU-like dataset, convert it to npy files (refer to this issue) and preprocess it with this command:

```sh
python3 -m MuseDiffusion dataprep --data_dir path/to/dataset
```
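The preprocessor expects ComMU-style npy token files. Below is a minimal sketch of packing variable-length token sequences into a single `.npy` file; the filename `input_train.npy` and the token values are assumptions, so check the linked issue for the actual schema the preprocessor expects.

```python
# Hypothetical sketch: packing a ComMU-like token dataset into a .npy file.
# Field layout and filename are assumptions - see the linked issue for the
# real schema expected by `python3 -m MuseDiffusion dataprep`.
import numpy as np

sequences = [
    [422, 2, 70, 130, 300],        # token ids of one MIDI sequence (example values)
    [422, 5, 64, 131, 305, 310],   # sequences may have different lengths
]

# An object-dtype array holds variable-length sequences in one .npy file.
arr = np.empty(len(sequences), dtype=object)
for i, seq in enumerate(sequences):
    arr[i] = np.asarray(seq, dtype=np.int64)

np.save("input_train.npy", arr, allow_pickle=True)

# Round-trip check: loading requires allow_pickle for object arrays.
loaded = np.load("input_train.npy", allow_pickle=True)
```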
<details>
<summary>After this step, your directory structure would be like:</summary>
```
MuseDiffusion
├── MuseDiffusion
│   ├── __init__.py
│   ├── config
│   │   ├── __init__.py
│   │   ├── __main__.py
│   │   └── base.py
│   ├── data
│   │   ├── __init__.py
│   │   ├── __main__.py
│   │   ├── corruption.py
│   │   └── ...
│   ├── models
│   │   ├── __init__.py
│   │   ├── denoising_model.py
│   │   ├── gaussian_diffusion.py
│   │   ├── nn.py
│   │   └── ...
│   ├── run
│   │   ├── __init__.py
│   │   ├── sample_generation.py
│   │   ├── sample_seq2seq.py
│   │   └── train.py
│   └── utils
│       ├── __init__.py
│       ├── decode_util.py
│       ├── dist_util.py
│       ├── train_util.py
│       └── ...
├── assets
│   └── (files for readme...)
├── commu
│   └── (same code as https://github.com/POZAlabs/ComMU-code/blob/master/commu/)
├── datasets
│   └── ComMU-processed
│       └── (preprocessed commu dataset files...)
├── scripts
│   ├── run_train.sh
│   ├── sample_seq2seq.sh
│   └── sample_generation.sh
├── README.md
└── requirements.txt
```
</details>
<br>
<h3>3. Prepare model weight and configuration</h3>
<h4>With downloading pretrained one</h4>
```sh
mkdir diffusion_models
mkdir diffusion_models/pretrained_weights
cd diffusion_models/pretrained_weights
wget https://github.com/YAIxPOZAlabs/MuseDiffusion/releases/download/1.0.0/pretrained_weights.zip
unzip pretrained_weights.zip && rm pretrained_weights.zip
cd ../..
```
<h4>With Manual Training</h4>
```sh
python3 -m MuseDiffusion train --distributed
```
<details>
<summary>How to customize arguments</summary>
<h5>Method 1: Using JSON Config File</h5>

- With `--config_json train_cfg.json`, the required arguments above will be loaded automatically.

```sh
# Copy config file to root directory
python3 -c "from MuseDiffusion.config import TrainSettings as T; print(T().json(indent=2))" \
  >> train_cfg.json
# Customize config on your own
vi train_cfg.json
# Run training script
python3 -m MuseDiffusion train --distributed --config_json train_cfg.json
```
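The command above dumps the settings' defaults as JSON, lets you edit them, and feeds the file back to the trainer. A simplified sketch of that round trip using a plain dataclass (the field names mirror a few flags from Method 2 below; this is not the real `TrainSettings` class, which lives in `MuseDiffusion.config`):

```python
# Simplified sketch of the JSON-config round trip. TrainCfg is a stand-in
# for MuseDiffusion.config.TrainSettings; fields here are examples only.
import json
from dataclasses import dataclass, asdict

@dataclass
class TrainCfg:
    lr: float = 0.0001
    batch_size: int = 2048
    diffusion_steps: int = 2000

# 1) Dump defaults to a file (analogous to the `python3 -c ...` step above).
with open("train_cfg.json", "w") as f:
    json.dump(asdict(TrainCfg()), f, indent=2)

# 2) Edit train_cfg.json by hand, then load it back for training.
with open("train_cfg.json") as f:
    cfg = TrainCfg(**json.load(f))
```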
<h5>Method 2: Using Arguments</h5>

- Add your own arguments; refer to `python3 -m MuseDiffusion train --help`.

Refer to the example below:

```sh
python3 -m MuseDiffusion train --distributed \
  --lr 0.0001 \
  --batch_size 2048 \
  --microbatch 64 \
  --learning_steps 320000 \
  --log_interval 20 \
  --save_interval 1000 \
  --eval_interval 500 \
  --ema_rate 0.5,0.9,0.99 \
  --seed 102 \
  --diffusion_steps 2000 \
  --schedule_sampler lossaware \
  --noise_schedule sqrt \
  --seq_len 2096 \
  --pretrained_denoiser diffuseq.pt \
  --pretrained_embedding pozalabs_embedding.pt \
  --freeze_embedding false \
  --use_bucketing true \
  --dataset ComMU \
  --data_dir datasets/ComMU-processed \
  --data_loader_workers 4 \
  --use_corruption true \
  --corr_available mt,mn,rn,rr \
  --corr_max 4 \
  --corr_p 0.5 \
  --corr_kwargs "{'p':0.4}" \
  --hidden_t_dim 128 \
  --hidden_dim 128 \
  --dropout 0.4 \
  --weight_decay 0.1 \
  --gradient_clipping -1.0
```
</details>
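The corruption flags above can be read as: `--corr_available` lists the corruptions that may be used, `--corr_max` caps how many are applied per sample, and `--corr_p` is the chance each one fires. An illustrative sketch assuming that interpretation (the real logic lives in `MuseDiffusion/data/corruption.py`; the function name and mechanics here are hypothetical):

```python
# Hypothetical sketch of how the corruption flags might combine.
# The real implementation is in MuseDiffusion/data/corruption.py.
import random

def corrupt(tokens, corr_available=("mt", "mn", "rn", "rr"),
            corr_max=4, corr_p=0.5, seed=102):
    rng = random.Random(seed)
    applied = []
    # Consider up to corr_max corruptions from the available pool...
    for name in rng.sample(list(corr_available), k=min(corr_max, len(corr_available))):
        # ...and apply each one independently with probability corr_p.
        if rng.random() < corr_p:
            applied.append(name)  # a real implementation would mutate `tokens` here
    return tokens, applied

seq, used = corrupt(list(range(16)))
```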
<details>
<summary>With regard to <b><u>--distributed</u></b> argument (torch.distributed runner)</summary>
<h5>Arguments related to torch.distributed:</h5>

- Argument `--distributed` will run `python -m MuseDiffusion train` with the torch.distributed runner - you can customize options or environs.
- commandline option `--nproc_per_node` - number of training nodes (GPUs) to use.
  - default: number of GPUs in the `CUDA_VISIBLE_DEVICES` environ.
- commandline option `--master_port` - master port for distributed learning.
  - default: will automatically be found if available, otherwise `12233`.
- environ `CUDA_VISIBLE_DEVICES` - specific GPU indices, e.g. `CUDA_VISIBLE_DEVICES=4,5,6,7`.
  - default: not set - in this case, the trainer will use all available GPUs.
- environ `OPT_NUM_THREADS` - number of threads for each node.
  - default: will automatically be set to `$CPU_CORE` / `$TOTAL_GPU`.
- On Windows, torch.distributed is disabled by default. To enable it, edit the `USE_DIST_IN_WINDOWS` flag in `MuseDiffusion/utils/dist_util.py`.

Refer to the example below:

```sh
CUDA_VISIBLE_DEVICES=4,5,6,7 python3 -m MuseDiffusion train --distributed --master_port 12233
```
</details>
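The per-node thread default described above can be sketched as follows (assumption: the runner divides available CPU cores evenly across the GPUs in use, with a floor of one thread):

```python
# Sketch of the OPT_NUM_THREADS default: CPU cores split evenly per GPU.
# Assumption based on the "$CPU_CORE / $TOTAL_GPU" description above.
import os

def default_num_threads(total_gpu: int) -> int:
    cpu_cores = os.cpu_count() or 1
    return max(1, cpu_cores // max(1, total_gpu))
```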
After training, weights and configs will be saved into `./diffusion_models/{name-of-model-folder}/`.
<h3>4. Sampling - Modification</h3>

```sh
python3 -m MuseDiffusion modification --distributed \
  --use_corruption True \
  --corr_available rn,rr \
  --corr_max 2 \
  --corr_p 0.5 \
  --step 500 \
  --strength 0.75 \
  --model_path ./diffusion_models/{name-of-model-folder}/{weight-file}
```
- You can use arguments for `torch.distributed`, which are the same as in the training script.
- Type `python3 -m MuseDiffusion modification --help` for detailed usage.
- You can omit the `--model_path` argument if you want to use pretrained weights.
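A common SDEdit-style reading of `--strength` in diffusion-based editing is that it selects how far into the noise schedule denoising starts (1.0 resamples from pure noise, 0.0 leaves the input unchanged). Assuming MuseDiffusion follows that convention (not confirmed by this README), the relation to `--step` can be sketched as:

```python
# Sketch of a strength-to-timestep mapping, assuming an SDEdit-style scheme.
# This is an illustration, not MuseDiffusion's actual implementation.
def start_timestep(step: int, strength: float) -> int:
    # strength 1.0 -> denoise from the last timestep (pure noise);
    # strength 0.0 -> skip denoising entirely (input kept as-is).
    return min(step, int(step * strength))
```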
<h3>5. Sampling - Generation</h3>

```sh
python3 -m MuseDiffusion generation --distributed \
  --bpm {BPM} \
  --audio_key {AUDIO_KEY} \
  --time_signature {TIME_SIGNATURE} \
  --pitch_range {PITCH_RANGE} \
  --num_measures {NUM_MEASURES} \
  --inst {INST} \
  --genre {GENRE} \
  --min_velocity {MIN_VELOCITY} \
  --max_velocity {MAX_VELOCITY} \
  --track_role {TRACK_ROLE} \
  --rhythm {RHYTHM} \
  --chord_progression {CHORD_PROGRESSION} \
  --num_samples 1000 \
  --step 500 \
  --model_path diffusion_models/{name-of-model-folder}/{weight-file}
```
- In generation, MidiMeta arguments (bpm, audio_key, ..., chord_progression) are essential.
- You can use arguments for `torch.distributed`, which are the same as in the training script.
- Type `python3 -m MuseDiffusion generation --help` for detailed usage.
- **You can omit the `--model_path` argument if you want to use pretrained weights.**
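A hypothetical example of filling the MidiMeta placeholders with ComMU-style values (each field's vocabulary comes from the ComMU dataset; the values below are illustrative, not exhaustive, so consult the dataset documentation for the full set):

```python
# Hypothetical MidiMeta values in the ComMU style; values are examples only.
meta = {
    "bpm": 120,
    "audio_key": "aminor",
    "time_signature": "4/4",
    "pitch_range": "mid",
    "num_measures": 8,
    "inst": "acoustic_piano",
    "genre": "newage",
    "min_velocity": 49,
    "max_velocity": 79,
    "track_role": "main_melody",
    "rhythm": "standard",
}

# Assemble the generation command line from the metadata dict.
cmd = " \\\n  ".join(
    ["python3 -m MuseDiffusion generation --distributed"]
    + [f"--{k} {v}" for k, v in meta.items()]
)
print(cmd)
```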