# ai_midi

<img width="1536" height="1024" alt="ae0905c4-a4c6-4f1a-82be-8cde466723c5" src="https://github.com/user-attachments/assets/cfa74dae-584c-44d4-9699-4ca44835290b" />

ai_midi is a Python toolkit for training and generating MIDI with REMI-like tokenization, a GPT-style Transformer powered by PyTorch Lightning, and Hydra-based configuration. It ships with CLI tools to preprocess datasets, train models, and sample new music, plus an optional FastAPI server.
## Table of Contents
- Features
- Installation
- Quickstart
- Configuration
- Dataset Preparation Tips
- Troubleshooting & Tuning
- Testing
- License
## Features

- REMI-inspired tokenization with combined `Note(pitch,duration)` tokens, velocity bins, program changes, time shifts, and special tokens.
- Lightning-powered Transformer training (AMP, checkpointing, cosine/linear LR schedules, gradient clipping).
- Sliding-window dataset with on-disk caching.
- Flexible sampling: top-k, top-p, temperature, greedy, and EOS stopping.
- Command-line tools and optional FastAPI server.
- Hydra configs, structured logging, and reproducible seeding.
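To picture what a combined `Note(pitch,duration)` token stream looks like, here is a minimal REMI-style sketch. The token names, bin count, and formatting below are illustrative assumptions, not ai_midi's actual vocabulary:

```python
# Illustrative REMI-style token group for a single note event.
# Token names and the 8-bin velocity scheme are assumptions, not
# ai_midi's real tokenizer.

def tokenize_note(pitch: int, duration_steps: int, velocity: int,
                  time_shift_steps: int, n_vel_bins: int = 8) -> list[str]:
    """Emit time-shift, velocity-bin, and combined pitch+duration tokens."""
    tokens = []
    if time_shift_steps > 0:
        tokens.append(f"TimeShift_{time_shift_steps}")
    # Bucket MIDI velocity (0-127) into n_vel_bins coarse bins.
    vel_bin = min(velocity * n_vel_bins // 128, n_vel_bins - 1)
    tokens.append(f"Velocity_{vel_bin}")
    # Pitch and duration are fused into one token, as in the README.
    tokens.append(f"Note_{pitch}_{duration_steps}")
    return tokens

print(tokenize_note(pitch=60, duration_steps=4, velocity=100, time_shift_steps=2))
# e.g. ['TimeShift_2', 'Velocity_6', 'Note_60_4']
```

Fusing pitch and duration into one token shortens sequences at the cost of a larger vocabulary, which is the usual REMI-style trade-off.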
## Installation

```bash
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install -e ".[dev]"
pre-commit install
```
See PyTorch's installation guide for platform-specific torch wheels.
## Quickstart

- Place MIDI files in a folder, e.g. `data/midi/`.
- (Optional) Preprocess and cache sequences:

  ```bash
  ai_midi preprocess --midi-dir data/midi --out-cachedir data/cache
  ```

- Train a model:

  ```bash
  ai_midi train
  ```

  Override any config via Hydra:

  ```bash
  ai_midi train train.max_epochs=2 model.d_model=256 data.seq_len=256
  ```

- Generate new music from a prompt:

  ```bash
  ai_midi generate --prompt examples/sample_prompt.mid --out out.mid --max-tokens 256 --temp 1.0 --top_k 50 --top_p 0.95
  ```
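The cached sequences feed the sliding-window dataset mentioned above. How the windowing works can be sketched in a few lines; the stride handling here is an assumption about how ai_midi cuts token sequences into training examples:

```python
# Sliding-window slicing sketch. The exact window/stride policy is an
# assumption; ai_midi's dataset code may differ.

def sliding_windows(tokens: list[int], seq_len: int, stride: int) -> list[list[int]]:
    """Cut one token sequence into fixed-length, possibly overlapping windows."""
    windows = []
    for start in range(0, len(tokens) - seq_len + 1, stride):
        windows.append(tokens[start:start + seq_len])
    return windows

seq = list(range(10))
print(sliding_windows(seq, seq_len=4, stride=2))
# [[0, 1, 2, 3], [2, 3, 4, 5], [4, 5, 6, 7], [6, 7, 8, 9]]
```

A smaller stride yields more (overlapping) training examples from the same MIDI, at the cost of more redundancy per epoch.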
## Configuration

Configurations live in `src/midigegen/config/` and are composed with Hydra. The base `config.yaml` includes `model.yaml`, `data.yaml`, and `train.yaml`. Override any key from the CLI.
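As a concrete picture of that composition, a Hydra base config in this style might look like the fragment below. Only the keys that appear elsewhere in this README (`model.d_model`, `data.seq_len`, `train.max_epochs`, seeding) are taken from the source; the layout is an illustrative assumption, and the shipped YAML files are authoritative:

```yaml
# Illustrative config.yaml sketch (not the shipped file).
defaults:
  - model    # model.yaml: e.g. d_model, n_layers
  - data     # data.yaml:  e.g. seq_len, batch-related keys
  - train    # train.yaml: e.g. max_epochs, warmup_steps
  - _self_

seed: 42     # reproducible seeding, per the Features list
```

Any of these keys can then be overridden at the command line, e.g. `ai_midi train train.max_epochs=2`.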
## Dataset Preparation Tips
- Use well-quantized MIDI with consistent tempos.
- Remove unneeded metadata tracks; focus on relevant melodic or drum tracks.
- Match the tokenizer's step size to your musical grid (e.g., 1/32 or 1/64).
- Normalize tempos when training.
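Snapping note times to the tokenizer's grid, as the tips above suggest, is simple arithmetic. This sketch assumes 480 ticks per beat and an eighth-of-a-beat step (1/32 notes); both numbers are illustrative, not ai_midi defaults:

```python
# Grid-quantization sketch for "match the tokenizer's step size".
# 480 ticks/beat and 8 steps/beat (1/32-note grid) are assumed values.

def quantize_ticks(ticks: int, ticks_per_beat: int = 480, steps_per_beat: int = 8) -> int:
    """Snap a MIDI tick value to the nearest grid step."""
    step = ticks_per_beat / steps_per_beat  # 60 ticks per 1/32 note here
    return round(ticks / step) * int(step)

print(quantize_ticks(473))  # 473 ticks is closest to grid step 8 -> 480
```

For a 1/64-note grid, the same function would use `steps_per_beat=16`.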
## Troubleshooting & Tuning

- Reduce `model.d_model`, `model.n_layers`, `data.seq_len`, or `train.batch_size` if memory is tight.
- Start with greedy or top-k sampling, then adjust `temperature` and `top_p` for diversity.
- `train.warmup_steps` can stabilize early training.
- Augment or transpose MIDI for more variety.
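To make the top-k/top-p tuning advice concrete, here is a generic sketch of how those two filters interact before sampling. This is the standard nucleus-sampling recipe, not necessarily ai_midi's exact implementation (temperature, not shown, would divide the logits before this step):

```python
import math

# Generic top-k then top-p (nucleus) filtering sketch; ai_midi's
# sampler may differ in details.

def filter_logits(logits: list[float], top_k: int = 0, top_p: float = 1.0) -> list[float]:
    """Return a full-size probability vector after top-k and top-p truncation."""
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    if top_k > 0:                      # keep only the k highest logits
        order = order[:top_k]
    exps = [math.exp(logits[i]) for i in order]
    total = sum(exps)
    probs = [e / total for e in exps]
    if top_p < 1.0:                    # keep smallest prefix reaching mass top_p
        cum, keep = 0.0, 0
        for p in probs:
            cum += p
            keep += 1
            if cum >= top_p:
                break
        order, probs = order[:keep], probs[:keep]
        total = sum(probs)
        probs = [p / total for p in probs]  # renormalize survivors
    out = [0.0] * len(logits)
    for i, p in zip(order, probs):
        out[i] = p
    return out

print(filter_logits([2.0, 1.0, 0.1, -1.0], top_k=3, top_p=0.9))
```

Lower `top_p` or `top_k` concentrates mass on the likeliest tokens (safer, more repetitive output); raising them admits more of the tail (more diverse, riskier output).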
## Testing

Run the tests:

```bash
pytest -q
```
## License
Released under the MIT license. See LICENSE for details.
