VocalSeparationAI
This repo is a self-learning project in which music and its vocals are converted to their respective spectrograms, and a vocal separation AI is then trained on these spectrograms.
Some Test Results
| Music | Vocal | AI Output |
| :------------ |:---------------:| -----:|
| Music Listen | Vocal Listen | AI Output Listen |
| Music Listen | Vocal Listen | AI Output Listen |
The dataset
- I used the DSD100 dataset to get the music and vocal tracks.
- The music and vocal tracks are split into 5-second parts and converted to spectrograms. These spectrograms make up the dataset I created.
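The 5-second split can be sketched as follows. This is a minimal illustration, not the code used in this repo: the function name `split_into_chunks` is hypothetical, and dropping the trailing remainder shorter than 5 seconds is an assumption, since the handling of the tail is not documented here.

```python
def split_into_chunks(samples, sample_rate, chunk_seconds=5):
    """Split a sequence of audio samples into fixed-length chunks.

    Assumption: a trailing remainder shorter than one chunk is dropped.
    """
    chunk_len = sample_rate * chunk_seconds
    n_full = len(samples) // chunk_len
    return [samples[i * chunk_len:(i + 1) * chunk_len] for i in range(n_full)]


# Toy example: 12 seconds of "audio" at 4 samples per second.
toy = list(range(48))
chunks = split_into_chunks(toy, sample_rate=4)
print(len(chunks), len(chunks[0]))
```

Applying the same function with the same parameters to a music track and its vocal track keeps the chunk boundaries aligned, so chunk `i` of the music corresponds to chunk `i` of the vocal.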
Tools
- I used the provided repo to convert music to spectrograms and spectrograms back to music.
- I used the pix2pix-tensorflow implementation to train the model.
Preprocessing Details
1 - The music and vocal tracks are split into 5-second parts.
2 - The 5-second parts are converted to spectrogram images. Values changed to get 255x256 images:
- Pixels per second: 51
- Bandwidth: 205
3 - One row of pixels is added at the end of the height (not at the start) to get 256x256 images.
4 - The parts that do not contain vocals are removed from the dataset by removing the images that have only 0 pixel values.
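Steps 3 and 4 above can be sketched in plain Python, with an image represented as a list of pixel rows. The helper names are hypothetical. The sketch assumes that the 255-pixel dimension is the height (as step 3 implies), that the added row is filled with 0, and that the matching music image is dropped together with an all-zero vocal image.

```python
def pad_bottom_row(image, target_height=256, fill=0):
    """Append rows of `fill` at the end of the image until it has target_height rows."""
    missing = target_height - len(image)
    if missing < 0:
        raise ValueError("image is taller than target_height")
    width = len(image[0])
    return [list(row) for row in image] + [[fill] * width for _ in range(missing)]


def is_all_zero(image):
    """Return True when every pixel in the image is 0."""
    return all(pixel == 0 for row in image for pixel in row)


def keep_vocal_pairs(pairs):
    """Keep only the (music, vocal) pairs whose vocal image is not all zeros."""
    return [(music, vocal) for music, vocal in pairs if not is_all_zero(vocal)]


# Toy 255x256 images: one vocal image with a single non-zero pixel, one silent.
vocal = [[0] * 256 for _ in range(255)]
vocal[10][20] = 128
silent = [[0] * 256 for _ in range(255)]
music = [[7] * 256 for _ in range(255)]

padded = pad_bottom_row(vocal)
kept = keep_vocal_pairs([(music, vocal), (music, silent)])
print(len(padded), len(padded[0]), len(kept))
```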
Training Details
- I trained the model for 10 epochs with the pix2pix implementation.
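pix2pix-style implementations commonly expect each training example as a single image with the input and the target placed side by side. Whether this repo prepared its pairs that way is an assumption. A minimal sketch of the pairing, with the hypothetical helper `combine_side_by_side`, could look like this:

```python
def combine_side_by_side(left, right):
    """Concatenate two images (lists of pixel rows) horizontally.

    Assumption: both images have the same height, e.g. 256 rows each.
    """
    if len(left) != len(right):
        raise ValueError("images must have the same height")
    return [list(a) + list(b) for a, b in zip(left, right)]


# Toy 256x256 music and vocal spectrograms.
music_img = [[1] * 256 for _ in range(256)]
vocal_img = [[2] * 256 for _ in range(256)]
pair = combine_side_by_side(music_img, vocal_img)
print(len(pair), len(pair[0]))
```

With the music spectrogram on the left and the vocal spectrogram on the right, the model would learn the music-to-vocal direction.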
Further Improvements and Limitations
I will update this part
- pix2pixHD.
- transparency problem of spectrograms.
