VocalSeparationAI

This repo is a self-learning project: music tracks and their vocals are converted to their respective spectrograms, and a vocal separation AI is then trained on these spectrograms.

Some Test Results

| Music | Vocal | AI Output |
| :------------ |:---------------:| -----:|
| | | |
| Music Listen | Vocal Listen | AI Output Listen |
| | | |
| Music Listen | Vocal Listen | AI Output Listen |

The dataset

  • I used the DSD100 dataset as the source of music and vocal tracks.
  • The music and vocal tracks are split into 5-second parts and converted to spectrograms, which make up the dataset I created.
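
The splitting step can be sketched roughly as follows. This is a minimal illustration, not the code used in this repo: the function name `split_into_parts`, the sample rate, and the choice to drop a trailing remainder shorter than 5 seconds are all assumptions.

```python
import numpy as np

def split_into_parts(samples, sample_rate, part_seconds=5):
    """Split a 1-D waveform into consecutive fixed-length parts.

    Assumption: a trailing remainder shorter than part_seconds is dropped.
    """
    part_len = int(sample_rate * part_seconds)
    n_parts = len(samples) // part_len
    return [samples[i * part_len:(i + 1) * part_len] for i in range(n_parts)]

# Synthetic 12-second mono signal; the sample rate is a hypothetical value.
sample_rate = 44100
music = np.zeros(12 * sample_rate, dtype=np.float32)
parts = split_into_parts(music, sample_rate)
```

The same function would be applied to the music track and to its vocal track, so that part `i` of one lines up with part `i` of the other.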

Tools

  • I used the provided repo to convert music to spectrograms and spectrograms back to music.
  • I used the pix2pix-tensorflow implementation to train the model.
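
pix2pix-style training works on paired images, and one common layout places the input and the target side by side in a single image. The sketch below shows that pairing for a music spectrogram and its vocal spectrogram. Whether this repo prepared its pairs this way is an assumption, and `make_pair` is a hypothetical helper.

```python
import numpy as np

def make_pair(music_img, vocal_img):
    """Place the input (music) and target (vocal) spectrograms side by side."""
    if music_img.shape != vocal_img.shape:
        raise ValueError("music and vocal spectrograms must have the same shape")
    # Concatenate along the width axis: (H, W) + (H, W) -> (H, 2W)
    return np.concatenate([music_img, vocal_img], axis=1)

# Two synthetic 256x256 grayscale spectrograms
music_img = np.full((256, 256), 40, dtype=np.uint8)
vocal_img = np.full((256, 256), 10, dtype=np.uint8)
pair = make_pair(music_img, vocal_img)
```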

Preprocessing Details

1 - The music and vocal tracks are split into 5-second parts.

2 - The 5-second parts are converted to spectrogram images. Values changed to get 255x256 images:

  • Pixels per second : 51
  • Bandwidth : 205

3 - A 1-pixel row is added at the end of the height (not at the start) to bring the images to 256x256.

4 - Parts that do not contain vocals are removed from the dataset by removing the images that contain only 0 pixel values.
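
Steps 3 and 4 can be sketched with NumPy as below. This is an illustration under stated assumptions, not the repo's own code: images are taken to be grayscale arrays of shape (255, 256) in (height, width) order, the added row is filled with zeros, and the helper names are hypothetical.

```python
import numpy as np

def pad_height_to_256(img):
    """Append rows at the END of the height axis until the image is 256 rows tall.

    Assumption: the added row is filled with zeros.
    """
    missing = 256 - img.shape[0]
    return np.pad(img, ((0, missing), (0, 0)), mode="constant", constant_values=0)

def has_vocals(vocal_img):
    """A vocal spectrogram with only 0 pixel values is treated as silent."""
    return bool(np.any(vocal_img != 0))

def build_pairs(music_imgs, vocal_imgs):
    """Keep only the (music, vocal) pairs whose vocal image is not all zeros."""
    kept = []
    for music_img, vocal_img in zip(music_imgs, vocal_imgs):
        if has_vocals(vocal_img):
            kept.append((pad_height_to_256(music_img), pad_height_to_256(vocal_img)))
    return kept

# Synthetic example: one silent vocal part and one part with vocal energy
music = np.full((255, 256), 30, dtype=np.uint8)
silent = np.zeros((255, 256), dtype=np.uint8)
voiced = silent.copy()
voiced[100, 50] = 200
pairs = build_pairs([music, music], [silent, voiced])
```

Because the padding value is zero, padding before or after the silence check gives the same result.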

Training Details

  • I trained the model for 10 epochs with the pix2pix implementation.

Further Improvements and Limitations

I will update this part

  • pix2pixHD.
  • transparency problem of spectrograms.
