README

VocalSeparationAI

This repo is a self-learning project in which music tracks and their vocals are converted to corresponding spectrograms, and a vocal separation AI is then trained on these spectrograms.

Some Test Results

| Music | Vocal | AI Output |
| :------------ |:---------------:| -----:|
| | | |
| Music Listen | Vocal Listen | AI Output Listen |
| | | |
| Music Listen | Vocal Listen | AI Output Listen |

The dataset

I used the DSD100 dataset as the source of music and vocal tracks.
- The Dataset
The music and vocal tracks were split into 5-second parts and converted to spectrograms. I then built a dataset from these spectrograms.
- The Spectrogram Dataset
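The 5-second split described above can be sketched roughly as follows. This is only an illustration, not the script used in this repo: the function name `split_into_chunks`, the 44.1 kHz sample rate, and the choice to drop a final part shorter than 5 seconds are all assumptions.

```python
import numpy as np

def split_into_chunks(samples, sample_rate, chunk_seconds=5):
    """Split a 1-D sample array into equal-length chunks.

    Assumption: a trailing part shorter than chunk_seconds is dropped.
    """
    chunk_len = int(sample_rate * chunk_seconds)
    n_chunks = len(samples) // chunk_len
    return [samples[i * chunk_len:(i + 1) * chunk_len] for i in range(n_chunks)]

# Example with synthetic audio: 12 seconds of silence at an assumed 44.1 kHz.
sample_rate = 44100
audio = np.zeros(12 * sample_rate, dtype=np.float32)
chunks = split_into_chunks(audio, sample_rate)
```

The same function would be applied to the music track and to its vocal track so that the parts stay aligned pair by pair.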

Tools

I used the following repo to convert music to spectrograms and spectrograms back to music:
- Spectrogram Program
I used a pix2pix-tensorflow implementation to train the model:
- Pix2Pix

Preprocessing Details

1 - The music and vocal tracks are split into 5-second parts.

2 - The 5-second parts are converted to spectrogram images. The following values were changed to get 255x256 images:

Pixels per second : 51
Bandwidth : 205

3 - One row of pixels is added at the end of the height (not at the start) to bring the images to 256x256.

4 - Parts that contain no vocals are removed from the dataset by dropping the images that have only 0 pixel values.
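Steps 3 and 4 can be sketched with NumPy as below. This is a minimal illustration under stated assumptions, not the code used in this repo: images are assumed to be arrays with 255 rows and 256 columns, the added row is assumed to be filled with zeros, the all-zero check is assumed to apply to the vocal image, and the names `pad_bottom_row`, `has_vocals`, and `filter_pairs` are hypothetical.

```python
import numpy as np

def pad_bottom_row(image):
    """Append one row at the end of the height axis (never at the start).

    Assumption: the added row is filled with zeros.
    """
    pad = [(0, 1)] + [(0, 0)] * (image.ndim - 1)
    return np.pad(image, pad, mode="constant", constant_values=0)

def has_vocals(vocal_image):
    """A vocal spectrogram with only 0 pixel values is treated as silent."""
    return bool(np.any(vocal_image != 0))

def filter_pairs(pairs):
    """Keep only (music, vocal) pairs whose vocal image is not all zeros."""
    return [(music, vocal) for music, vocal in pairs if has_vocals(vocal)]

# Example with synthetic 255x256 (rows x columns) images.
music = np.ones((255, 256), dtype=np.uint8)
silent_vocal = np.zeros((255, 256), dtype=np.uint8)
voiced_vocal = silent_vocal.copy()
voiced_vocal[10, 20] = 200

padded = pad_bottom_row(music)
kept = filter_pairs([(music, silent_vocal), (music, voiced_vocal)])
```

Filtering before or after padding gives the same result here, because a zero-filled row does not change whether an image is all zeros.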

Training Details

I trained the model for 10 epochs with the pix2pix implementation.
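Some pix2pix implementations expect each training example as a single image with the input and the target placed side by side. Whether that applies to the implementation used here is an assumption, and the helper `make_pair` below is hypothetical. Under that assumption, a music spectrogram and its vocal spectrogram could be combined like this:

```python
import numpy as np

def make_pair(music_image, vocal_image):
    """Place the input (music) on the left and the target (vocal) on the right."""
    if music_image.shape != vocal_image.shape:
        raise ValueError("input and target must have the same shape")
    return np.concatenate([music_image, vocal_image], axis=1)

# Example with synthetic 256x256 images.
music = np.full((256, 256), 120, dtype=np.uint8)
vocal = np.full((256, 256), 30, dtype=np.uint8)
pair = make_pair(music, vocal)
```

With 256x256 inputs, each combined image is 256 rows by 512 columns.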

Further Improvements and Limitations

I will update this part. Points to cover:

- pix2pixHD.
- The transparency problem of spectrograms.
