Bregmisi

Phase recovery with Bregman divergences for audio source separation

This repository contains the code for reproducing the experiments in our paper Phase recovery with Bregman divergences for audio source separation, published at the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021.

Getting the data

After cloning or downloading this repository, you will need to get the speech and noise data to reproduce the results.

  • The speech data is obtained from the VoiceBank dataset available here. You should download the clean_testset_wav.zip file, and unzip it in the data/VoiceBank/ folder.

  • The noise data is obtained from the DEMAND dataset available here. You should download the DLIVING_16k.zip, SPSQUARE_16k.zip and TBUS_16k.zip files, and unzip them in the data/DEMAND/ folder.

Note that you can change the folder structures, as long as you change the speech and noise directory paths accordingly in the code.

Then, simply execute the prepare_data.py script to create the noisy mixtures.
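For intuition, creating a noisy mixture usually means scaling a noise excerpt to reach a chosen signal-to-noise ratio and adding it to the speech. The sketch below illustrates that idea only; it is not the code of prepare_data.py. The function name `mix_at_snr`, the 5 dB target, and the random signals standing in for audio are all assumptions, and the 16 kHz rate is merely suggested by the `_16k` file names.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio is `snr_db`, then add it.

    Hypothetical helper: the actual mixing procedure in prepare_data.py may differ.
    """
    # Trim both signals to a common length.
    length = min(len(speech), len(noise))
    speech, noise = speech[:length], noise[:length]

    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)

    # Gain such that speech_power / (gain**2 * noise_power) == 10**(snr_db / 10).
    gain = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    scaled_noise = gain * noise
    return speech + scaled_noise, speech, scaled_noise


# Random signals stand in for one second of speech and a longer noise excerpt.
rng = np.random.default_rng(0)
speech_in = rng.standard_normal(16000)
noise_in = rng.standard_normal(20000)

mixture, speech_out, noise_out = mix_at_snr(speech_in, noise_in, snr_db=5.0)
achieved_snr_db = 10 * np.log10(np.mean(speech_out ** 2) / np.mean(noise_out ** 2))
print(f"Achieved SNR: {achieved_snr_db:.2f} dB")
```

Returning the trimmed speech and the scaled noise alongside the mixture keeps the ground-truth sources available for later evaluation.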

Getting the pre-trained model

To run the experiments, you first need to estimate the spectrograms of the sources. This is done using the PyTorch implementation of the Open-Unmix model, trained for a speech enhancement task.

The pre-trained model for estimating the speech and noise spectrograms is available here. You should place the .json and .pth files in the open_unmx/ folder. Note that you should also rename the .pth files simply as speech.pth and noise.pth.
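Before running the experiments, it can help to confirm that the model folder has the expected content. The following is a minimal sketch, not part of the repository: the helper name `missing_model_files` is hypothetical, and since the name of the .json file is not specified here, the check only looks for any file with that extension.

```python
from pathlib import Path


def missing_model_files(model_dir="open_unmx"):
    """Return the expected model files that are absent from `model_dir`.

    Hypothetical helper: it checks for speech.pth, noise.pth and at least
    one .json file, as described in the instructions above.
    """
    model_dir = Path(model_dir)
    missing = [
        name
        for name in ("speech.pth", "noise.pth")
        if not (model_dir / name).is_file()
    ]
    # The .json file name is not specified, so accept any .json file.
    if not model_dir.is_dir() or not any(model_dir.glob("*.json")):
        missing.append("*.json")
    return missing


missing = missing_model_files("open_unmx")
if missing:
    print("Missing model files:", ", ".join(missing))
else:
    print("Model folder looks complete.")
```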

Reproducing the experiments

Now that you're all set, simply run the following scripts:

  • validation.py will perform a grid search over the gradient step size on the validation subset to determine its optimal value for every setting. It will also reproduce Fig. 1 from the paper.

  • testing.py will run the algorithms (proposed gradient descent and MISI) on the test subset and plot the results corresponding to Fig. 2 in the paper.
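The grid search over the gradient step size can be pictured as follows. This is a toy sketch under stated assumptions, not the code of validation.py: a small least-squares problem stands in for the phase recovery objective, the candidate step sizes are arbitrary, and the selection criterion (lowest final loss) is an assumption, since the criterion used in the paper is not described here.

```python
import numpy as np


def final_loss(step_size, n_iter=50):
    """Run plain gradient descent on a toy problem and return the final loss.

    Toy stand-in for the phase recovery algorithm: minimise ||A x - b||^2.
    """
    A = np.array([[3.0, 0.0], [0.0, 1.0]])
    b = np.array([3.0, 2.0])
    x = np.zeros(2)
    for _ in range(n_iter):
        grad = 2 * A.T @ (A @ x - b)
        x = x - step_size * grad
    return float(np.sum((A @ x - b) ** 2))


def grid_search(step_sizes, loss_fn):
    """Evaluate `loss_fn` for every step size and return the best one."""
    losses = {step: loss_fn(step) for step in step_sizes}
    best = min(losses, key=losses.get)
    return best, losses


# Hypothetical grid: too small converges slowly, too large diverges.
step_sizes = [1e-3, 1e-2, 1e-1, 1.0]
best_step, losses = grid_search(step_sizes, final_loss)

for step, loss in losses.items():
    print(f"step size {step:g}: final loss {loss:.3e}")
print("Selected step size:", best_step)
```

In this toy problem the largest step size diverges and the smallest ones converge too slowly within the iteration budget, which is the kind of trade-off a validation grid search is meant to resolve.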

Reference

<details><summary>If you use any of this code for your research, please cite our paper:</summary>
@inproceedings{Magron2021,  
  author={P. Magron and P.-H. Vial and T. Oberlin and C. F{\'e}votte},  
  title={Phase recovery with {B}regman divergences for audio source separation},  
  booktitle={Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},  
  year={2021},
  month={June}
}
</details>
