MixCycle: Unsupervised Speech Separation via Cyclic Mixture Permutation Invariant Training
This repository contains the audio samples and the source code that accompany the paper.
Audio samples
We provide audio samples to demonstrate the results of the MixCycle method on two different datasets: LibriMix and REAL-M.
We also provide audio samples from the baseline methods, PIT-DM and MixIT, on LibriMix.
Note that the provided REAL-M samples are the ones used in the informal listening test.
Source code
We provide the source code under the src directory for reproducibility.
Running the experiments
Prepare the datasets
Create the environment
Install Anaconda and run the following command:
$ conda env create -f environment.yml
See the conda documentation for more information on managing environments.
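For reference, a minimal conda environment file for a project like this might look like the sketch below. The package names and versions here are assumptions for illustration only; use the `environment.yml` shipped in the repository, which pins the actual dependencies.

```yaml
# Hypothetical sketch of a conda environment file.
# The repository's environment.yml is authoritative.
name: mixcycle
channels:
  - pytorch
  - defaults
dependencies:
  - python=3.9
  - pytorch
  - pip
  - pip:
      - tensorboard
```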
Activate the environment
$ conda activate mixcycle
Run the experiments
$ cd src
$ python experiment.py --librimix-root ~/datasets/librimix --exp-root ~/experiments --run librimix_irm
$ python experiment.py --librimix-root ~/datasets/librimix --exp-root ~/experiments --run librimix_5p
$ python experiment.py --librimix-root ~/datasets/librimix --exp-root ~/experiments --run librimix_100p
$ python experiment.py --librimix-root ~/datasets/librimix --realm-root ~/datasets/REAL-M-v0.1.0 --exp-root ~/experiments --run realm
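The runs above train separation models with mixture permutation invariant training. As a rough illustration of the mixture-level PIT idea, and not the paper's exact objective, the sketch below scores every assignment of estimated sources to reference mixtures and keeps the best one. The function name `mix_pit_loss`, the MSE criterion, and the requirement that every mixture receive at least one source are all assumptions made for this example.

```python
import itertools

import numpy as np


def mix_pit_loss(est_sources, ref_mixtures):
    """Hypothetical mixture-level PIT loss (illustrative sketch only).

    est_sources:  array of shape (S, T), estimated source signals.
    ref_mixtures: array of shape (M, T), reference mixture signals.

    Each estimated source is assigned to exactly one mixture; the
    sources assigned to a mixture are summed and compared to it by
    MSE. The minimum over all assignments that cover every mixture
    is returned.
    """
    num_sources = est_sources.shape[0]
    num_mixtures = ref_mixtures.shape[0]
    best = np.inf
    # Enumerate every assignment of sources to mixtures.
    for assign in itertools.product(range(num_mixtures), repeat=num_sources):
        if len(set(assign)) < num_mixtures:
            # Skip assignments that leave some mixture without a source.
            continue
        sums = np.zeros_like(ref_mixtures)
        for src_idx, mix_idx in enumerate(assign):
            sums[mix_idx] += est_sources[src_idx]
        loss = np.mean((sums - ref_mixtures) ** 2)
        best = min(best, loss)
    return best
```

For two perfectly separated estimates, the loss is zero regardless of the order in which the references are given, which is the point of the permutation-invariant minimum.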
Optionally, you can monitor the training process with TensorBoard by running:
$ tensorboard --logdir ~/experiments
Citation (BibTeX)
If you find this repository useful, please cite our work:
@article{karamatli2022unsupervised,
  title={MixCycle: Unsupervised Speech Separation via Cyclic Mixture Permutation Invariant Training},
  author={Karamatl{\i}, Ertu{\u{g}} and K{\i}rb{\i}z, Serap},
  journal={IEEE Signal Processing Letters},
  volume={29},
  pages={2637--2641},
  year={2022},
  doi={10.1109/LSP.2022.3232276}
}