SANAC

Official repo of ICASSP 2021 paper Source-Aware Neural Speech Coding for Noisy Speech Compression (SANAC)

Source-Aware Neural Speech Coding for Noisy Speech Compression

Yang, Haici, et al. "Source-Aware Neural Speech Coding for Noisy Speech Compression." ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2021.

  • Paper: https://arxiv.org/abs/2008.12889
  • Webpage: https://minjekim.com/research-projects/sanac/

Prerequisites

  • Python 3.6.8
  • torch 1.6.0
  • torchaudio 0.6.0

Dataset

Model training

Main hyper-parameters and their default settings for model training:

| Parameter | Default | Description |
| --- | --- | --- |
| filters | 100 | Output channel size of the encoder |
| d | 1 | Dimension of each code |
| m | 32 | Number of codes in the codebook |
| sr | True | Whether to use super-resolution-based downsampling |
| lr | 0.0001 | Learning rate |
| br | 8 | Bitrate (khz) |
| scale | 1000 | Scale that controls the hardness of the softmax function |
| label | time.strftime("%m%d_%H%M%S") | Model label |
| weight_mse | 30 | Loss weight for the MSE (waveform) term |
| weight_mel | 0.5 | Loss weight for the mel-spectrogram term |
| weight_qtz | 0.5 | Loss weight for the quantization term |
| weight_etp_total | 0.1 | Loss weight for the total entropy |
| weight_etp_ratio | 0.05 | Loss weight for the entropy ratio between source and noise |
| ratio | 1.0 | Ratio of assigned bitrate between source and noise |
| update_ratio | False | Whether to update the ratio during training |
| db | 0 | Initial SDR of the input data in dB, 0 or 5 |
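To illustrate what the `scale` parameter does, here is a minimal sketch of softmax-based quantization against a scalar codebook (`d = 1`, `m = 32`). It is not taken from the repository: the function names, the evenly spaced codebook, and the absolute-distance measure are assumptions made for illustration only. It shows that a large `scale` pushes the soft (softmax-weighted) output toward the nearest code, while a small `scale` blends many codes.

```python
import math


def soft_assign(x, codebook, scale):
    """Softmax over negatively scaled distances between x and every code.

    Hypothetical helper: the distance measure (absolute difference) is an
    assumption, not necessarily what the repository uses.
    """
    logits = [-scale * abs(x - c) for c in codebook]
    peak = max(logits)  # subtract the max logit for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def quantize(x, codebook, scale):
    """Return (soft, hard) quantized values of the scalar x."""
    probs = soft_assign(x, codebook, scale)
    # Soft output: probability-weighted average of the codes.
    soft = sum(p * c for p, c in zip(probs, codebook))
    # Hard output: the single most probable (nearest) code.
    hard = codebook[max(range(len(codebook)), key=probs.__getitem__)]
    return soft, hard


# m = 32 scalar codes (d = 1), evenly spaced in [-1, 1] purely for illustration.
M = 32
codebook = [-1.0 + 2.0 * i / (M - 1) for i in range(M)]

x = 0.1
soft_lo, hard_lo = quantize(x, codebook, scale=1.0)     # soft assignment
soft_hi, hard_hi = quantize(x, codebook, scale=1000.0)  # nearly hard assignment

print("scale=1:    soft", round(soft_lo, 4), "hard", round(hard_lo, 4))
print("scale=1000: soft", round(soft_hi, 4), "hard", round(hard_hi, 4))
```

With the default `scale = 1000`, the soft and hard outputs in this sketch are practically identical, which is the sense in which the scale controls the "hardness" of the softmax.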

  • Train the proposed model: `python3 train_model.py`
  • Train the baseline model: `python3 train_base.py`
