TowardsRobustTranscription
Towards Robust Transcription: Exploring Noise Injection Strategies for Training Data Augmentation
Towards Robust Transcription
Supplementary Materials for ISMIR 2024 LBD
Y. Kim and A. Lerch, "Towards Robust Transcription: Exploring Noise Injection Strategies for Training Data Augmentation," accepted for presentation at the Late-Breaking Demo Session of the 25th International Society for Music Information Retrieval Conference (ISMIR), San Francisco, USA, 2024. [arXiv]
Code
- plot_results.ipynb: Generates plots for the figures in the paper.
- significance_test.ipynb: Performs t-tests to assess the significance of model performance differences between clean data (CNR = ∞) and perturbed datasets (CNR = {0, 1/3, 1, 3}).
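As a rough illustration of the kind of comparison significance_test.ipynb describes, the sketch below runs a t-test between scores from the clean condition and one perturbed condition. It is not taken from the notebook: the helper name `compare_to_clean`, the choice of a paired test (`scipy.stats.ttest_rel`), the 0.05 threshold, and the score values are all assumptions made for illustration, and the notebook may use a different t-test variant.

```python
from scipy import stats


def compare_to_clean(clean_scores, perturbed_scores, alpha=0.05):
    """Paired t-test between per-piece scores from two conditions.

    Hypothetical helper: assumes both lists hold one score per test piece,
    in the same order, so that the samples can be treated as paired.
    """
    if len(clean_scores) != len(perturbed_scores):
        raise ValueError("a paired test needs one score per piece in both conditions")
    result = stats.ttest_rel(clean_scores, perturbed_scores)
    return {
        "t": float(result.statistic),
        "p": float(result.pvalue),
        "significant": bool(result.pvalue < alpha),
    }


# Placeholder numbers for illustration only; these are NOT results from the paper.
clean = [0.90, 0.88, 0.93, 0.91, 0.89, 0.92]      # e.g. scores at CNR = ∞
perturbed = [0.84, 0.83, 0.88, 0.85, 0.82, 0.86]  # e.g. scores at one finite CNR

outcome = compare_to_clean(clean, perturbed)
print(outcome)
```

If the two conditions were evaluated on different pieces, an independent-samples test such as `scipy.stats.ttest_ind` would be the more appropriate call.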
Results
- inference_results.md: A Markdown file containing the inference results for Figures 1 and 2 from the paper.
- inference_results.tex: A LaTeX file presenting the inference results for Figures 1 and 2 from the paper.
- inference_results.pdf: A PDF-rendered version of inference_results.tex.
- significance_test_results.md: A Markdown file presenting t-test results comparing clean (CNR = ∞) and perturbed datasets (CNR = {0, 1/3, 1, 3}).
- significance_test_results.tex: A LaTeX file presenting t-test results comparing clean (CNR = ∞) and perturbed datasets (CNR = {0, 1/3, 1, 3}).
- significance_test_results.pdf: A PDF-rendered version of significance_test_results.tex.
Contact
- Yonghyun Kim
Email: yonghyun.kim@gatech.edu
