SpecAugmentPyTorch

A PyTorch implementation, with batch and channel support, of Google Brain's SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition

SpecAugment

An implementation of SpecAugment for PyTorch.

How to use

Install PyTorch (version 1.6.0 was used for testing).

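import torch
from spec_augment_pytorch import SpecAugmentTorch
from spec_augment_pytorch import visualization_spectrogram

# The names appear to follow the SpecAugment paper [3] (assumption, not confirmed here):
# W = time-warp, F/mF = frequency-mask width/count, T/mT = time-mask width/count,
# p = upper bound on the fraction of frames one time mask may cover.
p = {'W': 40, 'F': 29, 'mF': 2, 'T': 50, 'p': 1.0, 'mT': 2, 'batch': False}
specaug_fn = SpecAugmentTorch(**p)

# Input layout: [batch, c, frequency, n_frame]
# c=1 for a magnitude or mel spectrogram, c=2 for a complex STFT
spec = torch.randn(1, 1, 257, 150)  # c=1 here, so this is a single-channel spectrogram
spec_aug = specaug_fn(spec)  # [b, c, f, t]
visualization_spectrogram(spec_aug[0][0], "blabla")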
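To make the masking part of the augmentation concrete, here is a minimal sketch of frequency and time masking on a `[batch, c, frequency, n_frame]` tensor. It is not this repository's code: `mask_sketch` is a hypothetical helper, the parameter meanings are taken from the SpecAugment paper [3], masked cells are simply set to zero, the same masks are shared across the batch, and time warping (`W`) is left out entirely.

```python
import torch


def mask_sketch(spec, F=29, mF=2, T=50, mT=2, p=1.0, generator=None):
    """Zero random frequency bands and time spans of a [b, c, f, t] tensor.

    Hypothetical illustration only; not the SpecAugmentTorch implementation.
    """
    out = spec.clone()  # leave the caller's tensor untouched
    n_freq, n_frame = out.shape[-2], out.shape[-1]

    def rand_int(high):
        # Uniform integer in [0, high); callers guarantee high >= 1.
        return int(torch.randint(0, high, (1,), generator=generator).item())

    # Frequency masking: mF bands, each with a width drawn from [0, F].
    max_f = min(F, n_freq)
    for _ in range(mF):
        f = rand_int(max_f + 1)
        f0 = rand_int(n_freq - f + 1)
        out[..., f0:f0 + f, :] = 0.0

    # Time masking: mT spans, each with a width drawn from [0, T],
    # additionally capped at p * n_frame frames.
    max_t = min(T, int(p * n_frame), n_frame)
    for _ in range(mT):
        t = rand_int(max_t + 1)
        t0 = rand_int(n_frame - t + 1)
        out[..., t0:t0 + t] = 0.0

    return out


spec = torch.rand(1, 1, 257, 150)
aug = mask_sketch(spec)
print(aug.shape)  # same shape as the input
```

Passing a seeded `torch.Generator` as `generator` makes the masks reproducible, which can help when comparing augmented and original spectrograms side by side.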

Run the command python spec_augment_pytorch.py to generate examples (processed WAV files and spectrogram images).
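The script's own audio pipeline is not reproduced here. As a rough sketch of how a waveform could be brought into the `[batch, c, frequency, n_frame]` layout and back, the following uses `torch.stft` and `torch.istft`. The helper names, `n_fft=512` (which yields 257 frequency bins) and `hop=128` are assumptions for illustration, and the `return_complex` argument requires a PyTorch release newer than the 1.6.0 used for testing.

```python
import torch


def wav_to_stft_channels(wav, n_fft=512, hop=128):
    """[batch, samples] waveform -> [batch, 2, n_fft // 2 + 1, n_frame] (real, imag)."""
    window = torch.hann_window(n_fft, device=wav.device)
    stft = torch.stft(wav, n_fft=n_fft, hop_length=hop, window=window,
                      return_complex=True)  # complex tensor, [b, f, t]
    # Split the complex values into a trailing (real, imag) axis, then move it to dim 1.
    return torch.view_as_real(stft).permute(0, 3, 1, 2).contiguous()


def stft_channels_to_wav(spec, n_fft=512, hop=128, length=None):
    """Inverse of wav_to_stft_channels: [batch, 2, f, t] -> [batch, samples]."""
    window = torch.hann_window(n_fft, device=spec.device)
    stft = torch.view_as_complex(spec.permute(0, 2, 3, 1).contiguous())
    return torch.istft(stft, n_fft=n_fft, hop_length=hop, window=window, length=length)


def magnitude(spec):
    """[batch, 2, f, t] real/imag tensor -> [batch, 1, f, t] magnitude spectrogram."""
    return spec.pow(2).sum(dim=1, keepdim=True).sqrt()


wav = torch.randn(1, 16000)            # stand-in for one second of 16 kHz audio
spec = wav_to_stft_channels(wav)       # c=2 layout
mag = magnitude(spec)                  # c=1 layout
rec = stft_channels_to_wav(spec, length=wav.shape[-1])
print(spec.shape, mag.shape, rec.shape)
```

Either `spec` (c=2) or `mag` (c=1) has the layout the usage example expects. Only the c=2 form can be inverted back to audio directly, since the magnitude alone discards phase.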

<p align="center">
  <img src="./examples/1089-0001.png" alt="1089-0001: spectrogram" width="85%"/>
  <img src="./examples/1089-0001-SpecAug.png" alt="1089-0001-SpecAug: augmented spectrogram" width="85%"/>
  <img src="./examples/1089-0002.png" alt="1089-0002: spectrogram" width="85%"/>
  <img src="./examples/1089-0002-SpecAug.png" alt="1089-0002-SpecAug: augmented spectrogram" width="85%"/>
</p>

Reference

[1] DemisEom/SpecAugment

[2] zcaceres/spec_augment issue17

[3] SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition
