Wav2aug

A general purpose task-agnostic speech augmentation policy

🎛️ Wav2Aug: Task-Agnostic Waveform Augmentation Beyond ASR

A minimalistic PyTorch-based audio augmentation library for speech. The goal is to provide a general-purpose augmentation policy that can be used on any task and performs reasonably well without tuning augmentation hyperparameters. Each call applies two random augmentations. Just install and start augmenting!

📦 Installation

pip

pip install wav2aug

uv

uv add wav2aug

🚀 Quick Start

import torch
from wav2aug.gpu import Wav2Aug

# Initialize the augmenter once
augmenter = Wav2Aug(sample_rate=16000)

# In the forward pass: a batch of 3 waveforms, 50,000 samples each
wavs = torch.randn(3, 50000)
lens = torch.ones(wavs.size(0))  # one length entry per waveform

aug_wavs, aug_lens = augmenter(wavs, lens)

# Or, without lengths:
aug_wavs = augmenter(wavs)

That's it!
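
Augmentation is usually wanted only during training. The sketch below shows one way to arrange that: a wrapper that runs the augmenter in training mode and skips it in eval mode. `TrainOnlyAugment`, `flip_polarity`, and `MeanPool` are hypothetical names used for illustration and are not part of wav2aug. The only assumption about the augmenter is that it is a callable taking a batch of waveforms and returning a batch of waveforms, as in the `aug_wavs = augmenter(wavs)` call above. A stand-in augmenter keeps the example runnable without the library.

```python
import torch
from torch import nn


class TrainOnlyAugment(nn.Module):
    """Hypothetical wrapper: augment inputs in training mode only."""

    def __init__(self, model: nn.Module, augmenter):
        super().__init__()
        self.model = model
        self.augmenter = augmenter  # e.g. Wav2Aug(sample_rate=16000)

    def forward(self, wavs: torch.Tensor) -> torch.Tensor:
        if self.training:
            # Augmentation is data preparation, so keep it out of autograd.
            with torch.no_grad():
                wavs = self.augmenter(wavs)
        return self.model(wavs)


# Stand-in augmenter so the sketch runs without wav2aug installed.
def flip_polarity(wavs: torch.Tensor) -> torch.Tensor:
    return -wavs


# Toy "model": mean over time, one value per utterance.
class MeanPool(nn.Module):
    def forward(self, wavs: torch.Tensor) -> torch.Tensor:
        return wavs.mean(dim=1)


wrapped = TrainOnlyAugment(MeanPool(), flip_polarity)
wavs = torch.randn(3, 50000)

wrapped.train()
train_out = wrapped(wavs)  # augmenter applied

wrapped.eval()
eval_out = wrapped(wavs)   # augmenter skipped
```

Switching between `wrapped.train()` and `wrapped.eval()` is then enough to turn augmentation on and off, with no separate flag to keep in sync.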

🧪 Augmentation Types

🔊 Amplitude Scaling/Clipping: Random gain and peak limiting
🌫️ Noise Addition: Environmental noise with SNR control
📶 Frequency Dropout: Spectral masking with random notch filters
🔄 Polarity Inversion: Random phase flip
🧩 Chunk Swapping: Temporal segment reordering
⏱️ Speed Perturbation: Time-scale modification
🕳️ Time Dropout: Random silence insertion
👥 Babble Noise: Multi-speaker background (auto-enabled with sufficient buffer)
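
To make a few of these concrete, here is a rough sketch of three of the simpler augmentations (gain with clipping, polarity inversion, time dropout) and of a policy that applies two of them per call. This is not the library's implementation: the function names, parameter ranges, and the choice to pick two distinct augmentations are all assumptions made for illustration.

```python
import torch


def random_gain_clip(wavs, min_gain=0.5, max_gain=1.5, peak=1.0):
    """Scale each utterance by a random gain, then hard-limit the peaks."""
    gain = torch.empty(wavs.size(0), 1, device=wavs.device)
    gain.uniform_(min_gain, max_gain)  # illustrative range, not the library's
    return (wavs * gain).clamp(-peak, peak)


def polarity_inversion(wavs, p=0.5):
    """Flip the sign of each utterance with probability p."""
    flip = torch.rand(wavs.size(0), 1, device=wavs.device) < p
    return torch.where(flip, -wavs, wavs)


def time_dropout(wavs, max_len=2000):
    """Zero one random span of up to max_len samples per utterance."""
    out = wavs.clone()
    total = wavs.size(1)
    for row in out:  # each row is a view, so the assignment edits `out`
        span = int(torch.randint(1, max_len + 1, (1,)))
        start = int(torch.randint(0, max(total - span, 0) + 1, (1,)))
        row[start:start + span] = 0.0
    return out


AUGMENTATIONS = [random_gain_clip, polarity_inversion, time_dropout]


def apply_two_random(wavs):
    """Pick two distinct augmentations at random and apply them in order."""
    picks = torch.randperm(len(AUGMENTATIONS))[:2].tolist()
    for i in picks:
        wavs = AUGMENTATIONS[i](wavs)
    return wavs


torch.manual_seed(0)
wavs = torch.randn(3, 50000)
aug = apply_two_random(wavs)
```

All random draws go through PyTorch's global RNG, so the seeding note below applies to this sketch as well.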

Randomness: all stochastic ops use PyTorch RNGs, so seeding PyTorch once is enough, e.g. torch.manual_seed(0); torch.cuda.manual_seed_all(0)
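
As a minimal illustration of the seeding note, the snippet below reseeds PyTorch and checks that a random draw repeats. `seed_everything` and `random_op` are hypothetical helpers, with `random_op` standing in for any stochastic op that draws from PyTorch's global RNG.

```python
import torch


def seed_everything(seed: int) -> None:
    """Seed PyTorch's CPU and CUDA generators."""
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # silently ignored if CUDA is unavailable


def random_op() -> torch.Tensor:
    # Stand-in for any stochastic op that uses PyTorch's global RNG.
    return torch.rand(4)


seed_everything(0)
first = random_op()

seed_everything(0)
second = random_op()

print(torch.equal(first, second))  # True
```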

🛠️ Development Installation

uv

git clone https://github.com/gfdb/wav2aug
cd wav2aug

# create venv and pin Python
uv venv
source .venv/bin/activate
uv python pin 3.10  # or 3.11/3.12

# runtime only
uv sync

# with extras (combine the flags to install both)
uv sync --extra dev --extra test

pip

git clone https://github.com/gfdb/wav2aug
cd wav2aug

# create venv
python -m venv .venv
source .venv/bin/activate

# runtime only
python -m pip install .

# editable + extras for development
python -m pip install -e '.[dev,test]'

✅ Tests

uv

uv run pytest -q tests/

pip

pytest -q tests/

🤝 Contributing

Issues and PRs are welcome and encouraged!
Bug reports: please open an issue with a minimal repro (env, dep versions, code snippet, expected vs. actual, traceback, etc.)
Feature requests: please open an issue with use-case and proposed feature.
PRs: add tests for new features or behavior changes, and run the formatters and tests before submitting!
