PyACA
Python scripts accompanying the book "An Introduction to Audio Content Analysis" (www.AudioContentAnalysis.org)
The source code shows example implementations of basic approaches, features, and algorithms for music audio content analysis.
All implementations are also available in Matlab and C++ (see "related repositories and links").
functionality
The top-level functions are (alphabetical):
- computeBeatHisto: calculates a simple beat histogram
- computeChords: simple chord recognition
- computeFeature: calculates instantaneous features
- computeFingerprint: audio fingerprint extraction
- computeKey: calculates a simple key estimate
- computeMelSpectrogram: computes a mel spectrogram
- computeNoveltyFunction: simple onset detection
- computePitch: calculates a fundamental frequency estimate
- computeSpectrogram: computes a magnitude spectrogram
The names of the additional functions follow these conventions:
- Feature*: instantaneous features
- Pitch*: pitch tracking approaches
- Novelty*: novelty function computation
- Tool*: additional helper functions and basic algorithms such as
  - Blocking of audio into overlapping blocks
  - Pre-processing audio
  - Conversion (freq2bark, freq2mel, freq2midi, mel2freq, midi2freq)
  - Filterbank (Gammatone)
  - Gaussian Mixture Model
  - Principal Component Analysis
  - Feature Selection
  - Dynamic Time Warping
  - K-Means Clustering
  - K Nearest Neighbor classification
  - Non-Negative Matrix Factorization
  - Viterbi algorithm
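As an illustration of the kind of conversion helper listed above, the following is a minimal sketch of a frequency/MIDI pitch conversion using the standard equal-temperament relation with A4 = 440 Hz = MIDI pitch 69. The function names `freq_to_midi` and `midi_to_freq` and the `f_a4` parameter are hypothetical and chosen for this sketch; they are not the names or signatures used in pyACA.

```python
import math


def freq_to_midi(f_hz, f_a4=440.0):
    """Convert a frequency in Hz to a (fractional) MIDI pitch.

    Hypothetical helper for illustration; assumes A4 = f_a4 Hz = MIDI pitch 69.
    """
    return 69.0 + 12.0 * math.log2(f_hz / f_a4)


def midi_to_freq(midi_pitch, f_a4=440.0):
    """Convert a (fractional) MIDI pitch back to a frequency in Hz."""
    return f_a4 * 2.0 ** ((midi_pitch - 69.0) / 12.0)


# one octave above A4 is 12 semitones higher
print(freq_to_midi(880.0))
print(midi_to_freq(69.0))
```

Both functions are exact inverses of each other up to floating-point rounding, so a round trip through `freq_to_midi` and `midi_to_freq` should return the input frequency.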
documentation
The latest full documentation of this package can be found at https://alexanderlerch.github.io/pyACA.
design principles
Please note that the provided code examples are only intended to showcase algorithmic principles – they are not entirely suitable for practical usage without parameter optimization and additional algorithmic tuning. Rather, they are meant to show how to implement audio analysis solutions and to facilitate algorithmic understanding, enabling readers to design and implement their own analysis approaches.
minimal dependencies
The required dependencies are reduced to a minimum, more specifically to only numpy and scipy, for the following reasons:
- accessibility, i.e., clear algorithmic implementation from scratch without obfuscation by using 3rd party implementations,
- maintainability through independence of 3rd party code.
This design choice brings, however, some limitations; for instance, reading of non-RIFF audio files is not supported and the machine learning models are very simple.
readability
Consistent variable naming and formatting, together with a preference for simple implementations, make the code easier to read. This readability sometimes comes at the cost of lower performance.
cross-language comparability
All code is matched exactly with the Matlab implementations and the equations in the book. This also means that the Python code might violate typical Python style conventions in order to stay consistent.
related repositories and links
The Python source code in this repository is matched with corresponding source code in the Matlab repository. A C++ implementation with identical functionality can be found in the C++ repository.
Other, related repositories are
- ACA-Slides: slide decks for teaching and learning audio content analysis
- ACA-Plots: Matlab scripts for generating all plots in the book and slides
The main entry point to all book-related information is AudioContentAnalysis.org
getting started
installation
pip install pyACA
code examples
example 1: computation and plot of the Spectral Centroid
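import numpy as np
import matplotlib.pyplot as plt
import pyACA
# file to analyze
cPath = "c:/temp/test.wav"
# extract feature (v: feature values, t: time stamps)
[v, t] = pyACA.computeFeatureCl(cPath, "SpectralCentroid")
# plot feature output (np.squeeze removes singleton dimensions)
plt.plot(t, np.squeeze(v))
plt.show()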
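To make clearer what the feature in example 1 measures: under one common definition, the spectral centroid is the magnitude-weighted mean frequency of each block. The following is a minimal numpy-only sketch of that idea, not pyACA's implementation. The function name `spectral_centroid`, the Hann window, the block and hop lengths, and the use of block start times as time stamps are all assumptions made for illustration; pyACA's windowing and time stamps may differ.

```python
import numpy as np


def spectral_centroid(x, f_s, block_length=1024, hop_length=512):
    """Return (centroid in Hz per block, block start times in s).

    Illustrative sketch only; all parameters are hypothetical defaults.
    """
    num_blocks = 1 + (len(x) - block_length) // hop_length
    window = np.hanning(block_length)
    # center frequency of each FFT bin in Hz
    freqs = np.fft.rfftfreq(block_length, d=1.0 / f_s)
    centroid = np.zeros(num_blocks)
    t = np.arange(num_blocks) * hop_length / f_s
    for n in range(num_blocks):
        start = n * hop_length
        block = x[start:start + block_length]
        mag = np.abs(np.fft.rfft(block * window))
        norm = mag.sum()
        if norm > 0:  # leave the centroid at 0 for silent blocks
            centroid[n] = np.sum(freqs * mag) / norm
    return centroid, t


# synthetic test signal: one second of a 1 kHz sinusoid at 8 kHz sample rate
f_s = 8000
x = np.sin(2 * np.pi * 1000.0 * np.arange(f_s) / f_s)
v, t = spectral_centroid(x, f_s)
print(v.shape, t.shape)
```

For a pure sinusoid, the centroid of every block should lie very close to the sinusoid's frequency, which makes this a convenient sanity check.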
example 2: computation of two features (here: Spectral Centroid and Spectral Flux)
import pyACA
# read audio file
cPath = "c:/temp/test.wav"
[f_s, afAudioData] = pyACA.ToolReadAudio(cPath)
# compute feature
[vsc, t] = pyACA.computeFeature("SpectralCentroid", afAudioData, f_s)
[vsf, t] = pyACA.computeFeature("SpectralFlux", afAudioData, f_s)
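The second feature in example 2, the spectral flux, describes how much the magnitude spectrum changes from one block to the next. A minimal numpy sketch of one common formulation (Euclidean distance between consecutive magnitude spectra) is shown below. The function name `spectral_flux`, the convention that the first block has a flux of zero, and the normalization by the number of frequency bins are assumptions of this sketch and may not match pyACA exactly.

```python
import numpy as np


def spectral_flux(X):
    """Return the flux per block for a magnitude spectrogram X.

    X has shape (num_bins, num_blocks). Illustrative sketch only.
    """
    # difference to the previous block; the first block is compared
    # with itself, so its flux is 0 (an assumption of this sketch)
    diff = np.diff(X, axis=1, prepend=X[:, :1])
    # Euclidean norm over frequency, normalized by the number of bins
    return np.sqrt(np.sum(diff ** 2, axis=0)) / X.shape[0]


# toy spectrogram: 3 bins, 3 blocks; only the last block changes
X = np.array([[1.0, 1.0, 4.0],
              [0.0, 0.0, 0.0],
              [2.0, 2.0, 2.0]])
print(spectral_flux(X))
```

In the toy spectrogram, the first two blocks are identical, so only the last block yields a non-zero flux.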
