DeepLearningForAudioWithPython
Code and slides for the "Deep Learning (for Audio) with Python" course on The Sound of AI YouTube channel.
This repository is a comprehensive collection of resources and code for understanding and implementing deep learning models for audio tasks. It serves as a practical guide, starting from the absolute basics (building neurons and backpropagation from scratch), moving to TensorFlow implementation, and culminating in building a complete Music Genre Classification system using various architectures (MLP, CNN, RNN-LSTM).
Note on Versioning
While this v2 release is fully functional and optimized for current environments, it may differ from the original version shown in the course. The codebase has been updated to reflect modern best practices (e.g. TensorFlow 2.16+, Librosa 0.11+) and improved dependency management. Consequently, the original course version has been deprecated; however, it remains available in the legacy branch for those wishing to follow the video content exactly.
Dataset Setup (GTZAN)
To run the music genre classification lessons (Parts 4 and 5), you need the GTZAN dataset. We provide an automated downloader that handles the download, extraction, and folder organization for you.
- Quick Start: Run `python dataset_downloader.py` from the root directory.
- Prerequisites: Install the dependencies listed in `requirements.txt`.
Full Instructions: Please check the Instructions GTZAN file for detailed help using the downloader script or manual download steps.
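After the download, a quick sanity check can confirm the dataset is in place. The sketch below is not part of the repository. It assumes the audio ends up in one subfolder per genre, which is how GTZAN is commonly distributed (10 genres with 100 clips each). The function name `count_tracks_per_genre`, the accepted file extensions, and the path in the usage note are illustrative assumptions, so adjust them to match the layout the downloader actually produces.

```python
from pathlib import Path

# Assumption: GTZAN clips are stored as .wav (or .au in older copies).
AUDIO_EXTENSIONS = {".wav", ".au"}


def count_tracks_per_genre(dataset_dir):
    """Return {genre_folder_name: number_of_audio_files} for a dataset folder.

    Assumes one subfolder per genre; files directly in dataset_dir are ignored.
    """
    counts = {}
    for genre_dir in sorted(Path(dataset_dir).iterdir()):
        if not genre_dir.is_dir():
            continue
        counts[genre_dir.name] = sum(
            1
            for f in genre_dir.iterdir()
            if f.is_file() and f.suffix.lower() in AUDIO_EXTENSIONS
        )
    return counts
```

Calling `count_tracks_per_genre("path/to/genres")` (a hypothetical path) and checking that each genre reports the expected number of clips is a cheap way to catch an interrupted download before starting a long preprocessing run.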
Course Structure
Part 1: Fundamentals & Math
- Course Overview: Video | Slides
- AI, Machine Learning and Deep Learning: Video | Slides
- Implementing an Artificial Neuron from Scratch: Video | Slides | Code
- Vector and Matrix Operations: Video | Slides
- Computation in Neural Networks: Video | Slides
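As a taste of the "artificial neuron from scratch" lesson, here is a minimal sketch of a single neuron: a weighted sum of the inputs passed through a sigmoid activation. It is written for illustration and is not copied from the course code. The function names `sigmoid` and `activate` and the omission of a bias term are assumptions of this sketch.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def activate(inputs, weights):
    """Compute the output of one neuron (no bias term in this sketch)."""
    # Net input: sum of each input multiplied by its weight.
    net_input = sum(x * w for x, w in zip(inputs, weights))
    # Activation: apply the sigmoid to the net input.
    return sigmoid(net_input)


if __name__ == "__main__":
    print(activate([0.5, 0.3, 0.2], [0.4, 0.7, 0.2]))
```

With all inputs at zero the net input is zero, so the neuron outputs exactly 0.5, the midpoint of the sigmoid.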
Part 2: Neural Networks from Scratch
- Implementing a Neural Network from Scratch: Video | Code
- Training a Neural Network (Backprop & Gradient Descent): Video | Slides
- Implementing Backpropagation from Scratch: Video | Code
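The ideas in this part (forward propagation, backpropagation, gradient descent) fit in a few dozen lines of plain Python. The sketch below is an illustration under simplifying assumptions, not the course implementation: one hidden layer, a single output, sigmoid activations everywhere, no biases, and the error E = 0.5 * (output - target)^2. The class name `TinyMLP` and its method names are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyMLP:
    """inputs -> hidden layer -> 1 output, sigmoid activations, no biases."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)  # seeded so runs are reproducible
        self.w_hidden = [
            [rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)
        ]
        self.w_out = [rng.uniform(-1, 1) for _ in range(n_hidden)]

    def forward(self, inputs):
        """Return (hidden activations, output activation)."""
        hidden = [
            sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in self.w_hidden
        ]
        output = sigmoid(sum(w * h for w, h in zip(self.w_out, hidden)))
        return hidden, output

    def train_step(self, inputs, target, learning_rate=0.1):
        """One gradient descent update on one sample; returns the error before it."""
        hidden, output = self.forward(inputs)
        # Chain rule at the output: dE/dz_out = (output - target) * sigmoid'(z_out).
        delta_out = (output - target) * output * (1.0 - output)
        # Propagate the error back through the (not yet updated) output weights.
        delta_hidden = [
            delta_out * w * h * (1.0 - h) for w, h in zip(self.w_out, hidden)
        ]
        # Gradient descent: move each weight against its gradient.
        self.w_out = [
            w - learning_rate * delta_out * h for w, h in zip(self.w_out, hidden)
        ]
        self.w_hidden = [
            [w - learning_rate * d * x for w, x in zip(row, inputs)]
            for row, d in zip(self.w_hidden, delta_hidden)
        ]
        return 0.5 * (output - target) ** 2
```

Calling `train_step` repeatedly over a set of samples is the whole training loop. Vectorizing the same steps with matrices is what makes the approach scale to real networks.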
Part 3: TensorFlow & Audio Preprocessing
- Implementing a Neural Network with TensorFlow 2: Video | Code
- Understanding Audio Data for Deep Learning: Video | Slides
- Preprocessing Audio Data (MFCCs/Spectrograms): Video | Code
Part 4: Music Genre Classification Project (MLP)
- Preparing the Dataset: Video | Code
- Implementing a Neural Network for Classification: Video | Slides | Code
- Solving Overfitting: Video | Slides | Code
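When preparing GTZAN for training, one common approach is to split each 30-second clip into shorter segments and extract MFCCs per segment, which multiplies the number of training examples. This README does not confirm that the lesson code works exactly this way, so treat the sketch below as an assumption-laden illustration of the bookkeeping involved. The sample rate (22050 Hz) and hop length (512) match Librosa's defaults, the frame count assumes centered framing (1 + samples // hop length), and `num_segments=10` and the function names are hypothetical.

```python
SAMPLE_RATE = 22050   # Librosa's default sample rate (Hz)
TRACK_DURATION = 30   # GTZAN clips are about 30 seconds long
HOP_LENGTH = 512      # Librosa's default hop length (samples)


def segment_bounds(num_segments, sample_rate=SAMPLE_RATE, duration=TRACK_DURATION):
    """Return (start, end) sample indices for each equal-length segment."""
    samples_per_segment = (sample_rate * duration) // num_segments
    return [
        (i * samples_per_segment, (i + 1) * samples_per_segment)
        for i in range(num_segments)
    ]


def frames_per_segment(num_segments, sample_rate=SAMPLE_RATE,
                       duration=TRACK_DURATION, hop_length=HOP_LENGTH):
    """Number of feature frames per segment, assuming centered framing."""
    samples_per_segment = (sample_rate * duration) // num_segments
    return 1 + samples_per_segment // hop_length


if __name__ == "__main__":
    print(segment_bounds(10)[:2])
    print(frames_per_segment(10))
```

Knowing the expected frame count up front makes it easy to discard segments that come out shorter (for example, from clips slightly under 30 seconds), so every training example has the same shape.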
Part 5: Advanced Architectures (CNN & RNN-LSTM)
- Convolutional Neural Networks (CNN) Explained: Video | Slides
- Implementing a CNN for Music Genre Classification: Video | Code
- Recurrent Neural Networks (RNN) Explained: Video | Slides
- Long Short Term Memory (LSTM) Explained: Video | Slides
- Implementing an RNN-LSTM for Music Genre Classification: Video | Code
How to Run the Scripts
To ensure the models and scripts execute correctly, please follow these steps from your terminal:
1. Prepare the Environment (Recommended)
Before running any script, make sure the required dependencies are installed:
pip install -r requirements.txt
2. Navigate to the Lesson Folder
Each class is self-contained. Move into the specific directory for the lesson you are studying:
cd class/folder/name # Replace with the specific class directory
3. Execute the Script
Run the main script using Python:
python mlp.py # Replace with the specific script name
<!-- Reference links for every chapter:
YouTube videos (#yt), PDF-file slides (#sl) and Jupyter Notebooks (#nb) -->
