SpeechRecognition

Classifies sound signals as 'Links', 'Midden' or 'Rechts' (Dutch for left, middle and right). Features are computed with a Mel-frequency filterbank by summing the frequency-domain power that falls within each filter. Dynamic Time Warping (DTW) is used to align the unknown word with several labelled exemplars of each word we are looking for. A k-nearest-neighbours vote over the resulting distances then gives the most likely class for the unknown word.
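The feature step described above can be sketched roughly as follows. This is a minimal illustration in base MATLAB, not the code from this repository: the sampling rate, frame length, hop size, FFT size, number of filters and all variable names are assumptions, and a synthetic tone stands in for a recorded word.

```matlab
% Sketch only: not the repository's code. All parameter values are assumed.
fs = 16000;                 % assumed sampling rate in Hz
t = (0:fs/2-1)' / fs;       % 0.5 s of sample times (column vector)
x = sin(2*pi*440*t);        % synthetic tone standing in for a recorded word

frameLen = 400;             % 25 ms frames (assumed)
hop = 160;                  % 10 ms hop (assumed)
nfft = 512;                 % FFT size (assumed)
nFilt = 20;                 % number of mel filters (assumed)

% Build a triangular mel filterbank, one row per filter.
hz2mel = @(f) 2595 * log10(1 + f / 700);
mel2hz = @(m) 700 * (10.^(m / 2595) - 1);
edgesHz = mel2hz(linspace(hz2mel(0), hz2mel(fs/2), nFilt + 2));
binHz = (0:nfft/2) * fs / nfft;          % frequency of each FFT bin
fbank = zeros(nFilt, nfft/2 + 1);
for k = 1:nFilt
    lo = edgesHz(k); mid = edgesHz(k+1); hi = edgesHz(k+2);
    rising = (binHz - lo) / (mid - lo);  % upward slope of the triangle
    falling = (hi - binHz) / (hi - mid); % downward slope of the triangle
    fbank(k, :) = max(0, min(rising, falling));
end

% Frame the signal, window it, take the power spectrum,
% and sum the power that falls inside each filter.
win = 0.54 - 0.46 * cos(2*pi*(0:frameLen-1)' / (frameLen-1));  % Hamming window
nFrames = 1 + floor((length(x) - frameLen) / hop);
feats = zeros(nFrames, nFilt);           % one feature row per frame
for n = 1:nFrames
    idx = (n-1)*hop + (1:frameLen);
    spec = fft(x(idx) .* win, nfft);
    powSpec = abs(spec(1:nfft/2 + 1)).^2;
    feats(n, :) = (fbank * powSpec)';
end
```

Each row of feats is then one frame's feature vector, so a word becomes a sequence of vectors whose length depends on how long the word was spoken. That varying length is why an alignment step such as DTW is needed before distances can be compared.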

Install / Use

/learn @SvenWientjes/SpeechRecognition

This repository holds MATLAB programs that perform parts of a simple speech recognition pipeline. It was written for a thesis in which three words have to be recognized ('Links', 'Midden', 'Rechts').

How to test the current version:

  1. Download all the files in the master branch.
  2. Place them all in the same folder. Make sure the sound files are accessible from your working directory.
  3. In MATLAB, set that folder as the working directory.
  4. Open WordClassificationScript.m for 1NN classification or WordClassKnn.m for 3NN (can easily be extended to more or fewer neighbours).
  5. In the audioread() commands under 'Loading the Signals', change the path to the folder where you placed the exemplar files.
  6. Click the Run button.
  7. A string printed in the command window tells you what the classifier thinks the unknown word is.
  8. Three plots appear, each showing a Dynamic Time Warping path of least cost. The title of each plot states which word it belongs to.
  9. To test another recording, change the audioread() call for 'Woord =' under 'Loading the Signals' to any other sound file in your folder (this is the test pattern).
  10. Check whether the classification is correct.
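To give an idea of what the scripts in the steps above compute, here is a minimal sketch of DTW distances followed by a k-nearest-neighbour vote. It is not the repository's code: the toy one-dimensional sequences stand in for filterbank feature sequences, and the variable names, the exemplar values and the absolute-difference local cost are assumptions made for illustration.

```matlab
% Sketch only: toy 1-D sequences stand in for per-frame feature vectors.
exemplars = {[1 2 3 2 1], [1 1 2 3 2 1], ...
             [3 2 1 2 3], [3 3 2 1 2 3], ...
             [1 3 1 3 1], [1 1 3 1 3 1]};
labels = {'Links', 'Links', 'Midden', 'Midden', 'Rechts', 'Rechts'};
unknown = [1 2 2 3 2 1];    % the test pattern
k = 3;                      % number of neighbours that vote

% DTW distance from the unknown word to every exemplar.
dists = zeros(1, numel(exemplars));
for e = 1:numel(exemplars)
    ref = exemplars{e};
    n = numel(unknown);
    m = numel(ref);
    D = inf(n + 1, m + 1);  % accumulated-cost matrix with a padded border
    D(1, 1) = 0;
    for i = 1:n
        for j = 1:m
            % Local cost; for feature vectors this would be e.g. a Euclidean norm.
            cost = abs(unknown(i) - ref(j));
            D(i+1, j+1) = cost + min([D(i, j+1), D(i+1, j), D(i, j)]);
        end
    end
    dists(e) = D(n + 1, m + 1);   % cost of the least-cost warping path
end

% k-nearest-neighbour vote over the DTW distances.
[~, order] = sort(dists);
nearest = labels(order(1:k));
[classes, ~, idx] = unique(nearest);
votes = accumarray(idx(:), 1);
[~, best] = max(votes);
predicted = classes{best};
fprintf('The unknown word is most likely: %s\n', predicted);
```

With k = 1 the vote reduces to picking the label of the single closest exemplar, which corresponds to the 1NN case. The warping path itself could be recovered by backtracking through D from its last cell to its first.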
