Results for "speech-recognition-library"


119 skills found · Page 1 of 4

zzmp / Juliusjs

2.6k

A speech recognition library for the web

universal

Updated 2mo ago

cmusphinx / Sphinx4

1.4k

Pure Java speech recognition library

universal

Updated 22d ago

sdkcarlos / Artyom.js

1.3k

A voice control, voice command, speech recognition, and speech synthesis JavaScript library. Create your own Siri, Google Now, or Cortana with Google Chrome within your website.

universal

recognition speech-recognition speech-synthesis +2

Updated 9d ago

alphacep / Vosk Server

1.2k

WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries

universal

asr grpc kaldi +6

Updated 2d ago

alphacep / Vosk Android Demo

1.0k

Offline speech recognition for Android with Vosk library.

universal

android asr kaldi +3

Updated 3d ago

astorfi / Speechpy

886

SpeechPy - A Library for Speech Processing and Recognition: http://speechpy.readthedocs.io/en/latest/

universal

feature-extraction python speech-recognition +1

Updated 1mo ago

ccoreilly / Vosk Browser

507

A speech recognition library running in the browser thanks to a WebAssembly build of Vosk

universal

asr kaldi speech-recognition +6

Updated 3d ago

echogarden-project / Echogarden

441

Cross-platform speech toolset, used from the command-line or as a Node.js library. Includes a variety of engines for speech synthesis, speech recognition, forced alignment, speech translation, voice isolation, language detection and more.

universal

command-line forced-alignment language-detection +11

Updated 15d ago

gionanide / Speech Signal Processing And Classification

257

Front-end speech processing aims at extracting proper features from short-term segments of a speech utterance, known as frames. It is a prerequisite step toward any pattern recognition problem employing speech or audio (e.g., music). Here, we are interested in voice disorder classification, that is, in developing two-class classifiers that can discriminate between utterances of a subject suffering from, say, vocal fold paralysis and utterances of a healthy subject. The mathematical modeling of the speech production system in humans suggests that an all-pole system function is justified [1-3]. As a consequence, linear prediction coefficients (LPCs) constitute a first choice for modeling the magnitude of the short-term spectrum of speech. LPC-derived cepstral coefficients are guaranteed to discriminate between the system (e.g., vocal tract) contribution and that of the excitation. Taking into account the characteristics of the human ear, the mel-frequency cepstral coefficients (MFCCs) emerged as descriptive features of the speech spectral envelope. Similarly to MFCCs, the perceptual linear prediction coefficients (PLPs) could also be derived. These traditional features, so to speak, will be tested against agnostic features extracted by convolutional neural networks (CNNs) (e.g., auto-encoders) [4]. The pattern recognition step will be based on Gaussian Mixture Model based classifiers, K-nearest neighbor classifiers, Bayes classifiers, as well as Deep Neural Networks. The Massachusetts Eye and Ear Infirmary Dataset (MEEI-Dataset) [5] will be exploited. At the application level, a library for feature extraction and classification in Python will be developed. Credible publicly available resources, such as KALDI, will be used toward achieving our goal. Comparisons will be made against [6-8].

Kaljurand / Dictate.js

217

A small JavaScript library for browser-based real-time speech recognition, which uses Recorderjs for audio capture and a WebSocket connection to the Kaldi GStreamer server for speech recognition.

universal

javascript kaldi-gstreamer-server recorderjs +2

Updated 9mo ago

sksalahuddin2828 / AI Personal Digital Assistant

213

AI Personal Voice Assistant Project (Male - Female version)

universal

artificial-intelligence colorama-librairy datetime +13

Updated 17d ago

Bear-03 / Vosk Rs

171

Rust bindings to the Vosk API Speech Recognition library

universal

Updated 6d ago

muaz-khan / Translator

138

Translator.js is a JavaScript library built on top of the Google Speech-Recognition & Translation API to transcribe and translate voice and text. It supports many locales and brings globalization to WebRTC! https://www.webrtc-experiment.com/Translator/

universal

webrtc webrtc-demos webrtc-experiments

Updated 1mo ago

H2CO3 / Libsprec

134

C library for speech recognition using the Google Speech API

universal

Updated 6mo ago

at16k / At16k

130

Trained models for automatic speech recognition (ASR). A library to quickly build applications that require speech to text conversion.

universal

asr asr-model automatic-speech-recognition +8

Updated 6mo ago

vikramezhil / DroidSpeech

127

Android library for continuous speech recognition

universal

Updated 1mo ago

alphacep / Vosk Unity Asr

120

Automatic Speech Recognition in Unity using Vosk library

universal

asr deepspeech speech-recognition +3

Updated 10d ago

nassosoassos / Sail Align

SailAlign is an open-source software toolkit for robust long speech-text alignment. It implements an adaptive, iterative speech recognition and text alignment scheme that allows for the processing of very long (and possibly noisy) audio and is robust to transcription errors. It is mainly written as a Perl library, but its functionality also depends on freely available software, namely HTK, srilm and sclite.

universal

Updated 5mo ago

riderodd / React Native Vosk

Speech recognition module for React Native using the Vosk library

universal

asr react-native speech-recognition +1

Updated 4d ago

abhishek305 / PyBot A ChatBot For Answering Python Queries Using NLP

PyBot aims to make learning the Python programming language more interactive. The chatbot tries to answer almost any Python-related issue or query the user asks about. We use NLP to improve the chatbot's efficiency, and we will add a voice feature for more interactivity.

By utilizing NLP, developers can organize and structure knowledge to perform tasks such as automatic summarization, translation, named entity recognition, relationship extraction, sentiment analysis, speech recognition, and topic segmentation. NLTK has been called “a wonderful tool for teaching and working in computational linguistics using Python” and “an amazing library to play with natural language.”

The main issue with text data is that it is all in text format (strings), whereas machine learning algorithms need some sort of numerical feature vector to perform their task. So before starting any NLP project, we need to pre-process the text to make it suitable for processing.

Converting the entire text to uppercase or lowercase ensures that the algorithm does not treat the same word in different cases as different words.

Tokenization is the term for converting normal text strings into a list of tokens, i.e. the words we actually want. A sentence tokenizer finds the list of sentences in a string, and a word tokenizer finds the list of words.

Removing noise means removing everything that is not a standard number or letter.

Removing stop words. Some extremely common words, which appear to be of little value in helping select documents matching a user need, are excluded from the vocabulary entirely. These words are called stop words.

Stemming is the process of reducing inflected (or sometimes derived) words to their stem, base, or root form, generally a written word form. For example, if we were to stem the words “Stems”, “Stemming”, “Stemmed”, and “Stemtization”, the result would be a single word, “stem”.

A slight variant of stemming is lemmatization. The major difference between the two is that stemming can often create non-existent words, whereas lemmas are actual words. So your root stem, meaning the word you end up with, is not necessarily something you can look up in a dictionary, but you can look up a lemma. For example, “run” is the base form of words like “running” or “ran”, and the words “better” and “good” share the same lemma, so they are considered the same.
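The pre-processing steps described in this entry (lowercasing, tokenization, noise removal, stop-word removal, and stemming) can be sketched in plain Python. This is a minimal illustration, not PyBot's actual code: the function names `tokenize`, `stem`, and `preprocess`, the tiny stop-word set, and the toy suffix-stripping rules are all assumptions made for the example. A real project would more likely use NLTK's tokenizers, stop-word lists, and stemmers. Lemmatization is not covered here because it needs a dictionary of word forms.

```python
import re

# Tiny illustrative stop-word list (an assumption; real lists are much larger).
STOP_WORDS = {"the", "a", "an", "is", "are", "and", "or", "of", "to", "in"}


def tokenize(text):
    """Lowercase the text and keep only runs of letters and digits.

    Lowercasing, tokenization, and noise removal happen in one step:
    anything that is not a letter or digit acts as a separator.
    """
    return re.findall(r"[a-z0-9]+", text.lower())


def stem(word):
    """Toy suffix-stripping stemmer (hypothetical rules, not a real algorithm)."""
    for suffix in ("ing", "ed"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            base = word[: -len(suffix)]
            # Collapse a doubled final consonant: "stemm" -> "stem".
            if base[-1] == base[-2] and base[-1] not in "aeiou":
                base = base[:-1]
            return base
    # Strip a plain plural "s", but leave words such as "class" alone.
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word


def preprocess(text):
    """Tokenize, drop stop words, and stem what remains."""
    return [stem(token) for token in tokenize(text) if token not in STOP_WORDS]


print(preprocess("The stems are Stemming, and stemmed!"))  # ['stem', 'stem', 'stem']
```

As the entry notes, a stemmer like this can produce strings that are not dictionary words, which is the main practical difference from lemmatization.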