186 skills found · Page 1 of 7
utterance / Utterances - :crystal_ball: A lightweight comments widget built on GitHub issues
kyutai-labs / Hibiki - Hibiki is a model for streaming speech translation (also known as simultaneous translation). Unlike offline translation, where one waits for the end of the source utterance to start translating, Hibiki adapts its flow to accumulate just enough context to produce a correct translation in real time, chunk by chunk.
Yue-plus / Hexo Theme Arknights - A Hexo theme styled after the Rhodes Island faction from Arknights, with support for math formulas, Mermaid diagrams, and multiple comment systems (Valine, Gitalk, Waline, Artalk, Utterances, Giscus)
facebookresearch / TaBERT - This repository contains source code for the TaBERT model, a pre-trained language model for learning joint representations of natural language utterances and (semi-)structured tables for semantic parsing. TaBERT is pre-trained on a massive corpus of 26M Web tables and their associated natural language context, and can be used as a drop-in replacement for a semantic parser's original encoder to compute representations for utterances and table schemas (columns).
WeidiXie / VGG Speaker Recognition - Utterance-level Aggregation for Speaker Recognition in the Wild
timmahrt / PraatIO - A Python library for working with Praat, TextGrids, time-aligned audio transcripts, and audio files. It is primarily used for extracting features from, and making manipulations on, audio files given hierarchical time-aligned transcriptions (utterance > word > syllable > phone, etc.).
lumaku / Ctc Segmentation - Segment an audio file and obtain utterance alignments. (Python package)
Shahabks / My Voice Analysis - My-Voice Analysis is a Python library for the analysis of voice (simultaneous speech, high entropy) without the need for a transcription. It breaks utterances and detects syllable boundaries, fundamental frequency contours, and formants.
cooelf / DeepUtteranceAggregation - Modeling Multi-turn Conversation with Deep Utterance Aggregation (COLING 2018)
gionanide / Speech Signal Processing And Classification - Front-end speech processing aims at extracting proper features from short-term segments of a speech utterance, known as frames. It is a prerequisite step toward any pattern recognition problem employing speech or audio (e.g., music). Here, we are interested in voice disorder classification: that is, developing two-class classifiers which can discriminate between utterances of a subject suffering from, say, vocal fold paralysis and utterances of a healthy subject. The mathematical modeling of the speech production system in humans suggests that an all-pole system function is justified [1-3]. As a consequence, linear prediction coefficients (LPCs) constitute a first choice for modeling the magnitude of the short-term spectrum of speech. LPC-derived cepstral coefficients are guaranteed to discriminate between the system (e.g., vocal tract) contribution and that of the excitation. Taking into account the characteristics of the human ear, the mel-frequency cepstral coefficients (MFCCs) emerged as descriptive features of the speech spectral envelope. Similarly to MFCCs, the perceptual linear prediction coefficients (PLPs) can also be derived. These, so to speak, traditional features will be tested against agnostic features extracted by convolutive neural networks (CNNs) (e.g., auto-encoders) [4]. The pattern recognition step will be based on Gaussian Mixture Model based classifiers, K-nearest neighbor classifiers, Bayes classifiers, as well as Deep Neural Networks. The Massachusetts Eye and Ear Infirmary Dataset (MEEI-Dataset) [5] will be exploited. At the application level, a library for feature extraction and classification in Python will be developed. Credible publicly available resources, such as KALDI, will be used toward achieving our goal. Comparisons will be made against [6-8].
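The description above starts from splitting an utterance into short-term frames before LPC/MFCC-style feature extraction. A minimal sketch of that framing step, using NumPy; the 25 ms window and 10 ms hop at 16 kHz are illustrative assumptions, not values taken from the repository:

```python
import numpy as np

def frame_signal(signal, frame_len=400, hop=160):
    """Split a 1-D speech signal into overlapping short-term frames.

    frame_len=400 and hop=160 correspond to 25 ms windows with a
    10 ms shift at a 16 kHz sampling rate (illustrative values).
    """
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    frames = np.stack([signal[i * hop : i * hop + frame_len]
                       for i in range(n_frames)])
    # Apply a Hamming window to each frame before spectral analysis,
    # as is standard ahead of LPC/MFCC feature extraction.
    return frames * np.hamming(frame_len)

# Example: one second of a synthetic 16 kHz "utterance"
x = np.random.randn(16000)
frames = frame_signal(x)
print(frames.shape)  # (98, 400)
```

Each row of the result is one windowed frame, ready for spectral analysis (e.g., an FFT followed by mel filterbanks for MFCCs).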
chin-gyou / Dialogue Utterance Rewriter - No description available
royshil / Obs Cleanstream - CleanStream is an OBS plugin that uses AI to clean live audio streams from unwanted words and utterances
alexa-js / Alexa Utterances - Generate expanded utterances for Amazon Alexa from a template string
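Template-to-utterance expansion of the kind this repo provides can be sketched in a few lines. The `{a|b}` alternation syntax below is an illustrative assumption, not alexa-utterances' actual grammar (the library itself is JavaScript):

```python
import itertools
import re

def expand(template):
    """Expand a template like "turn {on|off} the {lights|fan}"
    into every concrete utterance it describes."""
    # Split on {...} groups; odd indices hold the alternations,
    # even indices hold the literal text between them.
    parts = re.split(r"\{([^}]*)\}", template)
    choices = [p.split("|") if i % 2 else [p]
               for i, p in enumerate(parts)]
    return ["".join(combo) for combo in itertools.product(*choices)]

print(expand("turn {on|off} the {lights|fan}"))
# ['turn on the lights', 'turn on the fan',
#  'turn off the lights', 'turn off the fan']
```

The Cartesian product over the alternation groups is what makes the output grow multiplicatively with each added slot, which is why such tools exist: hand-writing every variant quickly becomes impractical.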
liu-nlper / Dialogue Utterance Rewriter - Reproduction of the ACL 2019 paper: Improving Multi-turn Dialogue Modelling with Utterance ReWriter
npow / Ubottu - Next Utterance Classification
hyungkwonko / Chart Llm - Vega-Lite Chart Dataset and NL Generation Framework using LLMs
awakening-ai / ReactMotion - ReactMotion: Generating Reactive Listener Motions from Speaker Utterance
declare-lab / Dialogue Understanding - This repository contains the PyTorch implementation of the baseline models from the paper Utterance-level Dialogue Understanding: An Empirical Study
declare-lab / Contextual Utterance Level Multimodal Sentiment Analysis - Context-Dependent Sentiment Analysis in User-Generated Videos
utterance / Utterances Oauth - :lock: OAuth flow for the utterances and utterance-bot APIs