18 skills found
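yeyupiaoling / VoiceprintRecognition PaddlePaddle: This project uses several advanced voiceprint recognition models, including EcapaTdnn, ResNetSE, ERes2Net, and CAM++, and supports multiple data preprocessing methods such as MelSpectrogram, Spectrogram, MFCC, and Fbank.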
narcotic-sh / Senko: Very fast, accurate speaker diarization
csukuangfj / Kaldifeat: Kaldi-compatible online & offline feature extraction with PyTorch, supporting CUDA, batch processing, chunk processing, and autograd. Provides C++ & Python APIs.
csukuangfj / Kaldi Native Fbank: Kaldi-compatible online fbank extractor without external dependencies
mcimpoi / Deep Fbanks: Deep Filter Banks for Texture Recognition, Description and Segmentation (CVPR15)
echocatzh / Torch Mfcc: A librosa STFT/Fbank/MFCC feature extraction written in PyTorch using 1D convolutions.
ZitengWang / Python Kaldi Features: Python code to extract MFCC and FBANK speech features for Kaldi
a-n-rose / Build CNN Or LSTM Or CNNLSTM With Speech Features: A set of scripts that extract speech features (so far MFCCs, FBANKs, STFT, and kinda dominant frequency) and train CNN, LSTM, or CNN+LSTM models with those features.
Magic-Bubble / SpeechProcessForMachineLearning: Speech feature extraction for machine learning, including FBank and MFCC, with explanations of the principles and step-by-step implementations
DataXujing / ASR Paper: ASR tutorial at https://dataxujing.github.io/ASR-paper/
hangtingchen / MFCC: C code to extract MFCC or Fbank features from WAV files
YUCHEN005 / RATS Channel A Speech Data: This is a public repository for RATS Channel-A Speech Data, a paid noisy speech dataset distributed by LDC. Here we release its Log-Mel Fbank features and several raw waveform listening samples.
adam2go / Mfcc: Calculate MFCC/Fbank features for WAV files
manyeyes / KaldiNativeFbankSharp: C# wrapper for kaldi-native-fbank, used to extract audio features in speech recognition (ASR) tasks
QDPeng / Kaldi NDK Feature: No description available
pengzhendong / Online Fbank: No description available
manyeyes / SpeechFeatures: A C# library for extracting audio features in speech recognition (ASR) tasks; supports Kaldi fbank
ZhongshuHou / Personalized Speech Enhancement Demo: This is a demo page for an ongoing personalized speech enhancement (pSE) project. The speaker embedding is generated from the fbank and MFCC features of the enrollment speech.