facebookresearch / Denoiser: Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model working on the raw waveform that runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains, using multiple loss functions. Empirical evidence shows that it is capable of removing various kinds of background noise, including stationary and non-stationary noise as well as room reverb. Additionally, we suggest a set of data augmentation techniques applied directly to the raw waveform which further improve the model's performance and generalization.
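As a minimal illustration of the causality constraint such streaming models rely on (a sketch of the general idea, not the Denoiser model itself), a causal 1-D convolution can be obtained by left-padding the input so each output sample depends only on past and present input:

```python
def causal_conv1d(x, kernel):
    """Causal 1-D convolution: y[t] = sum_j kernel[j] * x[t - j].

    Left-padding with len(kernel) - 1 zeros guarantees y[t] never looks at
    future samples, which is what makes real-time streaming inference possible.
    """
    k = len(kernel)
    padded = [0.0] * (k - 1) + list(x)
    return [sum(kernel[j] * padded[t + k - 1 - j] for j in range(k))
            for t in range(len(x))]

# A pure-delay kernel shifts the signal right by one sample:
print(causal_conv1d([5.0, 6.0, 7.0], [0.0, 1.0]))  # [0.0, 5.0, 6.0]
```

Stacking such convolutions (with downsampling in the encoder and transposed convolutions in the decoder) preserves the property that the output at time t depends only on input up to t.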
galgreshler / Catch A Waveform: Official PyTorch implementation of the paper "Catch-A-Waveform: Learning to Generate Audio from a Single Short Example" (NeurIPS 2021)
YijianZhou / PALM: An Earthquake Detection and Location Architecture for Continuous Seismograms: Phase Picking, Association, Location, and Matched Filter
philipperemy / Very Deep Convnets Raw Waveforms: TensorFlow - Very Deep Convolutional Neural Networks for Raw Waveforms - https://arxiv.org/pdf/1610.00087.pdf
kyungyunlee / SampleCNN Pytorch: PyTorch implementation of "Sample-level Deep Convolutional Neural Networks for Music Auto-tagging Using Raw Waveforms"
tae-jun / Resemul: A TensorFlow+Keras implementation of "Sample-level CNN Architectures for Music Auto-tagging Using Raw Waveforms"
tae-jun / Sample Cnn: A TensorFlow implementation of "Sample-level Deep Convolutional Neural Networks for Music Auto-tagging Using Raw Waveforms"
RBenita / DIFFAR: Denoising Diffusion Autoregressive Model for Raw Speech Waveform Generation
wngh1187 / RawNeXt: PyTorch implementation of RawNeXt: Speaker verification system for variable-duration utterances with deep layer aggregation and dynamic scaling policies
labhamlet / Wavjepa: The official codebase for WavJEPA, a time-domain audio foundation model for holistic downstream tasks, from the paper "Self-supervised learning from raw waveforms unlock robust audio foundation models".
dogeplusplus / Cat Alan: Audio classification of domestic cat sounds (hungry, angry, purring, etc.) using raw waveforms. PyTorch implementation of the M5 architecture.
Anwarvic / CNN For Raw Waveforms: This is my PyTorch implementation of the "Very Deep Convolutional Neural Networks For Raw Waveforms" research paper published in 2016.
PrithivirajManiram / Robotic EXoskeleton For Arm Rehabilitation (REXAR): Rehabilitating people afflicted with elbow joint ailments is quite challenging. Studies reveal that rehabilitation through robotic devices, in particular exoskeleton robots, exhibits promising results. In this work, a 1-degree-of-freedom active upper-limb exoskeleton robot with an artificial-intelligence-aided myoelectric control system has been developed for elbow joint rehabilitation. Raw surface electromyogram (sEMG) signals from seventeen subjects at five different elbow joint angles were acquired using the Myo armband. Time-domain statistical features (waveform length, root mean square, variance, and number of zero crossings) were extracted, and the most advantageous feature was investigated for an Artificial Neural Network (ANN, a backpropagation network trained with the Levenberg-Marquardt algorithm) and a Support Vector Machine (SVM) with a Gaussian kernel. The results show that waveform length consumes the least computation time. With waveform length as the input feature, the ANN and SVM achieved average overall classification accuracies of 91.33% and 91.03% respectively; moreover, the SVM consumed 36% more time than the ANN for classification.
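The four time-domain features named above are each a one-pass computation over a raw sEMG window; a sketch using the standard definitions from the EMG literature (not the repository's exact code):

```python
import math

def waveform_length(x):
    # Cumulative absolute first difference: a cheap proxy for signal complexity.
    return sum(abs(x[i + 1] - x[i]) for i in range(len(x) - 1))

def root_mean_square(x):
    return math.sqrt(sum(v * v for v in x) / len(x))

def variance(x):
    m = sum(x) / len(x)
    return sum((v - m) ** 2 for v in x) / len(x)

def zero_crossings(x):
    # Count sign changes between consecutive samples.
    return sum(1 for i in range(len(x) - 1) if x[i] * x[i + 1] < 0)

window = [1.0, -1.0, 1.0, -1.0]   # toy sEMG window
print(waveform_length(window))     # 6.0
print(zero_crossings(window))      # 3
```

Waveform length's low cost is visible here: it needs only one subtraction and one absolute value per sample, with no mean or square root.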
gayanechilingar / Change Emotions: Nonparallel Emotional Speech Conversion with MUNIT.
Introduction: This is a TensorFlow implementation of the paper Nonparallel Emotional Speech Conversion (https://arxiv.org/pdf/1811.01174.pdf). It is an end-to-end voice conversion system which can change the speaker's emotion, for example neutral to angry, or sad to happy. The model aims at generating speech with the desired emotion while keeping the original linguistic content and speaker identity. It first extracts acoustic features from the raw audio, then learns the mapping from the source emotion to the target emotion in the feature space, and finally puts the features back together to rebuild the waveform. The following features are considered:
Features: fundamental frequency (log F_0), converted by a logarithm Gaussian normalized transformation; power envelope, converted by a logarithm Gaussian normalized transformation; Mel-cepstral coefficients (MCEPs), a representation of the spectral envelope, trained by CycleGAN; aperiodicities (APs), used directly without modification.
Dependencies: Python 3.5, NumPy 1.15, TensorFlow 1.8, LibROSA 0.6, FFmpeg 4.0, PyWorld
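The logarithm Gaussian normalized transformation used for F0 (and the power envelope) is a statistics-matching step in the log domain; a minimal sketch, assuming the mean and standard deviation of log F0 over voiced frames have been precomputed for both emotions:

```python
import math

def log_gaussian_convert(values, src_mean, src_std, tgt_mean, tgt_std):
    """Match log-domain Gaussian statistics of the source to the target.

    src_mean/src_std and tgt_mean/tgt_std are the mean and standard deviation
    of log F0 over voiced frames for the source and target emotion (assumed
    precomputed; parameter names are illustrative, not the repo's API).
    """
    out = []
    for v in values:
        if v <= 0:                       # unvoiced frames carry F0 == 0
            out.append(0.0)
        else:
            z = (math.log(v) - src_mean) / src_std   # normalize in log domain
            out.append(math.exp(tgt_mean + z * tgt_std))
    return out

# A frame exactly at the source mean maps onto the target mean (100 Hz -> ~200 Hz):
converted = log_gaussian_convert([100.0, 0.0],
                                 math.log(100), 0.2,
                                 math.log(200), 0.3)
print(converted)
```

The same transform applies unchanged to the power envelope; only the precomputed statistics differ.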
jfainberg / Sincnet Adapt: Raw waveform adaptation with SincNet
adipiz99 / RawNetLite: Implementation of the paper "RawNetLite: Lightweight End-to-End Audio Deepfake Detection"
aminul-huq / Speech Command Classification: Speech command classification on the Speech Commands v0.02 dataset using PyTorch and torchaudio. In this example, three models have been trained, using the raw signal waveform, MFCC features, and MelSpectrogram features respectively.
magnumresearchgroup / AuxiliaryRawNet: Auxiliary Raw Net (ARawNet) is an ASVspoof detection model that takes both the raw waveform and handcrafted features as inputs, to balance the trade-off between performance and model complexity.
WeiqiangLiIEEC / Gnssr Spaceborne Cwf Processing: A processing tool for the spaceborne GNSS-R complex waveform products available on IEEC's GOLD-RTR server. The project provides an example of accessing, searching, and analyzing the complex waveform products derived from GNSS-R raw IF data collected by different spaceborne missions (e.g. TDS-1, CYGNSS, BuFeng-1, SPIRE).
yongxuUSTC / Convolutional Autoencoder For Raw Waveform Reconstruction: A convolutional autoencoder for raw waveform reconstruction, intended to replace the classic STFT; we call it the short-time AE transform (STAET).
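The idea of replacing the STFT's fixed DFT basis with a learned per-frame transform can be sketched in a few lines of NumPy; the random invertible matrix below stands in for a trained encoder, and its inverse for the decoder (toy shapes and a linear transform for illustration, not the repository's trained model):

```python
import numpy as np

rng = np.random.default_rng(0)
frame = 8                                  # analysis window length (assumed)
x = rng.standard_normal(frame * 4)         # toy waveform: 4 non-overlapping frames

# The STFT applies a fixed DFT matrix to each frame; a convolutional
# autoencoder learns the analysis transform instead. A random invertible
# matrix W stands in for the learned encoder weights here.
W = rng.standard_normal((frame, frame))

frames = x.reshape(-1, frame)              # short-time framing, hop == frame
coeffs = frames @ W.T                      # analysis ("encoder")
recon = (coeffs @ np.linalg.inv(W).T).ravel()  # synthesis ("decoder")

print(np.allclose(recon, x))               # perfect reconstruction
```

A real STAET-style model would use strided convolutions (overlapping frames) and a nonlinear decoder, but the framing-transform-inverse structure is the same.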