N7Speech
<p align="center"> <img src="https://img.shields.io/badge/SOTA-Manipuri%20(Meiteilon)%20ASR-blueviolet?style=for-the-badge" alt="SOTA Manipuri ASR"/> </p>

N7Speech is a state-of-the-art (SOTA) Automatic Speech Recognition (ASR) model for Manipuri (Meiteilon).
It delivers accurate, real-time and file-based speech-to-text for Manipuri, with output in either Meitei Mayek script or Latin phonemes.
The library accepts both microphone input and audio files (wav/mp3), and exposes a simple Python API for both modes.
🚀 Why N7Speech?
- State-of-the-Art (SOTA) performance for Manipuri (Meiteilon) ASR
- Fast, accurate, and robust for both real-time and file-based transcription
- Supports both Meitei Mayek and Latin phoneme outputs
- Easy to use, cross-platform, and GPU-accelerated
Author
Dayananda Thokchom
Features
- Real-time speech recognition from microphone with VAD (voice activity detection)
- Transcription from audio files (wav/mp3)
- Meitei Mayek to phoneme (Latin) conversion
- Simple, high-level API
- ONNX model backend for fast inference
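The library's VAD internals aren't documented in this README; as a rough illustration of what an energy-gate voice activity detector does (not N7Speech's actual implementation), consider:

```python
import math

def frame_rms(samples):
    """Root-mean-square energy of one frame of audio samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def simple_vad(samples, frame_len=400, threshold=0.02):
    """Return True for each frame whose RMS energy exceeds the threshold.

    Illustrative energy-threshold VAD, not the detector N7Speech ships.
    """
    flags = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        flags.append(frame_rms(frame) > threshold)
    return flags
```

A real detector would add smoothing and hangover frames so short pauses inside a sentence don't cut the utterance.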
Installation
Linux/macOS
pip install n7speech
Or for local development:
git clone https://github.com/yourusername/N7speech.git
cd N7speech
pip install .
Windows
- Install Python 3.7+ from python.org.
- Open Command Prompt as Administrator.
- Install the package:
pip install n7speech
- If you encounter issues with `sounddevice`, install the appropriate wheel from PyPI or use:
pip install pipwin
pipwin install sounddevice
GPU Acceleration (All Platforms)
You can install either `onnxruntime` (CPU) or `onnxruntime-gpu` (GPU) as needed.
`onnxruntime` is installed by default, but NVIDIA-GPU users are recommended to swap it out for much faster inference:
pip uninstall onnxruntime
pip install onnxruntime-gpu
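ONNX Runtime chooses among the execution providers available at runtime (with `onnxruntime-gpu` installed, `onnxruntime.get_available_providers()` typically includes `"CUDAExecutionProvider"`). The CUDA-first fallback logic can be sketched as a small helper (hypothetical, not part of n7speech):

```python
def pick_provider(available):
    """Prefer the CUDA provider when onnxruntime-gpu is installed,
    otherwise fall back to CPU.

    `available` mimics the list returned by
    onnxruntime.get_available_providers().
    """
    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    for provider in preferred:
        if provider in available:
            return provider
    raise RuntimeError("no supported execution provider found")
```

If only `onnxruntime` (CPU) is installed, the helper falls through to `"CPUExecutionProvider"`.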
Usage
Real-time microphone transcription
from n7speech import RealTimeSpeech
RealTimeSpeech(lang="mni-latin").start(lambda t: print(f"\nResult: {t}"))
Transcribe from audio file
from n7speech import speech_from_file
result = speech_from_file("your_audio.wav", lang="mni-latin")
print(result)
Use `lang="mni"` for Meitei Mayek output, `lang="mni-latin"` for Latin phoneme output.
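The romanization table the library uses for its Latin phoneme output isn't shown in this README. As a rough illustration of what Meitei Mayek-to-Latin conversion involves, a few letters of the Unicode Meetei Mayek block (U+ABC0 onward) could be mapped like this (illustrative mapping, not the library's actual table):

```python
# Illustrative Meitei Mayek -> Latin phoneme map (a few letters only);
# the table n7speech actually uses may differ.
MAYEK_TO_LATIN = {
    "\uABC0": "k",   # LETTER KOK
    "\uABC1": "s",   # LETTER SAM
    "\uABC2": "l",   # LETTER LAI
    "\uABC3": "m",   # LETTER MIT
    "\uABC4": "p",   # LETTER PA
    "\uABC5": "n",   # LETTER NA
}

def to_latin(text):
    """Replace known Meitei Mayek letters with Latin phonemes;
    pass everything else through unchanged."""
    return "".join(MAYEK_TO_LATIN.get(ch, ch) for ch in text)
```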
Platform Support
N7Speech is cross-platform and works on Linux, macOS, and Windows.
All dependencies (onnxruntime, torch, numpy, librosa, sounddevice) are available for these operating systems.
- For macOS and Windows users, make sure your Python environment and audio drivers are set up correctly for `sounddevice` and `torch`.
- For GPU acceleration, ensure you install the correct version of `onnxruntime-gpu` and have compatible CUDA drivers (on supported hardware).
Requirements
- Python 3.7+
- onnxruntime, or onnxruntime-gpu for GPU acceleration (highly recommended for fast transcription; e.g., a 20 s wav in ~110 ms)
- numpy
- librosa
- torch
- sounddevice
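The latency figure above can be checked on your own hardware by timing any transcription call with a generic stopwatch helper (this helper is an illustration, not part of n7speech; `speech_from_file` from the Usage section would be the function under test):

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed milliseconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms

# Example (requires n7speech and an audio file):
# result, ms = timed(speech_from_file, "your_audio.wav", lang="mni-latin")
# print(f"{ms:.0f} ms: {result}")
```

Run the call once before timing if you want to exclude one-off model-loading cost from the measurement.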
Model and Vocab
Place your ONNX model as model.onnx and vocabulary as vocab.txt in the working directory.
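The format of vocab.txt isn't specified in this README. Assuming a plain one-token-per-line layout (a common convention for CTC vocabularies), it could be loaded into an id-to-token mapping like this (sketch under that assumption, not the library's loader):

```python
def load_vocab(path="vocab.txt"):
    """Load one token per line into an id -> token mapping.

    Assumes a plain one-token-per-line file; the format n7speech
    actually expects may differ.
    """
    with open(path, encoding="utf-8") as f:
        return {i: line.rstrip("\n") for i, line in enumerate(f)}
```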
License
MIT License
