
Openlrc

Transcribe and translate audio into LRC subtitle files using Whisper and LLMs (GPT, Claude, et al.).

Install / Use

/learn @zh-plus/Openlrc

README

Open-Lyrics


Open-Lyrics is a Python library that transcribes audio with faster-whisper, then uses LLMs from providers such as OpenAI and Anthropic to translate and polish the text into .lrc subtitle files.

Key Features

  • Audio preprocessing to reduce hallucinations (loudness normalization and optional noise suppression).
  • Context-aware translation to improve translation quality. Check prompt for details.
  • Check here for an overview of the architecture.
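As an illustration of what the loudness-normalization step in preprocessing does (a minimal sketch, not openlrc's actual implementation, which operates on real audio files), samples can be rescaled toward a target RMS level while guarding against clipping:

```python
import math

def rms(samples):
    """Root-mean-square level of a block of audio samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def normalize_loudness(samples, target_rms=0.1):
    """Scale samples so their RMS matches target_rms, without clipping."""
    current = rms(samples)
    if current == 0:
        return list(samples)
    gain = target_rms / current
    # Clamp the gain so no sample exceeds [-1.0, 1.0].
    peak = max(abs(s) for s in samples)
    if peak > 0:
        gain = min(gain, 1.0 / peak)
    return [s * gain for s in samples]

quiet = [0.01, -0.02, 0.015, -0.005]
louder = normalize_loudness(quiet)
```

Bringing quiet passages up to a consistent level like this gives the ASR model a more uniform signal, which is one way preprocessing can reduce hallucinated transcript segments.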

New 🚨

  • 2024.5.7:
    • Added custom endpoint (base_url) support for OpenAI and Anthropic:
      lrcer = LRCer(
          translation=TranslationConfig(
              base_url_config={'openai': 'https://api.chatanywhere.tech',
                               'anthropic': 'https://example/api'}
          )
      )
      
    • Added bilingual subtitle generation:
      lrcer.run('./data/test.mp3', target_lang='zh-cn', bilingual_sub=True)
      
  • 2024.5.11: Added glossary support in prompts to improve domain-specific translation. Check here for details.
  • 2024.5.17: You can route models to arbitrary chatbot SDKs (OpenAI or Anthropic) by setting chatbot_model to provider: model_name together with base_url_config:
    lrcer = LRCer(
        translation=TranslationConfig(
            chatbot_model='openai: claude-3-haiku-20240307',
            base_url_config={'openai': 'https://api.g4f.icu/v1/'}
        )
    )
    
  • 2024.6.25: Added Gemini as a translation model (for example, gemini-1.5-flash):
    lrcer = LRCer(translation=TranslationConfig(chatbot_model='gemini-1.5-flash'))
    
  • 2024.9.10: Now openlrc depends on a specific commit of faster-whisper, which is not published on PyPI. Install it from source:
    pip install "faster-whisper @ https://github.com/SYSTRAN/faster-whisper/archive/8327d8cc647266ed66f6cd878cf97eccface7351.tar.gz"
    
  • 2024.12.19: Added ModelConfig for model routing, which is more flexible than a plain model-name string: ModelConfig(provider='<provider>', name='<model-name>', base_url='<url>', proxy='<proxy>'), e.g.:
    
    from openlrc import LRCer, TranslationConfig, ModelConfig, ModelProvider
    
    chatbot_model1 = ModelConfig(
        provider=ModelProvider.OPENAI, 
        name='deepseek-chat', 
        base_url='https://api.deepseek.com/beta', 
        api_key='sk-APIKEY'
    )
    chatbot_model2 = ModelConfig(
        provider=ModelProvider.OPENAI, 
        name='gpt-4o-mini', 
        api_key='sk-APIKEY'
    )
    lrcer = LRCer(translation=TranslationConfig(chatbot_model=chatbot_model1, retry_model=chatbot_model2))
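The glossary feature mentioned above works by steering the LLM with fixed term mappings. A hypothetical sketch of how such a mapping could be folded into a translation prompt (the function name and template here are illustrative, not openlrc's actual prompt):

```python
def build_glossary_prompt(glossary: dict) -> str:
    """Render term mappings as a prompt section the model must respect."""
    lines = [f"- Translate '{src}' as '{dst}'" for src, dst in glossary.items()]
    return "Use the following glossary strictly:\n" + "\n".join(lines)

prompt = build_glossary_prompt({"feudal": "封建时代", "2TC": "双TC"})
```

Pinning domain terms this way keeps translations consistent across chunks, since each translation request sees the same mandated vocabulary.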
    

Installation ⚙️

  1. Install CUDA 11.x and cuDNN 8 for CUDA 11 first according to https://opennmt.net/CTranslate2/installation.html to enable faster-whisper.

    faster-whisper also needs cuBLAS for CUDA 11 installed.

    <details> <summary>For Windows Users (click to expand)</summary>

    (Windows only) Purfview's whisper-standalone-win provides the required NVIDIA libraries for Windows in a single archive. Decompress the archive and place the libraries in a directory included in the PATH.

    </details>
  2. Add LLM API keys as environment variables (recommended for most users: OPENROUTER_API_KEY).

  3. Install ffmpeg and add bin directory to your PATH.

  4. This project can be installed from PyPI:

    pip install openlrc
    

    or install directly from GitHub:

    pip install git+https://github.com/zh-plus/openlrc
    
  5. Install the latest faster-whisper from source:

    pip install "faster-whisper @ https://github.com/SYSTRAN/faster-whisper/archive/8327d8cc647266ed66f6cd878cf97eccface7351.tar.gz"
    
  6. Install PyTorch:

    pip install --force-reinstall torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
    
  7. Fix the typing-extensions issue:

    pip install typing-extensions -U
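
    After completing the steps, it can help to sanity-check that the pieces are discoverable. A small hand-rolled checker (not part of openlrc) using only the standard library:

    ```python
    import importlib.util
    import shutil

    def check_prereqs() -> dict:
        """Report whether key prerequisites from the steps above are discoverable."""
        return {
            "ffmpeg": shutil.which("ffmpeg") is not None,                          # step 3
            "openlrc": importlib.util.find_spec("openlrc") is not None,            # step 4
            "faster_whisper": importlib.util.find_spec("faster_whisper") is not None,  # step 5
            "torch": importlib.util.find_spec("torch") is not None,                # step 6
        }

    status = check_prereqs()
    for name, ok in status.items():
        print(f"{name}: {'OK' if ok else 'MISSING'}")
    ```

    Anything reported MISSING points back to the corresponding installation step.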
    

Lightweight Imports

OpenLRC keeps several package-root APIs lightweight to import.

The following imports are guaranteed not to eagerly load heavyweight runtime dependencies such as torch, spacy, faster-whisper, tiktoken, or lingua:

import openlrc
from openlrc import LRCer
from openlrc import TranscriptionConfig, TranslationConfig
from openlrc import ModelConfig, ModelProvider, list_chatbot_models

This is useful when you only need configuration objects, model metadata, or the LRCer type itself without immediately starting transcription or language-processing work.

Heavy dependencies are loaded only when the corresponding features are first used. For example:

  • faster-whisper is loaded when transcription is first needed.
  • torch and df.enhance are loaded when noise suppression is used.
  • spacy is loaded when sentence segmentation or related NLP helpers are used.
  • tiktoken is loaded when token counting is used.
  • lingua is loaded when language detection helpers are used.

[!NOTE] Lightweight imports improve import-time behavior only. They do not change installation requirements: pip install openlrc still installs the full dependency set declared by the package.
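The pattern behind this behavior is deferred importing: a heavy module is resolved only on first attribute access. A generic sketch of the idea (an illustration of the technique, not openlrc's internal code), using math as a stand-in for a heavyweight dependency like torch:

```python
import importlib

class LazyModule:
    """Proxy that imports the real module on first attribute access."""

    def __init__(self, name: str):
        self._name = name
        self._module = None

    def __getattr__(self, attr):
        # Called only when normal attribute lookup fails, i.e. for
        # attributes of the wrapped module, not _name/_module.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)

math_lazy = LazyModule("math")
# Nothing is imported yet; this first access triggers the import.
result = math_lazy.sqrt(16)  # → 4.0
```

Constructing the proxy is nearly free, so module import time stays low; the cost of loading the dependency is paid only by code paths that actually use it.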

Usage 🐍

Python code

import os
from openlrc import LRCer, TranscriptionConfig, TranslationConfig, ModelConfig, ModelProvider

if __name__ == '__main__':
    lrcer = LRCer()

    # Single file
    lrcer.run('./data/test.mp3',
              target_lang='zh-cn')  # Generate translated ./data/test.lrc with default translate prompt.

    # Multiple files
    lrcer.run(['./data/test1.mp3', './data/test2.mp3'], target_lang='zh-cn')
    # Note we run the transcription sequentially, but run the translation concurrently for each file.

    # Path can contain video
    lrcer.run(['./data/test_audio.mp3', './data/test_video.mp4'], target_lang='zh-cn')
    # Generate translated ./data/test_audio.lrc and ./data/test_video.srt

    # Use glossary to improve translation
    lrcer = LRCer(translation=TranslationConfig(glossary='./data/aoe4-glossary.yaml'))

    # To skip translation process
    lrcer.run('./data/test.mp3', target_lang='en', skip_trans=True)

    # Change asr_options or vad_options (see openlrc.defaults for details)
    vad_options = {"threshold": 0.1}
    lrcer = LRCer(transcription=TranscriptionConfig(vad_options=vad_options))
    lrcer.run('./data/test.mp3', target_lang='zh-cn')

    # Enhance the audio using noise suppression (consumes more time).
    lrcer.run('./data/test.mp3', target_lang='zh-cn', noise_suppress=True)

    # Change the translation model
    lrcer = LRCer(translation=TranslationConfig(chatbot_model='claude-3-sonnet-20240229'))
    lrcer.run('./data/test.mp3', target_lang='zh-cn')

    # Clear temp folder after processing done
    lrcer.run('./data/test.mp3', target_lang='zh-cn', clear_temp=True)

    # Use OpenRouter via ModelConfig (custom base_url + routed model name)
    openrouter_model = ModelConfig(
        provider=ModelProvider.OPENAI,
        name='anthropic/claude-3.5-haiku',
        base_url='https://openrouter.ai/api/v1',
        api_key=os.getenv('OPENROUTER_API_KEY')
    )
    fallback_model = ModelConfig(
        provider=ModelProvider.OPENAI,
        name='openai/gpt-4.1-nano',
        base_url='https://openrouter.ai/api/v1',
        api_key=os.getenv('OPENROUTER_API_KEY')
    )
    lrcer = LRCer(
        translation=TranslationConfig(chatbot_model=openrouter_model, retry_model=fallback_model)
    )

    # Bilingual subtitle
    lrcer.run('./data/test.mp3', target_lang='zh-cn', bilingual_sub=True)

See the Documentation for more details.

Glossary

Add a glossary to improve domain-specific translation. For example, aoe4-glossary.yaml:

{
  "aoe4": "帝国时代4",
  "feudal": "封建时代",
  "2TC": "双TC",
  "English": "英格兰文明",
  "scout": "侦察兵"
}

Then pass it to the translation config:

lrcer = LRCer(translation=TranslationConfig(glossary='./data/aoe4-glossary.yaml'))
lrcer.run('./data/test.mp3', target_lang='zh-cn')
