gasr

ChromeOS SODA Offline Speech Recognition command line client

Intro:

This is a proof of concept showing how to write code against the libsoda library found in ChromeOS, which uses it for Live Transcribe. It is not a full application, but it writes a live transcription to stdout of audio fed through stdin, for example from ALSA. Previous versions used the library found in the Chrome browser; that code is still available in the chrome-browser branch, although it is not actively maintained. Since ChromeOS is Linux under the hood, Windows is no longer supported, and users who need it should use the chrome-browser branch.

Prepare:

Use the prep.py script to download the library for your platform and the language model of your choice, and to patch the dynamic linker so that it accepts the relative relocations used in libsoda.so. Some examples:

./prep.py -c # check if the dynamic linker is copied, patched and ready

./prep.py -c -p hana # check if libsoda.so for RPi4 is downloaded, fixed and ready

./prep.py -c -p hana -l "en-us" # check if libsoda.so for RPi4 is ready, and the en-us model

./prep.py -s -p hana # setup the ld-linux interpreter and libsoda.so for RPi4
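If these invocations need to be assembled from another script, the flag combinations above can be built as argument lists. This is a minimal sketch that assumes only the flags shown in the examples (-c, -s, -p, -l); the helper name prep_args is hypothetical and not part of the repository.

```python
def prep_args(mode, platform=None, language=None):
    """Build an argument list for ./prep.py using only the flags shown above."""
    if mode not in ("check", "setup"):
        raise ValueError("mode must be 'check' or 'setup'")
    # -c checks the current state, -s performs the setup
    args = ["./prep.py", "-c" if mode == "check" else "-s"]
    if platform is not None:
        args += ["-p", platform]   # platform name, e.g. "hana" for RPi4
    if language is not None:
        args += ["-l", language]   # language model, e.g. "en-us"
    return args


check_model = prep_args("check", platform="hana", language="en-us")
setup_lib = prep_args("setup", platform="hana")
```

A list built this way can be passed to subprocess.run. How prep.py reports a failed check (exit code or message) is not stated here, so the sketch does not interpret the result.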

If your platform is not available, please leave a comment in https://github.com/biemster/gasr/issues/24

Run:

arecord -Dplughw:3,0 -fS16_LE -c1 -r16000 | ./gasr.py 2>/dev/null

where plughw:3,0 should be changed to match the card and device number of your microphone in your ALSA setup; arecord -l lists the available capture devices.
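To illustrate the stdin side of such a pipeline, the sketch below reads an audio stream in fixed-size chunks. It is not the code in gasr.py. The function name iter_chunks and the 100 ms chunk length are hypothetical, and the sample format constants simply mirror the arecord flags above (-fS16_LE, -c1, -r16000).

```python
import io

SAMPLE_RATE = 16000   # Hz, from -r16000
SAMPLE_WIDTH = 2      # bytes per sample, from -fS16_LE
CHANNELS = 1          # from -c1


def iter_chunks(stream, chunk_ms=100):
    """Yield byte chunks of up to chunk_ms milliseconds of audio until EOF."""
    chunk_bytes = SAMPLE_RATE * SAMPLE_WIDTH * CHANNELS * chunk_ms // 1000
    while True:
        chunk = stream.read(chunk_bytes)
        if not chunk:          # empty read means the producer closed the pipe
            break
        yield chunk            # the final chunk may be shorter than chunk_bytes


# Usage with an in-memory stream standing in for stdin:
demo = io.BytesIO(bytes(32000))          # one second of silence
sizes = [len(c) for c in iter_chunks(demo)]
```

In a real pipeline the stream would be sys.stdin.buffer, and each chunk would be handed to the recognizer instead of being measured. The sketch treats the input as opaque bytes and does not parse any container header.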
