Audio2text

A Google Cloud Function to transcribe audio

NOT WORKING: This program is currently not working. I've opened an issue on @google-cloud/speech (the Node.js repo) detailing the issue and

Serverless audio to text

This is an experiment in adapting the tutorial on serverless Optical Character Recognition into a speech-to-text converter.

General application flow

Prepare the audio file according to the notes below.
Define the Cloud Function on the staging bucket (name: gs://influx-staging-bucket).
Upload the .flac file to the upload bucket (name: gs://influx-audio-upload).
The serverless function kicks off, pulls the audio file, and calls the Speech API.
The function stores the converted text in another bucket (name: gs://influx-text-out).
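
The upload-to-transcription step above can be sketched as a small pure function. This is only an illustration, not the code in this repo: the helper name buildTranscriptionJob, the { bucket, name } shape of the upload event, and the default sample rate and language are all assumptions. Only the bucket names and the FLAC format come from the flow above.

```javascript
// Output bucket name comes from the flow above; the rest is assumed for illustration.
const OUTPUT_BUCKET = 'influx-text-out';

/**
 * Hypothetical helper: turn an upload event into a Speech API request
 * plus the location where the transcript should be written.
 * `file` is assumed to look like { bucket: string, name: string }.
 */
function buildTranscriptionJob(file, options = {}) {
  if (!file || !file.bucket || !file.name) {
    throw new Error('expected an object with bucket and name');
  }
  // Ignore anything that is not a FLAC upload.
  if (!file.name.toLowerCase().endsWith('.flac')) {
    return null;
  }
  const request = {
    config: {
      encoding: 'FLAC',
      sampleRateHertz: options.sampleRateHertz || 16000, // assumed default
      languageCode: options.languageCode || 'en-US', // assumed default
    },
    // The Speech API can read audio directly from Cloud Storage via a gs:// URI.
    audio: { uri: `gs://${file.bucket}/${file.name}` },
  };
  const output = {
    bucket: OUTPUT_BUCKET,
    name: file.name.replace(/\.flac$/i, '.txt'),
  };
  return { request, output };
}

const job = buildTranscriptionJob({ bucket: 'influx-audio-upload', name: 'talk.flac' });
console.log(JSON.stringify(job, null, 2));
```

In a deployed function, the request object would be handed to the speech client and the resulting text written to the output location. Keeping this part free of network calls makes it easy to test locally.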

Notes

To find the sample rate and number of channels, run mediainfo myfile.flac.
To convert between formats, use the fre:ac program.
To remove an extraneous channel, use the Audacity program.

Next steps

Currently the program outputs the transcription to stdout. It needs to be modified to store the text in a file on Cloud Storage.
Currently the file type and sampleRateHertz must be set manually. The function should identify these from the tags or other metadata of the uploaded file.
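
One possible way to avoid setting sampleRateHertz by hand is to read it from the FLAC file itself. The sketch below parses the STREAMINFO metadata block that the FLAC format places right after the "fLaC" marker. The function name readFlacStreamInfo is hypothetical, and the sketch assumes the first 22 bytes of the file are available and that no ID3 tag precedes the marker.

```javascript
/**
 * Hypothetical helper: read sample rate, channel count and bit depth
 * from the STREAMINFO block at the start of a FLAC file.
 * `header` is a Buffer holding at least the first 22 bytes of the file.
 */
function readFlacStreamInfo(header) {
  if (header.length < 22 || header.toString('latin1', 0, 4) !== 'fLaC') {
    throw new Error('not a FLAC stream');
  }
  // Byte 4 holds the last-block flag (1 bit) and the block type (7 bits).
  // Type 0 is STREAMINFO, which must be the first metadata block.
  if ((header[4] & 0x7f) !== 0) {
    throw new Error('first metadata block is not STREAMINFO');
  }
  // STREAMINFO data starts at byte 8. After the block-size and frame-size
  // fields (10 bytes) come: 20 bits sample rate, 3 bits (channels - 1),
  // 5 bits (bits per sample - 1).
  const b18 = header[18];
  const b19 = header[19];
  const b20 = header[20];
  const b21 = header[21];
  return {
    sampleRateHertz: (b18 << 12) | (b19 << 4) | (b20 >> 4),
    audioChannelCount: ((b20 >> 1) & 0x07) + 1,
    bitsPerSample: (((b20 & 0x01) << 4) | (b21 >> 4)) + 1,
  };
}

// Synthetic header for a 16 kHz, mono, 16-bit stream (illustrative data only).
const sample = Buffer.alloc(42);
sample.write('fLaC', 0, 'latin1');
sample[7] = 34; // STREAMINFO length
sample[18] = 0x03;
sample[19] = 0xe8;
sample[20] = 0x00;
sample[21] = 0xf0;
console.log(readFlacStreamInfo(sample));
```

The returned field names match the sampleRateHertz and audioChannelCount options of the Speech API recognition config, so the result could be merged into the request instead of hard-coded values.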
