<div id="top"></div>    <div align="center"> <picture> <source srcset="docs/light/icon.png" width="128" height="128" media="(prefers-color-scheme: light)" /> <source srcset="docs/dark/icon.png" width="128" height="128" media="(prefers-color-scheme: dark)" /> <img src="docs/light/icon.png" alt="Logo" width="128" height="128"> </picture> <h1 align="center">Audiotext</h1> <p align="center">A desktop application that transcribes audio from files, microphone input or YouTube videos with the option to translate the content and create subtitles.</p> <p> <a href="https://github.com/HenestrosaDev/audiotext/actions/workflows/code-quality.yml"> <img src="https://github.com/HenestrosaDev/audiotext/actions/workflows/code-quality.yml/badge.svg" alt="Code Quality badge status" /> </a> <br> <a href="https://github.com/HenestrosaDev/audiotext/releases/latest"> <img src="https://img.shields.io/github/v/release/HenestrosaDev/audiotext" alt="Version" /> </a> <a href="https://github.com/HenestrosaDev/audiotext/stargazers"> <img src="https://img.shields.io/github/stars/HenestrosaDev/audiotext" alt="GitHub Stars" /> </a> <a href="https://github.com/HenestrosaDev/audiotext/blob/main/LICENSE"> <img src="https://img.shields.io/badge/license-BSD--4--Clause-lightgray" alt="License" /> </a> <br> <a href="https://github.com/HenestrosaDev/audiotext/graphs/contributors"> <img src="https://img.shields.io/github/contributors/HenestrosaDev/audiotext" alt="GitHub Contributors" /> </a> <a href="https://github.com/HenestrosaDev/audiotext/issues"> <img src="https://img.shields.io/github/issues/HenestrosaDev/audiotext" alt="Issues" /> </a> <a href="https://github.com/HenestrosaDev/audiotext/pulls"> <img src="https://img.shields.io/github/issues-pr/HenestrosaDev/audiotext" alt="GitHub pull requests" /> </a> </p> <p> <a href="https://github.com/HenestrosaDev/audiotext/issues/new/choose"> Report Bug </a> · <a href="https://github.com/HenestrosaDev/audiotext/issues/new/choose"> Request Feature </a> · 
<a href="https://github.com/HenestrosaDev/audiotext/discussions"> Ask Question </a> </p> </div>

About the Project
Getting Started
Usage
Troubleshooting
Roadmap
Authors
Contributing
Acknowledgments
License
Support

About the Project

Audiotext transcribes audio from an audio file, video file, microphone input, directory, or YouTube video in any of the 99 languages it supports. You can transcribe using the Google Speech-to-Text API, the Whisper API, or WhisperX. The last two methods can also translate the transcription or generate subtitles!
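
As a rough illustration of the paragraph above, the three transcription methods and their extra capabilities can be written as a small lookup table. This is only a sketch: the `Method` enum, the `EXTRA_FEATURES` mapping, and the `supports` helper are hypothetical names made up for this example and are not taken from Audiotext's source code.

```python
from enum import Enum, auto


class Method(Enum):
    """Transcription methods named in this README (hypothetical enum)."""

    GOOGLE_API = auto()
    WHISPER_API = auto()
    WHISPERX = auto()


# Per the description above, only the Whisper API and WhisperX can translate
# the transcription or generate subtitles. The feature names are assumptions.
EXTRA_FEATURES = {
    Method.GOOGLE_API: frozenset(),
    Method.WHISPER_API: frozenset({"translate", "subtitles"}),
    Method.WHISPERX: frozenset({"translate", "subtitles"}),
}


def supports(method: Method, feature: str) -> bool:
    """Return True if the given method offers the given extra feature."""
    return feature in EXTRA_FEATURES[method]
```

For instance, `supports(Method.WHISPERX, "subtitles")` is true, while `supports(Method.GOOGLE_API, "translate")` is false.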

You can also choose the theme: dark, light, or the one configured in the system.

<details> <summary>Dark</summary> <img src="docs/dark/from-file.png" alt="Dark theme"> </details> <details> <summary>Light</summary> <img src="docs/light/from-file.png" alt="Light theme"> </details>

Supported Languages

<details> <summary>Click here to display</summary>

Afrikaans
Albanian
Amharic
Arabic
Armenian
Assamese
Azerbaijani
Bashkir
Basque
Belarusian
Bengali
Bosnian
Breton
Bulgarian
Burmese
Catalan
Chinese
Chinese (Yue)
Croatian
Czech
Danish
Dutch
English
Estonian
Faroese
Farsi
Finnish
French
Galician
Georgian
German
Greek
Gujarati
Haitian
Hausa
Hawaiian
Hebrew
Hindi
Hungarian
Icelandic
Indonesian
Italian
Japanese
Javanese
Kannada
Kazakh
Khmer
Korean
Lao
Latin
Latvian
Lingala
Lithuanian
Luxembourgish
Macedonian
Malagasy
Malay
Malayalam
Maltese
Maori
Marathi
Mongolian
Nepali
Norwegian
Norwegian Nynorsk
Occitan
Pashto
Polish
Portuguese
Punjabi
Romanian
Russian
Sanskrit
Serbian
Shona
Sindhi
Sinhala
Slovak
Slovenian
Somali
Spanish
Sundanese
Swahili
Swedish
Tagalog
Tajik
Tamil
Tatar
Telugu
Thai
Tibetan
Turkish
Turkmen
Ukrainian
Urdu
Uzbek
Vietnamese
Welsh
Yiddish
Yoruba

</details>

Supported File Types

<details> <summary>Audio file formats</summary>

.aac
.flac
.mp3
.mpeg
.oga
.ogg
.opus
.wav
.wma

</details> <details> <summary>Video file formats</summary>

.3g2
.3gp2
.3gp
.3gpp2
.3gpp
.asf
.avi
.f4a
.f4b
.f4v
.flv
.m4a
.m4b
.m4r
.m4v
.mkv
.mov
.mp4
.ogv
.ogx
.webm
.wmv

</details>
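
If you want to check ahead of time whether a file falls into one of the two lists above, a minimal sketch could look like the following. The extension sets are copied from the lists in this section; the names `AUDIO_EXTENSIONS`, `VIDEO_EXTENSIONS`, and `classify_media` are hypothetical and are not part of Audiotext's code, which may detect file types differently.

```python
from pathlib import Path
from typing import Optional

# Extensions copied from the "Supported File Types" lists above.
AUDIO_EXTENSIONS = frozenset({
    ".aac", ".flac", ".mp3", ".mpeg", ".oga", ".ogg", ".opus", ".wav", ".wma",
})
VIDEO_EXTENSIONS = frozenset({
    ".3g2", ".3gp2", ".3gp", ".3gpp2", ".3gpp", ".asf", ".avi", ".f4a",
    ".f4b", ".f4v", ".flv", ".m4a", ".m4b", ".m4r", ".m4v", ".mkv", ".mov",
    ".mp4", ".ogv", ".ogx", ".webm", ".wmv",
})


def classify_media(path: str) -> Optional[str]:
    """Return "audio", "video", or None based on the file extension only."""
    suffix = Path(path).suffix.lower()  # case-insensitive comparison
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    if suffix in VIDEO_EXTENSIONS:
        return "video"
    return None
```

The check looks only at the file name, not at the file's contents, so a renamed or corrupted file would still be classified by its extension.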

Project Structure

<details> <summary>ASCII folder structure</summary>

│   .gitignore
│   audiotext.spec
│   LICENSE
│   README.md
│   requirements.txt
│
├───.github
│   │   CONTRIBUTING.md
│   │   FUNDING.yml
│   │
│   ├───ISSUE_TEMPLATE
│   │       bug_report_template.md
│   │       feature_request_template.md
│   │
│   └───PULL_REQUEST_TEMPLATE
│           pull_request_template.md
│
├───docs/
│
├───res
│   ├───img
│   │       icon.ico
│   │
│   └───locales
│       │   main_controller.pot
│       │   main_window.pot
│       │
│       ├───en
│       │   └───LC_MESSAGES
│       │           app.mo
│       │           app.po
│       │           main_controller.po
│       │           main_window.po
│       │
│       └───es
│           └───LC_MESSAGES
│                   app.mo
│                   app.po
│                   main_controller.po
│                   main_window.po
│
└───src
    │   app.py
    │
    ├───controllers
    │       __init__.py
    │       main_controller.py
    │
    ├───handlers
    │       file_handler.py
    │       google_api_handler.py
    │       openai_api_handler.py
    │       whisperx_handler.py
    │       youtube_handler.py
    │
    ├───interfaces
    │       transcribable.py
    │
    ├───models
    │   │   __init__.py
    │   │   transcription.py
    │   │
    │   └───config
    │           __init__.py
    │           config_subtitles.py
    │           config_system.py
    │           config_transcription.py
    │           config_whisper_api.py
    │           config_whisperx.py
    │
    ├───utils
    │       __init__.py
    │       audio_utils.py
    │       config_manager.py
    │       constants.py
    │       dict_utils.py
    │       enums.py
    │       env_keys.py
    │       path_helper.py
    │
    └───views
        │   __init__.py
        │   main_window.py
        │
        └───custom_widgets
                __init__.py
                ctk_scrollable_dropdown/
                ctk_input_dialog.py

</details>

Built With

CTkScrollableDropdown for the scrollable option menu to display the full list of supported languages.
CustomTkinter for the GUI.
moviepy for video processing, from which the program extracts the audio to be transcribed.