Sound Transcriber

SoundTranscriber can be used to generate automatic transcription / automatic subtitles for audio/video files through a friendly graphical user interface. Developed by Mahmoud Atef, Ahmed Bakr, and Qais Alrefai from the TecWindow team.

Download:

download Sound Transcriber version 1.4.6.

Sound Transcriber User Guide.

This user guide aims to provide you with a comprehensive understanding of Sound Transcriber and help you make the most of its features.

We highly recommend reading this guide to ensure optimal usage of the program.

Introduction to Sound Transcriber:

Sound Transcriber is an accessible audio-to-text conversion program designed to transcribe audio and video files, it offers support for extracting subtitle files and more.

Developed by Mahmoud Atef, Ahmed Bakr, and Qais Alrefai from the TecWindow team.

Features:

Sound Transcriber offers the following features:

Conversion of audio and video files to text using various transcription services.
Saving conversion results as .txt and .doc files or as .SRT subtitle files.
Convert videos from platforms like YouTube, Facebook, and X by downloading the file directly, with an option to save the file for easy access and future use.

Planned features:

We have several planned features in the pipeline, including:

Translation of conversion results into multiple languages.
Spell check functionality with dictionaries.
Conversion of results into audio using text-to-speech engines.

Supported services:

The software currently only supports online conversion using Google's speech recognition, OpenAi's Whisper, and Meta's wit.ai.

Important notes:

Please take note of the following important information:

By default, the program uses the Google service. To utilize other services, you need to obtain an API key from the respective service provider.
For the best results with Arabic language transcription, we recommend using wit.ai.
OpenAI's Whisper has been included based on previous experiences, but it may not function as intended or may not work at all. We appreciate your feedback and experiences to help us resolve any issues.
Unlike OpenAI, Wit.ai provides free API keys.
When converting a file, ensure that you select the appropriate language before initiating the conversion process. If a file is multilingual, words in languages other than the selected language may not be accurately converted due to limitations of the services.
Prior to conversion, files are split into segments of up to 60 seconds, depending on the limitations of each service. Consequently, some words may be lost during this process. To achieve optimal results, we recommend adjusting the segment duration based on the silence time, file length, and segment duration allowed by each service.
For optimal results with .srt files, we recommend selecting a short duration, such as 5 seconds, for the segments.
Sound Transcriber requires Microsoft Visual C++ 2015-2022 Redistributable X64 and Microsoft Visual C++ 2013 Redistributable X64. If the program does not work for you, please use the previous links to download and install the required files.
Currently, Sound Transcriber supports the following languages: Arabic, English, Spanish, French, Russian, Turkish, And Vietnamese.
Sound Transcriber is compatible with Windows 8 and above, but only in 64-bit systems.

Supported file extensions:

Sound Transcriber supports the following file extensions for conversion:

.mp3, .wav, .aac, .flac, .oga, .opus, .mp4, .avi, .mkv, .mov, .m4a, .ogg, .ram, .rm, .wma, .wmv, .3gp, .flv.

Obtaining API Keys:

Wit.ai:

If we were to include an API key within the program itself, it would likely be blocked after widespread usage by multiple users. Moreover, wit.ai provides distinct API keys for each language. This means that you need to create an application in the desired language and obtain its corresponding API key. Unfortunately, it is not feasible for us to gather API keys for all languages since they vary based on individual preferences. Therefore, we will provide you with instructions on how to obtain your own private API key. Although the following steps may appear extensive, they are straightforward and only need to be completed once.

Open the wit.ai website
Log in with your Meta account by clicking "Continue with Meta."
Follow the normal account creation steps or press "Continue with Facebook"
Go to the top of the page, press h or 1 for screen reader users.
Click "New App" and give the app any English name, such as test or my app.
You will find an options box for the app's language, open it and choose the appropriate language. The audio files will be converted to the language you choose.
Finally click "Create".
Find the name of the application you created, click on it, and then navigate to the "Management" button.
When you reach this button, press the letter b to find a button called Settings. Press this button.
Move with the number 4 or h for screen reader users until you find a heading called Client Access Token, scroll down and you will find your key as a button. You can select the text manually to copy it or press the same button to for it to be automatically copied.
Return to Sound Transcriber and paste your API key into the provided field.

You can repeat these steps and create a new application with a different name to obtain an API key for transcribing in another language. If you want to use multiple languages with wit.ai, simply repeat the steps to obtain an API key for each language.

Whisper:

Sound Transcriber supports transcription through the use of OpenAI's Whisper API keys, which are not available for free.

The pricing is based on the number of characters transcribed, and specific plans are not mentioned here.

To get detailed information about the limits and subscription options, please visit this page/. Keep in mind that signing in and adding a payment method is at your own risk.

To obtain an API key for Sound Transcriber, go to the API key page and click on "Create new secret key." Copy the generated key and proceed to add it in the program settings as demonstrated later.

Sound Transcriber Interface:

Upon opening the program, you will find an edit box displaying the transcribed result. Use the tab key to navigate through the other options.

The "Language" box allows you to specify the language of the file you want to transcribe. Select the appropriate language using the arrow keys.

Click the "Start" button to initiate the conversion process.

Next, you will find the "Save As" button, which allows you to specify the output saving preferences.

Below that, there is a read-only edit box indicating the path or the link of the file to be transcribed.

Use the "Browse" button to locate and select the file you want to transcribe.

Additionally, you can utilize keyboard shortcuts, which will be explained later.

Please note that the order of items on the screen may differ when navigating with the Tab key.

Menus

The program includes several menus accessible by pressing the Alt key.

File:

Open: Use this option to browse your device and search for a file to transcribe.
Save: Save the transcription result with the specified extension in the settings.
Save As: Save the result with a specific extension.
Settings: Access program options and customizations.
Open Log File: allows you to view the Sound Transcriber log file.
Exit: Exit the program.

Services:

It contains the names of the services available for conversion, you can selecte any service.

Help:

User Guide: View the currently accessed file.
What's new: View Sound Transcriber change log.
Check for Updates: Search for program updates.
Contact Us: Display menus with options to contact the program developers.
Donate: donate to the developers of Sound Transcriber.
Open Repository: Opens Sound Transcriber repository on GitHub, the program is not open source.
About: Provides information about Sound Transcriber.

Sound Transcriber Settings:

Similar to the NVDA screen reader settings, the Sound Transcriber settings are categorized into several sections, each containing various options. You can navigate between sections using the up and down arrows. Use the Tab and Shift+Tab keys to scroll through the options within the selected section.

General:

This section includes various program-wide options:

Interface Language: Specify the program's language.
Service: Define the service used for file transcription.
Files to transcribe simultaneously: This feature allows you to specify the number of files to be transcribed simultaneously. This feature controls the number of clips from the same file that will be sent simultaneously after it is split.
Auto detect if there is a file in the clipboard: The program checks your clipboard on startup, and if it finds a supported file, it automatically selects its path for quick conversion.
Ask what to do when a file is detected in the clipboard: If enabled, the program will prompt you on how to handle the detected file.
Sounds: Activate alert sounds when the conversion starts and ends.
Speak Actions: Enable the screen reader to provide conversion status information.
Check for Updates Automatically: Automatically search for program updates upon startup.
Include beta versions when checking for updates: enables you to receive beta updates for the program. More details can be found in the beta updates section of this guide.
Enable Logging: This feature allows Sound Transcriber to write a log of the transcription process steps. It's par

SoundTranscriber

Install / Use

README