OcrTranslator
Convert captured images to text using Baidu OCR, Google OCR, Windows OCR, Tesseract, RapidOCR, or Capture2Text, then translate the resulting text with Google, ChatGPT, EdgeGPT, DeepL, and many other services. A desktop application with a GUI built on customtkinter.
With this app, you can select your preferred OCR and translation services. After you click START or press the keyboard shortcut Alt+Win+T, you can select the area of the screen to scan for text with OCR. If a translation service is selected, the recognized text is then translated automatically.
preview:
https://user-images.githubusercontent.com/20650591/233107070-f9a14ed8-5c77-4947-8fa5-8d1c86d4a04f.mp4
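The flow described above (capture, then OCR, then optional translation) can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: `run_pipeline`, `fake_ocr`, and `fake_translate` are hypothetical names, and the stand-in services exist only so the sketch runs without any OCR engine or network access.

```python
from typing import Callable, Optional

# Hypothetical type aliases; the real app wires up its services differently.
OcrFn = Callable[[bytes], str]
TranslateFn = Callable[[str], str]


def run_pipeline(image: bytes, ocr: OcrFn,
                 translate: Optional[TranslateFn] = None) -> str:
    """Run OCR on a captured image, then translate only if a translator was selected."""
    text = ocr(image).strip()
    if translate is None or not text:
        # No translation service selected, or nothing was recognized.
        return text
    return translate(text)


# Stand-in services (assumptions for illustration only).
def fake_ocr(image: bytes) -> str:
    return image.decode("utf-8")


def fake_translate(text: str) -> str:
    return text.upper()


print(run_pipeline(b" hello ", fake_ocr))                  # OCR only
print(run_pipeline(b" hello ", fake_ocr, fake_translate))  # OCR + translation
```

Passing the services in as plain callables keeps the OCR and translation choices independent of each other, which mirrors how the app lets you pick them separately.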
🔥 Features
- Desktop application with a user-friendly graphical user interface (GUI) provided by customtkinter.
- Ability to select preferred OCR and translation services.
- Option to start a capture using either the START button or a keyboard shortcut (Alt+Win+T, or one bound in the options).
- Capability to choose the area of the screen to scan for text using OCR and to save its position. This is useful when, for example, movie subtitles always appear in the same spot, so you don't have to select the text area again.
- Automatic translation of the captured text if a translation service has been selected.
- Ability to capture subtitles from movies or games by selecting the corresponding area of the screen and displaying the translated text next to them.
- Chat with ChatGPT or EdgeGPT.
- Ability to translate from the clipboard or manually entered text (similar to a typical translation app).
- Save all selected options and settings to a file and load them when the program is launched.
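As a rough sketch of the last feature, settings (including a saved capture region) could be persisted to a JSON file and merged over defaults on startup. The option names, default values, and the `[x, y, width, height]` region layout below are assumptions for illustration; the app's real settings file may look different.

```python
import json
from pathlib import Path

# Hypothetical defaults; the real option names in the app may differ.
DEFAULTS = {
    "ocr": "Windows OCR",
    "translator": "Google",
    "hotkey": "alt+win+t",
    "region": None,  # assumed layout: [x, y, width, height]
}


def save_settings(path, settings):
    """Write the settings dictionary to a JSON file."""
    Path(path).write_text(json.dumps(settings, indent=2), encoding="utf-8")


def load_settings(path):
    """Return saved settings merged over the defaults.

    Falls back to the defaults if the file is missing or not valid JSON.
    """
    try:
        saved = json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return dict(DEFAULTS)
    if not isinstance(saved, dict):
        return dict(DEFAULTS)
    return {**DEFAULTS, **saved}
```

Merging over defaults means a settings file written by an older version, which lacks newer keys, still loads cleanly.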
Desktop App
Download the desktop app here. Tested only on Windows 10.
Dependency
- Python 3.9 (only if you want to run from source).
- (optional) Capture2Text.
- (optional) Tesseract.
- (optional) Google API: generate a `service_account_creds.json` file, then put it into the `ocrTranslate/configs` directory.
5. (optional) ChatGPT
</summary>Source
Configuration
- Create an account on OpenAI's ChatGPT
- Save your email and password
Authentication method (choose one and paste it into the app settings):
- Email/Password
Currently broken for free users. Run `export PUID="..."` if you have a Plus account. The PUID is a cookie named `_puid`. Not supported for Google/Microsoft accounts.
- Access token
https://chat.openai.com/api/auth/session
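When logged in, the session page above returns a JSON document. A small sketch of pulling the token out of text copied from that page is shown below. The field name `accessToken` is an assumption about that response, and `extract_access_token` is a hypothetical helper, not part of the app.

```python
import json


def extract_access_token(session_json: str) -> str:
    """Pull the token out of the JSON text copied from the session page."""
    data = json.loads(session_json)
    # Assumption: the token is stored under the key "accessToken".
    token = data.get("accessToken") if isinstance(data, dict) else None
    if not isinstance(token, str) or not token:
        raise ValueError("no accessToken field found; are you logged in?")
    return token
```

The returned string is what you would paste into the app settings as the access token.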
</details> <details> <summary>6. (optional) EdgeGPT
</summary> <details> <summary>Source
Checking access (Required)
</summary>- Install the latest version of Microsoft Edge
- Alternatively, you can use any browser and set the user-agent to look like you're using Edge (e.g., `Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36 Edg/111.0.1661.51`). You can do this easily with an extension like "User-Agent Switcher and Manager" for Chrome and Firefox.
- Open bing.com/chat
- If you see a chat feature, you are good to go
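For readers scripting the check rather than using a browser extension, the user-agent trick from the list above amounts to sending the Edge string in the `User-Agent` header. The sketch below only builds the request with the standard library and does not send it; `build_edge_request` is a hypothetical helper, and whether Bing accepts a scripted request at all is not something this sketch verifies.

```python
import urllib.request

# The example Edge user-agent string quoted in the list above.
EDGE_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36 Edg/111.0.1661.51"
)


def build_edge_request(url: str) -> urllib.request.Request:
    """Build (but do not send) a request that identifies itself as Microsoft Edge."""
    return urllib.request.Request(url, headers={"User-Agent": EDGE_UA})


req = build_edge_request("https://www.bing.com/chat")
# urllib normalizes header names to the form "User-agent".
print(req.get_header("User-agent"))
```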
Getting authentication (Required)
</summary>- Install the cookie editor extension for Chrome or Firefox
- Go to bing.com
- Open the extension
- Click "Export" on the bottom right, then "Export as JSON" (This saves your cookies to clipboard)
- Paste your cookies into a file named `cookies.json`
- Put the `cookies.json` file into `ocrTranslate/configs/`
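Before launching the app, it can help to confirm that the exported file is well-formed. The sketch below assumes the export is a JSON array of objects that each carry at least `name` and `value` keys; `load_cookies` is a hypothetical helper and not how the app itself reads the file.

```python
import json
from pathlib import Path


def load_cookies(path):
    """Load an exported cookie file and return a {name: value} mapping.

    Assumes the export is a JSON array of objects with "name" and "value" keys.
    """
    cookies = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(cookies, list):
        raise ValueError("expected a JSON array of cookies")
    for cookie in cookies:
        if not isinstance(cookie, dict) or "name" not in cookie or "value" not in cookie:
            raise ValueError(f"malformed cookie entry: {cookie!r}")
    return {cookie["name"]: cookie["value"] for cookie in cookies}
```

For example, `load_cookies("ocrTranslate/configs/cookies.json")` raises a `ValueError` if the clipboard content was pasted incompletely.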
📊 Tables with information
<details> <summary>Supported OCR Services
</summary>

| ID | OCR | Internet/Local | Status |
|----|-----|----------------|--------|
| 1 | Google Vision Api | Internet | stable |
| 2 | Google Vision Free Demo | Internet | stable |
| 3 | Baidu Api | Internet | stable |
| 4 | Windows OCR | Local | stable |
| 5 | Capture2Text | Local | stable |
| 6 | Tesseract | Local | stable |
| 7 | RapidOCR | Local | stable |

</details> <details> <summary>Supported Translation Services
</summary>Source
| ID | Translator | Number of Supported Languages | Advantage | Service | Status |
|----|------------|-------------------------------|-----------|---------|--------|
| 1 | Niutrans | 302 | support the most languages in the world | Northeastern University / Niutrans, China | / |
| 2 | Alibaba | 221 | support most languages, support professional field | Alibaba, China | stable |
| 3 | Baidu | 201 | support most languages, support professional field, support Classical Chinese | Baidu, China | stable |
| 4 | Iciba | 187 | support the most languages in the world | Kingsoft / Xiaomi, China | stable |
| 5 | MyMemory | 151 | support the most languages in the world, good at Creole English, Creole French | Translated, Italy | stable |
| 6 | Iflytek | 140 | support the most languages in the world | Iflytek, China | / |
| 7 | Google | 134 | support more languages in the world | Google, America | stable(offline in China inland) |
| 8 | VolcEngine | 122 | support more languages in the world, support professional field | ByteDance, China | / |
| 9 | [Lingvanex](https
