OCR4Linux
OCR CLI tool for extracting text from screenshots (images), built from Bash and Python scripts, for both X11 and Wayland
Version: 1.5.0
OCR4Linux is a versatile text extraction tool that lets you take a screenshot of a selected area, extract its text using OCR, and copy the result to the clipboard. It supports both Wayland and X11 sessions and multiple OCR languages.
Note: This script currently targets Arch Linux only. It may work on other Arch-based distributions, but this has not been tested yet.
Motivation
I didn't find any easy tool in Linux that does the same thing as the PowerToys app in Windows. This motivated me to create OCR4Linux, a simple and efficient tool to capture screenshots, extract text, and copy it to the clipboard, all in one seamless process.
Features
- Screenshot Capture
  - Wayland support via grimblast
  - X11 support via scrot
  - Configurable screenshot directory
- Text Extraction
  - Interactive language selection via rofi
  - Multi-language OCR support with custom language combinations
  - Automatic language detection fallback
  - Image preprocessing for better accuracy
  - UTF-8 text output
- Clipboard Integration
  - Wayland: wl-copy and cliphist
  - X11: xclip
- Additional Features
  - Interactive language selection menu
  - Optional screenshot retention
  - Comprehensive logging system
  - Command-line interface
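The Wayland/X11 split in the feature list could be modelled roughly as follows. This is a minimal sketch, not the project's actual code: it assumes the session type is read from the `XDG_SESSION_TYPE` environment variable, and `detect_session` and `TOOLS` are hypothetical names used only for illustration.

```python
import os

# Tool names come from the feature list; the mapping itself is illustrative.
TOOLS = {
    "wayland": {"screenshot": "grimblast", "clipboard": "wl-copy"},
    "x11": {"screenshot": "scrot", "clipboard": "xclip"},
}


def detect_session(env=None):
    """Return "wayland" or "x11" based on environment variables (assumed approach)."""
    env = os.environ if env is None else env
    session = env.get("XDG_SESSION_TYPE", "").lower()
    if session in TOOLS:
        return session
    # Fallback guess. WAYLAND_DISPLAY is checked first because DISPLAY can also
    # be set inside a Wayland session when XWayland is running.
    if env.get("WAYLAND_DISPLAY"):
        return "wayland"
    if env.get("DISPLAY"):
        return "x11"
    raise RuntimeError("could not determine the session type")
```

Calling `TOOLS[detect_session()]` would then give the screenshot and clipboard tool names for the current session.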
Requirements
System Requirements
- Arch Linux or Arch-based distribution
- Python 3.x
- yay package manager (will be installed if needed)
- tesseract OCR engine
- tesseract-data-eng English language pack
- tesseract-data-ara Arabic language pack
- rofi for the interactive language selection feature
If you need a language other than the two above, search for it using the command:
sudo pacman -Ss tesseract-data-{lang}
Python Dependencies
- python-pillow
- python-pytesseract
Session-Specific Requirements
- Wayland: grimblast-git, wl-clipboard, cliphist
- X11: scrot, xclip
Installation
Option 1: Install from AUR (Recommended)
The easiest way to install OCR4Linux on Arch Linux or any Arch-based distribution is directly from the AUR using any AUR helper (e.g., yay, paru):
yay -S ocr4linux-git
This will automatically install OCR4Linux and all its required dependencies.
Option 2: Build from Source (makepkg)
You can clone the repository and build the package manually using makepkg:
- Clone the repository:
git clone https://github.com/moheladwy/OCR4Linux.git
cd OCR4Linux
- Build and install the package:
makepkg -si
Option 3: Manual Installation (setup.sh)
If you prefer a local installation in your home directory or want to use the automated setup script:
- Clone the repository:
git clone https://github.com/moheladwy/OCR4Linux.git
cd OCR4Linux
- Run the setup script:
chmod +x setup.sh
./setup.sh
Note: The setup script will:
- Prompt you to confirm before proceeding with the manual installation
- Install all required dependencies (tesseract, rofi, screenshot tools, etc.)
- Copy all OCR4Linux files to ~/.config/OCR4Linux/
- Set up the necessary directory structure
Usage
- Run the tool to take a screenshot, extract text, and copy it to the clipboard:
If installed via AUR or makepkg:
OCR4Linux
If installed via setup.sh:
~/.config/OCR4Linux/OCR4Linux.sh
Or if you're in the source directory:
./OCR4Linux.sh
- The script will:
  - With --lang option: Use specified languages directly (bypasses rofi menu)
  - Without --lang option: Display an interactive language selection menu via rofi
  - Allow you to select one or multiple languages for OCR processing
  - Take a screenshot of the selected area after language selection
  - Extract text from the image using the selected languages
  - Copy the extracted text to the clipboard
Language Selection
You have two options for language selection:
Option 1: Command Line (Direct)
Specify languages directly using the --lang option:
- --lang all - Use all available languages
- --lang eng - Use English only
- --lang eng+ara+fra - Use multiple specific languages
Option 2: Interactive Menu (Rofi)
When you run the script without --lang, a rofi menu will appear with:
- ALL: Select all available languages
- Individual languages: Choose specific languages (e.g., eng, ara, fra, deu)
- Multi-select: Hold Ctrl and click to select multiple languages
The selected languages will be used by Tesseract for more accurate text recognition in multi-language documents.
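Either selection path ends with a language string in Tesseract's `+`-separated form. The sketch below shows one way such a value might be validated and normalized. The `normalize_langs` function is a hypothetical helper, and the rules it applies (rejecting codes that are not installed, dropping duplicates, skipping `osd`) are assumptions rather than the script's documented behaviour.

```python
def normalize_langs(spec, available):
    """Turn a --lang style value into a Tesseract language string.

    `spec` is "all" or language codes joined by "+".
    `available` is the list of installed language codes.
    """
    # "osd" is Tesseract's orientation/script detection data, not a text language.
    installed = [code for code in available if code != "osd"]
    if spec.strip().lower() == "all":
        return "+".join(installed)
    requested = [code.strip() for code in spec.split("+") if code.strip()]
    if not requested:
        raise ValueError("no language given")
    missing = [code for code in requested if code not in installed]
    if missing:
        raise ValueError("language data not installed: " + ", ".join(missing))
    # Drop duplicates while keeping the order the user asked for.
    return "+".join(dict.fromkeys(requested))
```

With `available = ["ara", "eng", "fra"]`, the value `eng+ara` passes through unchanged, while `deu` raises a `ValueError` because its language pack is missing.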
Workflow
The complete OCR4Linux workflow:
- Language Selection:
  - Command-line specified languages (with --lang) OR
  - Interactive rofi menu displays available languages (without --lang)
- Language Processing: Selected languages are validated and formatted
- Screenshot Capture: Area selection and image capture
- OCR Processing: Text extraction using selected languages
- Clipboard Integration: Extracted text copied to system clipboard
- Cleanup: Optional screenshot removal and logging
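The capture and clipboard steps of this workflow come down to calling one external tool per session type. The following sketch builds those command lines. The helper names are hypothetical, and the exact flags OCR4Linux passes to grimblast, scrot, wl-copy, and xclip are not stated here, so the ones below are only a plausible choice.

```python
import subprocess


def capture_command(session, image_path):
    """Build an area-selection screenshot command for the given session type."""
    if session == "wayland":
        return ["grimblast", "save", "area", image_path]
    if session == "x11":
        return ["scrot", "-s", image_path]  # -s: select a region interactively
    raise ValueError(f"unsupported session type: {session}")


def clipboard_command(session):
    """Build the command that reads text on stdin and places it on the clipboard."""
    if session == "wayland":
        return ["wl-copy"]
    if session == "x11":
        return ["xclip", "-selection", "clipboard"]
    raise ValueError(f"unsupported session type: {session}")


def copy_to_clipboard(text, session):
    """Pipe UTF-8 text into the clipboard tool; raises if the tool fails."""
    subprocess.run(clipboard_command(session), input=text.encode("utf-8"), check=True)
```

Keeping the command construction separate from `subprocess.run` makes the session-specific part easy to check without a display server.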
Command Line Arguments
OCR4Linux.sh
| Option | Description | Default |
| ------------------ | ------------------------------------- | ---------------------------- |
| -r | Remove screenshot after processing | false |
| -d DIR | Set screenshot directory | $HOME/Pictures/screenshots |
| -l | Keep logs | false |
| -n, --notify | Show notification after screenshot | false |
| --lang LANGUAGES | Specify OCR languages (bypasses rofi) | Interactive selection |
| -v, --version | Print the package version, then exit | - |
| -h, --help | Show help message, then exit | - |
Language Format for --lang:
- Use all for all available languages
- Use + to separate multiple languages (e.g., eng+ara+fra)
- Single languages: eng, ara, fra, etc.
OCR4Linux.py
| Option | Description | Required |
| --------------------- | ---------------------------- | -------- |
| image_path | Path to input image | Yes |
| output_path | Path to save extracted text | Yes |
| --langs <languages> | Specify languages for OCR | No |
| -l, --list-langs | List available OCR languages | No |
| -h, --help | Show help message | No |
Language Format: Use + to separate multiple languages (e.g., eng+ara+fra)
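For readers who want to see how an interface like the one in the table might be declared, here is a minimal `argparse` sketch. It is not the actual OCR4Linux.py source. In particular, making the two positional arguments optional at the parser level so that `--list-langs` can be used on its own is an assumption about how the script reconciles the "Required" column with the `--list-langs` example.

```python
import argparse


def build_parser():
    """Declare the options from the table above (illustrative, not the real parser)."""
    parser = argparse.ArgumentParser(
        prog="OCR4Linux.py", description="Extract text from an image"
    )
    parser.add_argument("image_path", nargs="?", help="Path to input image")
    parser.add_argument("output_path", nargs="?", help="Path to save extracted text")
    parser.add_argument("--langs", metavar="<languages>",
                        help="Languages for OCR, joined by + (e.g. eng+ara+fra)")
    parser.add_argument("-l", "--list-langs", action="store_true",
                        help="List available OCR languages")
    return parser


def parse_args(argv=None):
    """Parse arguments and require both paths unless --list-langs was given."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if not args.list_langs and (args.image_path is None or args.output_path is None):
        parser.error("image_path and output_path are required unless --list-langs is given")
    return args
```

`-h, --help` needs no declaration because `argparse` adds it automatically.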
Examples
Using OCR4Linux
# Basic usage (shows interactive rofi menu)
OCR4Linux
# Direct language specification (bypasses rofi)
OCR4Linux --lang eng
OCR4Linux --lang all
OCR4Linux --lang eng+ara+fra
# Save logs and remove screenshot after processing
OCR4Linux -l -r
# Custom screenshot directory with logging and notification
OCR4Linux -d ~/Documents/screenshots -l -n
# Combine language specification with other options
OCR4Linux --lang eng -l -r
OCR4Linux --lang all -d ~/screenshots -l
# Print version
OCR4Linux -v
# Show help
OCR4Linux -h
Note: If you are running the script manually without installation, replace OCR4Linux with ./OCR4Linux.sh.
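When OCR4Linux is driven from another program rather than typed by hand, the same option combinations can be assembled programmatically. The sketch below is a hypothetical Python wrapper: `build_argv` and `run_ocr4linux` are illustrative names, and only the flags documented in the options table are used.

```python
import subprocess


def build_argv(lang=None, directory=None, keep_logs=False, remove=False,
               notify=False, executable="OCR4Linux"):
    """Assemble an OCR4Linux command line from keyword options."""
    argv = [executable]
    if lang:
        argv += ["--lang", lang]   # bypasses the rofi menu
    if directory:
        argv += ["-d", directory]  # screenshot directory
    if keep_logs:
        argv.append("-l")
    if remove:
        argv.append("-r")          # remove screenshot after processing
    if notify:
        argv.append("-n")
    return argv


def run_ocr4linux(**options):
    """Run OCR4Linux with the given options and return its exit status."""
    return subprocess.run(build_argv(**options), check=False).returncode
```

Because no shell is involved, a directory such as `~/screenshots` is not expanded automatically; pass it through `os.path.expanduser` first.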
Using OCR4Linux.py
# Basic usage (uses all available languages)
python OCR4Linux.py input.png output.txt
# Specify single language
python OCR4Linux.py input.png output.txt --langs eng
# Specify multiple languages
python OCR4Linux.py input.png output.txt --langs eng+ara+fra
# List available languages
python OCR4Linux.py --list-langs
# Show help
python OCR4Linux.py --help
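The invocations above roughly correspond to opening the image, preprocessing it, and handing it to Tesseract through pytesseract. The sketch below illustrates that flow under stated assumptions: the grayscale and autocontrast steps are a guess at what "image preprocessing" means here, and `preprocess` and `extract_text` are hypothetical names, not functions from OCR4Linux.py. Imports are done inside the functions so the module can be loaded even where Pillow or pytesseract is not installed.

```python
from pathlib import Path


def preprocess(image):
    """Grayscale plus autocontrast (assumed steps; the real script may differ)."""
    from PIL import ImageOps
    return ImageOps.autocontrast(ImageOps.grayscale(image))


def extract_text(image_path, output_path, langs=None):
    """Run OCR on image_path and write the result to output_path as UTF-8."""
    from PIL import Image
    import pytesseract

    if langs is None:
        # Default to every installed language, skipping the "osd" helper data.
        installed = pytesseract.get_languages(config="")
        langs = "+".join(code for code in installed if code != "osd")
    with Image.open(image_path) as image:
        text = pytesseract.image_to_string(preprocess(image), lang=langs)
    Path(output_path).write_text(text, encoding="utf-8")
    return text
```

For example, `extract_text("input.png", "output.txt", langs="eng+ara")` mirrors the multi-language command line shown above.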
Tips
- Language Selection Options:
  - Command Line: Use --lang for automated/scripted usage
    - --lang all for maximum compatibility
    - --lang eng for English-only documents
    - --lang eng+ara for bilingual documents
  - Interactive Menu: Run without --lang for manual selection
    - Select "ALL" to use all available languages
    - Select specific languages for better performance
    - Use Ctrl+Click to select multiple languages
    - Press Escape to cancel the operation
- Performance Optimization:
  - Use fewer specific languages for faster processing
  - Use --lang all only when document language is unknown
  - Command-line specification is faster than interactive selection
- Keyboard Shortcuts: You can create a keyboard shortcut to run the script for easy access.
Example for Hyprland users:
- Put the following lines in your hyprland.conf file:
# If installed via AUR/makepkg
bind = $mainMod SHIFT, E, exec, OCR4Linux # OCR4Linux with interactive selection
bind = $mainMod SHIFT, T, exec, OCR4Linux --lang eng # OCR4Linux with English only
# If installed via setup.sh
# bind = $mainMod SHIFT, E, exec, ~/.config/OCR4Linux/OCR4Linux.sh
Example for dwm users:
- Put the following lines in your config.h file:
/* If installed via AUR/makepkg */
static const char *ocr4linux[] = { "OCR4Linux", NULL };
static const char *ocr4linux_eng[] = { "OCR4Linux", "--lang", "eng", NULL };
{ MODKEY | ShiftMask, XK_e, spawn, {.v = ocr4linux } }, // OCR4Linux interactive
{ MODKEY | ShiftMask, XK_t, spawn, {.v = ocr4l
-
