scambier / Obsidian Text Extractor - A (companion) plugin to facilitate the extraction of text from images (OCR) and PDFs.
Ronin-CK / QuickSnip - ⚡ Lightweight Wayland OCR & Google Lens utility built with Quickshell.
docwire / Docwire - DocWire SDK: Award-winning modern data processing in C++20. SourceForge Community Choice & Microsoft support. AI-driven processing. Supports nearly 100 data formats, including email boxes and OCR. Boosts efficiency in text extraction, web data extraction, data mining, and document analysis. Offline processing is possible for security and confidentiality.
felipeochoa / Minecart - Simple, Pythonic extraction of text, shapes and images from PDFs
AmanSavaria1402 / TableNet - TableNet: a deep learning model for end-to-end table detection and tabular data extraction from scanned document images. More and more people share documents as photos taken with smartphones, and many of these documents contain one or more tables holding important information. Extracting that information is usually done manually, which is slow and labor-intensive. TableNet is an end-to-end deep learning model that, given only the document image, localizes the tabular region, recognizes the table structure (columns), and extracts the text from it. Earlier state-of-the-art deep learning methods treated table detection and table structure recognition (recognizing rows and columns) as separate problems. Given the interdependence of the two tasks, TableNet treats them as related sub-problems and solves them with a single neural network, which also makes it a relatively lightweight, less compute-intensive solution.
mohaps / Xtractor - XTractor is an algorithmic text extractor for web pages, written in Java. It builds upon the "commonly used web design practices" approach (from readability.js, goose and snacktory) to create a set of heuristics for fast article text extraction. It adds several features such as paragraph preservation, better image detection heuristics, and sibling-score-based enhancements to article detection.
rootiest / Obsidian AI Image Ocr - Obsidian plugin for AI-powered text extraction from images
mathigatti / Img2txt - Easy formatted text extraction from images using Google Vision API
arshad-yaseen / Ocr Llm - ⚡️ Fast, ultra-accurate text extraction from any image or PDF, including challenging ones, with structured markdown output powered by vision models.
Deepshikha05 / Text Extraction From Image - Extracts Text out of Images using OpenCV and Pytesseract
SerdarHelli / MRZ Passport Reader From Image - MRZ Passport Reader from Image is a Python-based tool that automatically detects, segments, and extracts text from the Machine-Readable Zone (MRZ) of passport images. Utilizing deep learning models for segmentation and face detection, alongside EasyOCR for text recognition, it ensures accurate and efficient MRZ data extraction.
Praneet9 / Docify - A service for extracting text from ID cards in India, like Aadhar Card, PAN Card and Driving Licence. You just need to take a picture of the card and send it to the API to get a JSON response with your details. It was built using Flask, Deep Learning and Image Processing. It also uses Connectionist Text Proposal Network (Open Source) along with Tesseract for text extraction.
viky08 / Optical Character Recognition Text Extraction From Images - A Python application based on machine learning and deep learning that detects text/sentences in an image, using a CNN in the Keras framework and OpenCV.
EmilHvitfeldt / Quickpalette - 🏃♀️🎨 R package for quick extraction of color palettes from text and images
edwineas / Ubuntu Text Capture - Ubuntu Text Capture is a Python tool that captures a selected area of the screen, extracts text using Tesseract OCR, and copies it to the clipboard. It includes a customizable GNOME keyboard shortcut (Shift + Ctrl + T) for quick activation, making text extraction from images fast and easy.
VMD7 / Automate Identification And Recognition Of Handwritten Text From An Image - This project offers an efficient method for identifying and recognizing handwritten text from images. Using a Convolutional Recurrent Neural Network (CRNN) for Optical Character Recognition (OCR), it effectively extracts text from images, aiding in the digitization of handwritten documents and automated text extraction.
vinodbaste / PaddleOCR Rec Dec - Optical Character Recognition (OCR) is a powerful technology that enables machines to recognize and extract text from images or scanned documents. OCR finds applications in various fields, including document digitization, text extraction from images, and text-based data analysis.
vivaneiona / Genkit Unstruct - Concurrent data extraction from unstructured text and images using AI models.
Priyansu-Bhandari / EasyOCR Project - Uses the EasyOCR library to extract text from images containing English and Hindi characters.
shubham99bisht / Expense Tracker - Key Information Extraction from Scanned Receipts: the aim of this project is to extract the text of a number of key fields from given receipts and save the text for each receipt image in a JSON file.