Docsifer

Docsifer converts a range of data formats into Markdown for applications such as indexing and text analysis. It supports PDF, PowerPoint, Word, Excel, images, audio, HTML, and other text-based formats, and can optionally use LLMs to improve extraction.


Install / Use

/learn @lh0x00/Docsifer

README

title: Docsifer
emoji: 👻 / 📚
colorFrom: green
colorTo: indigo
sdk: docker
app_file: app.py
pinned: false

📄 Docsifer: Efficient Data Conversion to Markdown

Docsifer is a powerful FastAPI + Gradio service for converting various data formats (PDF, PowerPoint, Word, Excel, Images, Audio, HTML, etc.) to Markdown. It leverages the MarkItDown library and can optionally use LLMs (via OpenAI) for richer extraction (OCR, speech-to-text, etc.).

✨ Key Features

Comprehensive Format Support:
- PDF: Extracts text and structure effectively.
- PowerPoint: Converts slides into Markdown-friendly content.
- Word: Processes .docx files with precision.
- Excel: Extracts tabular data as Markdown tables.
- Images: Reads EXIF metadata and applies OCR for text extraction.
- Audio: Retrieves EXIF metadata and performs speech transcription.
- HTML: Transforms web pages into Markdown.
- Text-Based Formats: Handles CSV, JSON, XML with ease.
- ZIP Files: Iterates over contents for batch processing.
LLM Integration: Leverages OpenAI's GPT-4 for enhanced extraction quality and contextual understanding.
Efficient and Fast: Optimized for speed while maintaining high accuracy.
Easy Deployment: Dockerized for hassle-free setup and scalability.
Interactive Playground: Test conversion processes interactively using a Gradio-powered interface.
Usage Analytics: Tracks token usage and access statistics via Upstash Redis.

🚀 Use Cases

Knowledge Indexing: Convert various document formats into Markdown for indexing and search.
Text Analysis: Prepare data for semantic analysis and NLP tasks.
Content Transformation: Simplify content preparation for blogs, documentation, or databases.
Metadata Extraction: Extract meaningful metadata from images and audio for categorization and tagging.

🛠️ Getting Started

1. Clone the Repository

git clone https://github.com/lh0x00/docsifer.git
cd docsifer

2. Build and Run with Docker

Make sure Docker is installed and running on your machine.

docker build -t docsifer .
docker run -p 7860:7860 docsifer

The API will now be accessible at http://localhost:7860.
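
The container may take a few seconds to start answering requests. The sketch below is one way to wait for it from Python using only the standard library. The helper names `http_probe` and `wait_until_up` are hypothetical, and the retry count and delay are arbitrary defaults rather than values taken from the project.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str) -> bool:
    """Return True if the URL answers with any HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=2):
            return True
    except urllib.error.HTTPError:
        return True  # the server is up even if this path returns an error status
    except (urllib.error.URLError, OSError):
        return False  # connection refused, timeout, DNS failure, etc.


def wait_until_up(
    url: str,
    probe: Callable[[str], bool] = http_probe,
    attempts: int = 10,
    delay: float = 1.0,
) -> bool:
    """Poll the URL until the probe succeeds or the attempts run out."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)  # pause between attempts, not after the last one
    return False
```

Calling `wait_until_up("http://localhost:7860")` after `docker run` returns `True` once the service responds, or `False` if it never does within the allotted attempts.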

📖 API Overview

Endpoints

/v1/convert: Convert a file to Markdown. Supports both file uploads and file path inputs. Accepts optional OpenAI parameters to enable LLM-based enhancements.
/v1/stats: Retrieve usage statistics, including access counts and token usage.
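
A file upload to /v1/convert might be assembled as in the sketch below, which builds a multipart/form-data request with the standard library and does not send it. The form field names `file` and `openai`, and the idea of passing the OpenAI parameters as a JSON string, are assumptions for illustration; the interactive API docs are the authority on the actual schema.

```python
import json
import uuid
import urllib.request
from typing import Optional


def build_convert_request(
    base_url: str,
    filename: str,
    content: bytes,
    openai_config: Optional[dict] = None,
) -> urllib.request.Request:
    """Build a multipart/form-data POST for /v1/convert (the request is not sent)."""
    boundary = uuid.uuid4().hex
    parts = []

    # Assumed field name "file" for the uploaded document.
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode("utf-8")
        + content
        + b"\r\n"
    )

    if openai_config is not None:
        # Assumed field name "openai" carrying the optional settings as JSON.
        parts.append(
            (
                f"--{boundary}\r\n"
                'Content-Disposition: form-data; name="openai"\r\n\r\n'
                f"{json.dumps(openai_config)}\r\n"
            ).encode("utf-8")
        )

    body = b"".join(parts) + f"--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/convert",
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
```

With the server running, the request could be sent with `urllib.request.urlopen(req)` and the response body read as text. The response format is not described here, so inspect it before parsing.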

Interactive Docs

Visit the Swagger UI for detailed, interactive documentation.
Explore additional resources with ReDoc.

🔬 Playground

Interactive Conversion

Test file conversion directly in the browser using the Gradio interface.
Simply visit http://localhost:7860 after starting the server to access the playground.

Features

File Upload: Upload a file directly or provide a local file path.
OpenAI Integration: Optionally provide OpenAI API details to enhance conversion with LLM capabilities.
Conversion Result: View the resulting Markdown output instantly.
Usage Statistics: Monitor access and token usage through the Gradio interface.

🌐 Resources

Documentation: Explore full documentation
Hugging Face Space: Try the live demo
GitHub Repository: View source code

💡 Why Docsifer?

Versatile and Comprehensive: Handles a wide range of formats, making it a one-stop solution for content conversion.
AI-Powered: Uses OpenAI's GPT-4 to enhance extraction accuracy and adapt to complex data structures.
User-Friendly: Offers intuitive APIs and a built-in interactive interface for experimentation.
Scalable and Efficient: Optimized for performance with Docker support and asynchronous processing.
Transparent Analytics: Tracks usage metrics to help monitor and manage service consumption.

👥 Contributors

lamhieu / lh0x00 – Creator and Maintainer (GitHub, HuggingFace)

Contributions are welcome! Check out the contribution guidelines.

📜 License

This project is licensed under the MIT License. See the LICENSE file for details.


lh0x00


GitHub Stars: 9

Category: Data

Updated: 1mo ago

Forks: 1

lh0x00/docsifer

Languages

Python

Security Score

90/100

Audited on Jan 29, 2026

No findings