Docsifer
Docsifer is a powerful tool for converting various data formats into Markdown for applications such as indexing, text analysis, and more. It supports PDF, PowerPoint, Word, Excel, Images, Audio, HTML, and other text-based formats, and leverages LLMs to enhance performance.
📄 Docsifer: Efficient Data Conversion to Markdown
Docsifer is a powerful FastAPI + Gradio service for converting various data formats (PDF, PowerPoint, Word, Excel, Images, Audio, HTML, etc.) to Markdown. It leverages the MarkItDown library and can optionally use LLMs (via OpenAI) for richer extraction (OCR, speech-to-text, etc.).
✨ Key Features
- Comprehensive Format Support:
  - PDF: Extracts text and structure effectively.
  - PowerPoint: Converts slides into Markdown-friendly content.
  - Word: Processes .docx files with precision.
  - Excel: Extracts tabular data as Markdown tables.
  - Images: Reads EXIF metadata and applies OCR for text extraction.
  - Audio: Retrieves EXIF metadata and performs speech transcription.
  - HTML: Transforms web pages into Markdown.
  - Text-Based Formats: Handles CSV, JSON, XML with ease.
  - ZIP Files: Iterates over contents for batch processing.
- LLM Integration: Leverages OpenAI's GPT-4 for enhanced extraction quality and contextual understanding.
- Efficient and Fast: Optimized for speed while maintaining high accuracy.
- Easy Deployment: Dockerized for hassle-free setup and scalability.
- Interactive Playground: Test conversion processes interactively using a Gradio-powered interface.
- Usage Analytics: Tracks token usage and access statistics via Upstash Redis.
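To illustrate the "iterates over contents" idea from the ZIP Files item above, here is a minimal sketch using only Python's standard `zipfile` module. It is not Docsifer's or MarkItDown's actual implementation. The helper name `iter_zip_members` is hypothetical, and the sketch only shows how each archive member could be handed to a converter one at a time.

```python
import io
import zipfile


def iter_zip_members(zip_bytes):
    """Yield (filename, content_bytes) for every regular file in a ZIP archive.

    Hypothetical helper for illustration; directory entries are skipped.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            yield info.filename, archive.read(info)
```

Each yielded pair could then be sent to the conversion step individually, which is one way batch processing over an archive might be organized.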
🚀 Use Cases
- Knowledge Indexing: Convert various document formats into Markdown for indexing and search.
- Text Analysis: Prepare data for semantic analysis and NLP tasks.
- Content Transformation: Simplify content preparation for blogs, documentation, or databases.
- Metadata Extraction: Extract meaningful metadata from images and audio for categorization and tagging.
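For the knowledge-indexing use case above, converted Markdown usually has to be broken into smaller units before it is indexed. The sketch below shows one possible approach: splitting on Markdown headings while ignoring `#` lines inside fenced code blocks. The function name `split_markdown_sections` and the output shape are assumptions for illustration and are not part of Docsifer.

```python
import re

# ATX-style headings: one to six '#' characters followed by whitespace.
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def split_markdown_sections(markdown):
    """Split Markdown text into a list of {"title", "text"} sections."""
    sections = []
    title, lines = "", []
    in_fence = False

    def flush():
        # Keep a section if it has a title or any non-blank body line.
        if title or any(line.strip() for line in lines):
            sections.append({"title": title, "text": "\n".join(lines).strip()})

    for line in markdown.splitlines():
        if line.lstrip().startswith("```"):
            in_fence = not in_fence  # toggle on opening and closing fences
        match = None if in_fence else HEADING.match(line)
        if match:
            flush()
            title, lines = match.group(2).strip(), []
        else:
            lines.append(line)
    flush()
    return sections
```

Text that appears before the first heading is returned as a section with an empty title, so nothing from the converted document is dropped.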
🛠️ Getting Started
1. Clone the Repository
git clone https://github.com/lh0x00/docsifer.git
cd docsifer
2. Build and Run with Docker
Make sure Docker is installed and running on your machine.
docker build -t docsifer .
docker run -p 7860:7860 docsifer
The API will now be accessible at http://localhost:7860.
📖 API Overview
Endpoints
- /v1/convert: Convert a file to Markdown. Supports both file uploads and file path inputs. Accepts optional OpenAI parameters to enable LLM-based enhancements.
- /v1/stats: Retrieve usage statistics, including access counts and token usage.
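As a rough sketch of how a client might call these endpoints from Python, the helpers below build the HTTP requests using only the standard library. Several details are assumptions not confirmed by this README: the multipart field names `file` and `openai`, sending the OpenAI parameters as a JSON string, and using GET for `/v1/stats`. Check the interactive docs of your running instance for the actual request schema before relying on this.

```python
import json
import uuid
import urllib.request

BASE_URL = "http://localhost:7860"  # address used in the Docker step


def build_convert_request(filename, data, openai_cfg=None, base_url=BASE_URL):
    """Build a multipart/form-data POST for /v1/convert.

    The form field names "file" and "openai" are assumptions.
    """
    boundary = uuid.uuid4().hex
    parts = []

    def add_part(headers, payload):
        parts.append(f"--{boundary}\r\n".encode())
        parts.append((headers + "\r\n\r\n").encode())
        parts.append(payload)
        parts.append(b"\r\n")

    add_part(
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream",
        data,
    )
    if openai_cfg is not None:
        # Assumed: optional OpenAI parameters travel as a JSON-encoded field.
        add_part(
            'Content-Disposition: form-data; name="openai"',
            json.dumps(openai_cfg).encode(),
        )
    parts.append(f"--{boundary}--\r\n".encode())

    return urllib.request.Request(
        f"{base_url}/v1/convert",
        data=b"".join(parts),
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


def build_stats_request(base_url=BASE_URL):
    """Build a request for /v1/stats (the GET method is an assumption)."""
    return urllib.request.Request(f"{base_url}/v1/stats", method="GET")


# Example usage against a running server:
#   req = build_convert_request("page.html", b"<h1>Hello</h1>")
#   with urllib.request.urlopen(req) as resp:
#       print(resp.read().decode())
```

Building the request separately from sending it keeps the sketch easy to inspect and adapt once the real field names are known.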
Interactive Docs
- Visit the Swagger UI for detailed, interactive documentation.
- Explore additional resources with ReDoc.
🔬 Playground
Interactive Conversion
- Test file conversion directly in the browser using the Gradio interface.
- Simply visit http://localhost:7860 after starting the server to access the playground.
Features
- File Upload: Upload a file directly or provide a local file path.
- OpenAI Integration: Optionally provide OpenAI API details to enhance conversion with LLM capabilities.
- Conversion Result: View the resulting Markdown output instantly.
- Usage Statistics: Monitor access and token usage through the Gradio interface.
🌐 Resources
- Documentation: Explore full documentation
- Hugging Face Space: Try the live demo
- GitHub Repository: View source code
💡 Why Docsifer?
- Versatile and Comprehensive: Handles a wide range of formats, making it a one-stop solution for content conversion.
- AI-Powered: Uses OpenAI's GPT-4 to enhance extraction accuracy and adapt to complex data structures.
- User-Friendly: Offers intuitive APIs and a built-in interactive interface for experimentation.
- Scalable and Efficient: Optimized for performance with Docker support and asynchronous processing.
- Transparent Analytics: Tracks usage metrics to help monitor and manage service consumption.
👥 Contributors
- lamhieu / lh0x00 – Creator and Maintainer (GitHub, HuggingFace)
Contributions are welcome! Check out the contribution guidelines.
📜 License
This project is licensed under the MIT License. See the LICENSE file for details.
