OmniDocs
OmniDocs crawls documentation websites and converts them into well-formatted, structured Markdown files. The output is LLM-friendly and suited to AI ingestion, semantic search, and knowledge base building.
🌟 Features
- Smart Crawling: Automatically identifies and targets only documentation pages
- Structured Conversion: Preserves document hierarchy and navigation order
- LLM-Optimized Output: Produces clean, consistent Markdown ideal for AI/ML pipelines, RAG, and vector databases
- Flexible Output: Choose between single consolidated Markdown file or multi-file ZIP archive
- High-Fidelity Markdown: Accurately converts tables, code blocks, lists, and more
- User-Friendly Interface: Simple form with advanced options for customization
- Responsive Design: Works on desktop and mobile devices with dark mode support
- Real-time Progress: Live updates during conversion process
- Temporary Storage: Files automatically deleted after 1 hour (users notified)
🌐 Live Demo
Visit omnidocs.pat.network to try OmniDocs now!
📋 User Guide
Basic Usage
- Enter the URL of the documentation site you want to convert
- Click "Convert Site"
- Wait for the conversion to complete (you'll see a progress indicator)
- Download your converted documentation as either:
  - A single Markdown file (all_docs.md)
  - A ZIP archive containing individual Markdown files
Advanced Options
- Path Prefix: Limit crawling to specific sections of a documentation site
- Include/Exclude Patterns: Fine-tune which pages get crawled using regex patterns
- Output Format: Choose between consolidated Markdown or multi-file ZIP
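To illustrate how a path prefix and include/exclude patterns might combine, here is a minimal sketch. The function name `should_crawl`, the choice to match patterns against the URL path, and the rule that exclude patterns take precedence are all assumptions made for illustration, not OmniDocs' actual implementation.

```python
import re
from urllib.parse import urlparse


def should_crawl(url, path_prefix="", include=(), exclude=()):
    """Hypothetical filter: decide whether a URL passes the advanced options.

    Assumptions (not confirmed by OmniDocs): patterns are matched with
    re.search against the URL path, and exclude patterns win over include.
    """
    path = urlparse(url).path or "/"
    # Path prefix: only pages under this section are considered.
    if path_prefix and not path.startswith(path_prefix):
        return False
    # Exclude patterns: any match rejects the page.
    if any(re.search(p, path) for p in exclude):
        return False
    # Include patterns: if any are given, at least one must match.
    if include and not any(re.search(p, path) for p in include):
        return False
    return True
```

For example, `should_crawl("https://example.com/docs/changelog/v1", path_prefix="/docs", exclude=[r"/changelog"])` rejects the page, while the same call with `/docs/api/users` accepts it.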
Important Notes
- Download Your Files Promptly: All converted files are automatically deleted after 1 hour
- Large Sites: Complex documentation sites with many pages may take several minutes to process
- Same-Domain Limitation: OmniDocs only crawls pages within the same domain as the seed URL
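The same-domain limitation can be pictured with a short sketch. It assumes "same domain" means an exact hostname match with the seed URL. OmniDocs may treat subdomains differently, and `same_domain_links` is a hypothetical helper name.

```python
from urllib.parse import urljoin, urlparse


def same_domain_links(seed_url, hrefs):
    """Resolve hrefs against the seed URL and keep only same-host links.

    Assumption: "same domain" is an exact hostname match, so a subdomain
    such as api.example.com would be skipped for a docs.example.com seed.
    """
    seed_host = urlparse(seed_url).hostname
    kept = []
    for href in hrefs:
        absolute = urljoin(seed_url, href)  # resolves relative links
        if urlparse(absolute).hostname == seed_host:
            kept.append(absolute)
    return kept
```

Relative links are resolved before the comparison, so `setup` and `/api/ref` are kept, while links to other hosts and non-HTTP schemes such as `mailto:` are dropped.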
🛠️ Installation
Prerequisites
- Python 3.9 or higher
- Node.js 16 or higher
- Redis (for Celery task queue)
- Cloudflare R2 account or compatible S3 storage
Setup
- Clone the repository:

      git clone https://github.com/xvc323/omnidocs.git
      cd omnidocs

- Install Python dependencies:

      pip install -r requirements.txt

- Install frontend dependencies:

      cd frontend
      npm install
      cd ..

- Set up environment variables (create a .env file based on env.example):

      R2_ACCOUNT_ID=your_account_id
      R2_ACCESS_KEY_ID=your_access_key
      R2_SECRET_ACCESS_KEY=your_secret_key
      R2_BUCKET_NAME=your_bucket_name
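Before starting the services, it can help to confirm that the four R2 variables are actually set. The sketch below is one way to do that. `load_r2_settings` is a hypothetical helper, and it reads the process environment, so it assumes the .env values have already been exported or loaded by whatever mechanism the project uses. The endpoint format is Cloudflare's standard S3-compatible R2 endpoint. Whether OmniDocs builds it this way is an assumption.

```python
import os

REQUIRED_VARS = (
    "R2_ACCOUNT_ID",
    "R2_ACCESS_KEY_ID",
    "R2_SECRET_ACCESS_KEY",
    "R2_BUCKET_NAME",
)


def load_r2_settings(env=os.environ):
    """Return the R2 settings as a dict, or raise if any are missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED_VARS}
    # Cloudflare R2's S3-compatible endpoint is derived from the account ID.
    settings["endpoint_url"] = (
        f"https://{settings['R2_ACCOUNT_ID']}.r2.cloudflarestorage.com"
    )
    return settings
```

Failing early with the list of missing names tends to be easier to debug than a storage error in the middle of a conversion job.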
🚀 Running Locally
Start all services with the provided script:
./start-omnidocs.sh
Or start each component manually:
- Start Redis (required for Celery):

      redis-server

- Start Celery worker:

      celery -A celery_app worker --loglevel=info

- Start Celery beat (for scheduled tasks):

      celery -A celery_app beat --loglevel=info

- Start the API server:

      uvicorn api_main:app --reload

- Start the frontend (in a separate terminal):

      cd frontend && npm run dev

- Open your browser and navigate to http://localhost:3000
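When starting the components manually, a quick check that each one is listening can save some guesswork. The sketch below assumes default ports: 6379 for Redis and 8000 for uvicorn (neither is stated in this README), plus 3000 for the frontend as given above. `check_services` is a hypothetical helper, and the Celery worker and beat processes are not covered because they do not listen on a port.

```python
import socket

# Assumed default ports; adjust if your setup differs.
SERVICES = {"Redis": 6379, "API server": 8000, "Frontend": 3000}


def is_listening(host, port, timeout=0.5):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host="localhost"):
    """Map each service name to whether its port accepts connections."""
    return {name: is_listening(host, port) for name, port in SERVICES.items()}


if __name__ == "__main__":
    for name, up in check_services().items():
        print(f"{name}: {'up' if up else 'not reachable'}")
```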
🐳 Docker Deployment
OmniDocs can be deployed using Docker:
docker-compose up -d
For Railway deployments, use the provided Railway configuration files in the railway/ directory.
💻 API Endpoints
- POST /convert - Start a new conversion job
- GET /download/{jobId} - Download converted file
- GET /api/jobs/{jobId}/events - SSE endpoint for job progress updates
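The endpoints above could be driven from a script roughly as follows. This is a sketch under several assumptions: the base URL uses uvicorn's default port 8000, the POST body is JSON of the form `{"url": ...}`, and the response and event payloads are treated as opaque because their fields are not documented here. The `parse_sse` helper handles only `data:` fields of the Server-Sent Events format.

```python
import json
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:8000"  # assumption: uvicorn's default port


def parse_sse(lines):
    """Yield the data payload of each event from an iterable of SSE text lines."""
    data = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[5:]
            # The SSE format strips one optional space after the colon.
            data.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and data:
            # A blank line ends the event; multi-line data is joined with "\n".
            yield "\n".join(data)
            data = []


def start_conversion(site_url):
    """POST /convert. The JSON body shape {"url": ...} is an assumption."""
    req = Request(
        f"{BASE_URL}/convert",
        data=json.dumps({"url": site_url}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(req) as resp:
        return json.load(resp)  # response fields are not documented here


def stream_progress(job_id):
    """Yield raw progress payloads from GET /api/jobs/{jobId}/events."""
    with urlopen(f"{BASE_URL}/api/jobs/{job_id}/events") as resp:
        yield from parse_sse(line.decode("utf-8") for line in resp)
```

Once the job reports completion, the result would be fetched from `GET /download/{jobId}`. Keep in mind that converted files are deleted after 1 hour.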
📄 License
This project is licensed under the MIT License. See the LICENSE file for details.
