Verba
Retrieval Augmented Generation (RAG) chatbot powered by Weaviate
The Golden RAGtriever - Community Edition ✨
Welcome to Verba: The Golden RAGtriever, a community-driven open-source application designed to offer an end-to-end, streamlined, and user-friendly interface for Retrieval-Augmented Generation (RAG) out of the box. In just a few easy steps, explore your datasets and extract insights, either locally with Ollama and HuggingFace or through LLM providers such as Anthropic, Cohere, and OpenAI. This project is built with and for the community, so please be aware that it might not be maintained with the same urgency as other Weaviate production applications. Feel free to contribute to the project and help us make Verba even better! <3
pip install goldenverba

- Verba
- ✨ Getting Started with Verba
- 🔑 API Keys
- Quickstart: Deploy with pip
- Quickstart: Build from Source
- Quickstart: Deploy with Docker
- 💾 Verba Walkthrough
- 💖 Open Source Contribution
- 🚩 Known Issues
- ❔ FAQ
What Is Verba?
Verba is a fully customizable personal assistant that uses Retrieval-Augmented Generation (RAG) to query and interact with your data, either locally or deployed in the cloud. Resolve questions about your documents, cross-reference multiple data points, or gain insights from existing knowledge bases. Verba combines state-of-the-art RAG techniques with Weaviate's context-aware database. Choose between different RAG frameworks, data types, chunking and retrieval techniques, and LLM providers based on your individual use case.
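For readers new to RAG, the retrieve-then-generate flow described above can be sketched in a few lines. This is a minimal illustration only, not Verba's implementation: the function names `retrieve` and `build_prompt` are hypothetical, retrieval here is a naive word-overlap score rather than a vector search in Weaviate, and the final call to an LLM is left out.

```python
def retrieve(query, chunks, k=2):
    """Return up to k chunks that share the most words with the query.

    Assumption: plain word overlap stands in for a real vector or hybrid search.
    """
    query_words = set(query.lower().split())
    scored = [
        (len(query_words & set(chunk.lower().split())), index, chunk)
        for index, chunk in enumerate(chunks)
    ]
    # Highest overlap first; ties keep the original document order.
    top = sorted(scored, key=lambda item: (-item[0], item[1]))[:k]
    return [chunk for score, _, chunk in top if score > 0]


def build_prompt(query, context_chunks):
    """Assemble the prompt that would be sent to a generation model."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


chunks = [
    "Verba is a RAG application",
    "Weaviate is a vector database",
    "Bananas are yellow",
]
context = retrieve("what is verba", chunks)
prompt = build_prompt("what is verba", context)
```

The prompt built this way would then be passed to whichever generation model is configured.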
Open Source Spirit
Weaviate is proud to offer this open-source project for the community. While we strive to address issues as fast as we can, please understand that it may not be maintained with the same rigor as production software. We welcome and encourage community contributions to help keep it running smoothly. Your support in fixing open issues quickly is greatly appreciated.
Feature Lists
| 🤖 Model Support | Implemented | Description |
| --------------------------------- | ----------- | ------------------------------------------------------- |
| Ollama (e.g. Llama3) | ✅ | Local Embedding and Generation Models powered by Ollama |
| HuggingFace (e.g. MiniLMEmbedder) | ✅ | Local Embedding Models powered by HuggingFace |
| Cohere (e.g. Command R+) | ✅ | Embedding and Generation Models by Cohere |
| Anthropic (e.g. Claude Sonnet) | ✅ | Embedding and Generation Models by Anthropic |
| OpenAI (e.g. GPT4) | ✅ | Embedding and Generation Models by OpenAI |
| Groq (e.g. Llama3) | ✅ | Generation Models by Groq (LPU inference) |
| Novita AI (e.g. Llama3.3) | ✅ | Generation Models by Novita AI |
| Upstage (e.g. Solar) | ✅ | Embedding and Generation Models by Upstage |

| 🤖 Embedding Support | Implemented | Description |
| -------------------- | ----------- | ---------------------------------------- |
| Weaviate | ✅ | Embedding Models powered by Weaviate |
| Ollama | ✅ | Local Embedding Models powered by Ollama |
| SentenceTransformers | ✅ | Embedding Models powered by HuggingFace |
| Cohere | ✅ | Embedding Models by Cohere |
| VoyageAI | ✅ | Embedding Models by VoyageAI |
| OpenAI | ✅ | Embedding Models by OpenAI |
| Upstage | ✅ | Embedding Models by Upstage |

| 📁 Data Support | Implemented | Description |
| ------------------------------ | ----------- | ---------------------------------------------- |
| UnstructuredIO | ✅ | Import Data through Unstructured |
| Firecrawl | ✅ | Scrape and Crawl URLs through Firecrawl |
| UpstageDocumentParse | ✅ | Parse Documents through Upstage Document AI |
| PDF Ingestion | ✅ | Import PDF into Verba |
| GitHub & GitLab | ✅ | Import Files from GitHub and GitLab |
| CSV/XLSX Ingestion | ✅ | Import Table Data into Verba |
| .DOCX | ✅ | Import .docx files |
| Multi-Modal (using AssemblyAI) | ✅ | Import and Transcribe Audio through AssemblyAI |

| ✨ RAG Features | Implemented | Description |
| ----------------------- | --------------- | ------------------------------------------------------------------------- |
| Hybrid Search | ✅ | Semantic Search combined with Keyword Search |
| Autocomplete Suggestion | ✅ | Verba suggests autocompletion |
| Filtering | ✅ | Apply Filters (e.g. documents, document types etc.) before performing RAG |
| Customizable Metadata | ✅ | Free control over Metadata |
| Async Ingestion | ✅ | Ingest data asynchronously to speed up the process |
| Advanced Querying | planned ⏱️ | Task Delegation Based on LLM Evaluation |
| Reranking | planned ⏱️ | Rerank results based on context for improved results |
| RAG Evaluation | planned ⏱️ | Interface for Evaluating RAG pipelines |
| Agentic RAG | out of scope ❌ | Agentic RAG pipelines |
| Graph RAG | out of scope ❌ | Graph-based RAG pipelines |

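To make the Hybrid Search row more concrete, the sketch below shows one common way to combine semantic and keyword results: normalize each score set, then blend them with a weight. This is an illustration under stated assumptions, not Verba's or Weaviate's actual implementation. The names `normalize`, `hybrid_rank`, and the `alpha` weighting parameter are chosen for this example, and the input scores are made up.

```python
def normalize(scores):
    """Min-max normalize a {doc_id: score} mapping into the range [0, 1]."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        # All scores are equal, so every document gets the same weight.
        return {doc: 1.0 for doc in scores}
    return {doc: (score - low) / (high - low) for doc, score in scores.items()}


def hybrid_rank(keyword_scores, vector_scores, alpha=0.5):
    """Blend keyword and vector scores.

    Assumption: alpha=1.0 means pure semantic ranking, alpha=0.0 pure keyword.
    A document missing from one result set contributes 0 for that side.
    """
    keyword = normalize(keyword_scores)
    vector = normalize(vector_scores)
    fused = {
        doc: alpha * vector.get(doc, 0.0) + (1 - alpha) * keyword.get(doc, 0.0)
        for doc in set(keyword) | set(vector)
    }
    # Best score first; ties are broken alphabetically by document id.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


keyword_scores = {"a": 2.0, "b": 1.0, "c": 0.0}  # e.g. keyword-match scores
vector_scores = {"a": 0.1, "b": 0.9, "d": 0.5}   # e.g. similarity scores
ranking = [doc for doc, _ in hybrid_rank(keyword_scores, vector_scores, alpha=0.5)]
```

With equal weighting, document `b` ranks first because it scores well on both sides, while `a` is strong only on keywords and `d` only on similarity.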
| 🗡️ Chunking Techniques | Implemented | Description |
| ---------------------- | ----------- | ----------------------------------------------- |
| Token | ✅ | Chunk by Token powered by spaCy |
| Sentence | ✅ | Chunk by Sentence powered by spaCy |
| Semantic | ✅ | Chunk and group by semantic sentence similarity |
| Recursive | ✅ | Recursively chunk data based on rules |
| HTML | ✅ | Chunk HTML files |
| Markdown | ✅ | Chunk Markdown files |
| Code | ✅ | Chunk Code files |
| JSON | ✅ | Chunk JSON files |

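As a rough illustration of the Sentence technique in the table above, the sketch below groups sentences into overlapping chunks. It is a simplified stand-in, not Verba's chunker: Verba uses spaCy for sentence detection, whereas this example assumes a naive regular-expression splitter, and the names `split_sentences`, `chunk_by_sentence`, `size`, and `overlap` are hypothetical.

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace.

    Assumption: this replaces a proper sentence segmenter such as spaCy.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def chunk_by_sentence(text, size=3, overlap=1):
    """Group sentences into chunks of `size`, sharing `overlap` sentences."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and 0 <= overlap < size")
    sentences = split_sentences(text)
    step = size - overlap
    chunks = []
    for start in range(0, len(sentences), step):
        chunks.append(" ".join(sentences[start:start + size]))
        if start + size >= len(sentences):
            break  # The last sentence is covered; avoid a redundant tail chunk.
    return chunks


text = "One. Two! Three? Four. Five."
chunks = chunk_by_sentence(text, size=2, overlap=1)
```

Overlap keeps a sentence's neighbors available in adjacent chunks, which can help retrieval when an answer spans a chunk boundary, at the cost of storing some text twice.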
| 🆒 Cool Bonus | Implemented | Description |
| ------------------------ | --------------- | --------------------------------------------------------------------- |
| Docker Support | ✅ | Verba is deployable via Docker |
| Customizable Frontend | ✅ | Verba's frontend can be fully customized from within the frontend itself |
| Vector Viewer | ✅ | Visualize your data in 3D |
| Multi-User Collaboration | out of scope ❌ | Multi-User Collaboration in Verba |

| 🤝 RAG Libraries | Implemented | Description |
| ---------------- | ----------- | ---------------------------------- |
| LangChain | ✅ | Implement LangChain RAG pipelines |
| Haystack | planned ⏱️ | Implement Haystack RAG pipelines |
| LlamaIndex | planned ⏱️ | Implement LlamaIndex RAG pipelines |

Someth
