ChatBase


<p align="center"> 【English | <a href="README_Chinese.md">中文</a>】 </p>

🕹 Quick Start

1. Getting Started

Environment Setup

Backend Environment Setup

First, ensure your machine has Python 3.8 - 3.10 installed.

$ python --version
Python 3.10.12
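
The same check can be made from inside Python. This is a minimal sketch, assuming the range stated above (3.8 through 3.10, inclusive); the helper name `is_supported_python` is hypothetical and is not part of ChatBase.

```python
import sys

# Supported range taken from the text above: Python 3.8 through 3.10 inclusive.
MIN_VERSION = (3, 8)
MAX_VERSION = (3, 10)


def is_supported_python(version_info=None):
    """Return True if the (major, minor) pair falls within the supported range."""
    if version_info is None:
        version_info = sys.version_info
    major_minor = tuple(version_info[:2])
    return MIN_VERSION <= major_minor <= MAX_VERSION


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if is_supported_python() else "outside the supported range"
    print(f"Python {current} is {status}")
```

Only the major and minor numbers are compared, so any patch release of 3.8, 3.9, or 3.10 passes.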

Next, create a virtual environment and install the project dependencies within it.


# Clone the repository
$ git clone https://github.com/zhouxh19/ChatBase.git

# Enter the directory
$ cd ChatBase

# Install all dependencies
$ pip3 install -r requirements.txt 

# To run only the API service
$ pip3 install -r requirements_api.txt 

# The default dependencies cover the basic runtime environment, including the Chroma-DB vector database.
# To use a different vector database, uncomment the corresponding dependencies in requirements.txt before installing.

Frontend Service Setup

First, ensure your machine has Node.js (>= 18.15.0) installed.

$ node -v
v18.15.0
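
The output of `node -v` can be compared against the minimum version programmatically. This is a minimal sketch, assuming plain release strings of the form `v18.15.0`; the helper names `parse_node_version` and `node_is_new_enough` are hypothetical and are not part of ChatBase.

```python
import re

# Minimum version stated above.
MIN_NODE = (18, 15, 0)


def parse_node_version(text):
    """Parse output such as 'v18.15.0' into a tuple of ints.

    Pre-release or nightly suffixes are not handled and raise ValueError.
    """
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized Node version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def node_is_new_enough(text):
    """Return True if the parsed version is at least MIN_NODE."""
    return parse_node_version(text) >= MIN_NODE
```

Comparing integer tuples avoids the pitfall of comparing version strings lexically, where "18.9.0" would sort after "18.15.0".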

Next, install the project dependencies.

$ cd webui
# pnpm documentation: https://pnpm.io/zh/motivation
# Install dependencies (pnpm is recommended)
# You can install pnpm with "npm -g i pnpm"
$ pnpm install

Download Embedding Model from HuggingFace.

To download the model, you need to install Git LFS first, then run:

$ git lfs install
$ git clone https://huggingface.co/moka-ai/m3e-base

Set the model paths in the model configuration (model_config.py) to the download location, for example:

EMBEDDING_MODEL = "m3e-base"
LLM_MODELS = ["Qwen-1_8B-Chat"]
MODEL_PATH = {
    "embed_model": {
        "m3e-base": "m3e-base", # Download path of embedding model.
    },

    "llm_model": {
        "Qwen-1_8B-Chat": "Qwen-1_8B-Chat", # Download path of LLM.
    },
}
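
A mistyped path here tends to surface only later, during knowledge base initialization, so it can be worth checking the entries up front. This is a minimal sketch, assuming every value in `MODEL_PATH` is a local directory relative to a root you pass in; the helper `missing_model_dirs` is hypothetical and is not part of ChatBase. If a value is meant to be a remote model identifier rather than a local path, this check will report it as missing.

```python
from pathlib import Path

# Same shape as the MODEL_PATH example above.
MODEL_PATH = {
    "embed_model": {"m3e-base": "m3e-base"},
    "llm_model": {"Qwen-1_8B-Chat": "Qwen-1_8B-Chat"},
}


def missing_model_dirs(model_path, root="."):
    """Return (group, name, path) for every entry whose directory is absent."""
    missing = []
    for group, models in model_path.items():
        for name, location in models.items():
            candidate = Path(root) / location
            if not candidate.is_dir():
                missing.append((group, name, str(candidate)))
    return missing
```

Calling `missing_model_dirs(MODEL_PATH)` from the directory where the models were cloned returns an empty list when every configured directory exists.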

Modify Configuration Files

Copy the example configuration files, then edit each one as needed, following the comments inside the file.

$ python copy_config_example.py
# The generated configuration files are in the configs/ directory.
# basic_config.py is the basic configuration file and does not need to be modified.
# kb_config.py is the knowledge base configuration file. Modify DEFAULT_VS_TYPE to select the vector database used by the knowledge base; the related paths can also be changed there.
# model_config.py is the model configuration file. Modify LLM_MODELS to specify the models used. The current model configuration mainly covers knowledge base search; the diagnosis-related models are still partly hard-coded and will be unified here later.
# prompt_config.py is the prompt configuration file, mainly for LLM dialogue and knowledge base prompts.
# server_config.py is the service configuration file, mainly for settings such as the service port number.
!!! Note: Modify the following configurations before initializing the knowledge base; otherwise, database initialization may fail.

model_config.py

# EMBEDDING_MODEL   Embedding model. If you use a local model, download it to the root directory.
# LLM_MODELS        LLM. If you use a local model, download it to the root directory.
# ONLINE_LLM_MODEL  If you use an online model, update this configuration accordingly.

server_config.py

# WEBUI_SERVER.api_base_url   If you deploy the project on a server, update this parameter to match the deployment.

Initialize the Knowledge Base

Once the configuration files have been copied and edited, initialize the knowledge base with the following command:

$ python init_database.py --recreate-vs

One-Click Start

Start the project with the following command:

$ python startup.py -a
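
The order matters: the knowledge base is initialized first, then the services are started. This is a minimal sketch that chains the two commands shown above, assuming the configuration files have already been edited and that both scripts are run from the repository root; `build_commands` and `run_all` are hypothetical helper names and are not part of ChatBase.

```python
import subprocess
import sys

# The two commands from the steps above, in order.
STEPS = [
    ["init_database.py", "--recreate-vs"],
    ["startup.py", "-a"],
]


def build_commands(python=sys.executable):
    """Prefix each script with the interpreter that should run it."""
    return [[python, *step] for step in STEPS]


def run_all(python=sys.executable):
    """Run each step in turn; check=True raises if a step exits non-zero."""
    for command in build_commands(python):
        print("Running:", " ".join(command))
        subprocess.run(command, check=True)
```

Calling `run_all()` from the repository root stops at the first failing step, so the services are not started if initialization fails. The last step blocks for as long as the services are running.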

Example of the startup interface

If it starts successfully, you will see the following interface:

RAG Dialogue Page

Database Dialogue Page:

Database Dialogue Start Page:

Database Dialogue History Page:

Multi-file Linked Dialogue Page:

Knowledge Base Page

Knowledge Base Management Page:

Knowledge Base Details Page:

⏱ Todo

Data-Driven workflow orchestration
ES Service

📒 Citation

Feel free to cite us if you like this project.

@misc{zhao2024llmdbdemo,
      title={Chat2Data: An Interactive Data Analysis System with RAG, Vector Databases and LLMs}, 
      author={Xinyang Zhao and Xuanhe Zhou and Guoliang Li},
      year={2024},
      journal={Proc. {VLDB} Endow.},
}
