MyScaleDB

A ClickHouse fork that supports high-performance vector search and full-text search.

Enable every developer to build production-grade GenAI applications with powerful and familiar SQL.


What is MyScaleDB?

MyScaleDB is the SQL vector database that enables developers to build production-ready and scalable AI applications using familiar SQL. It is built on top of ClickHouse and optimized for AI applications and solutions, allowing developers to effectively manage and process massive volumes of data.

Key benefits of using MyScaleDB include:

- Fully SQL-Compatible
  - Fast, powerful, and efficient vector search, filtered search, and SQL-vector join queries.
  - Use SQL with vector-related functions to interact with MyScaleDB. No need to learn complex new tools or frameworks – stick with what you know and love.
- Production-Ready for AI applications
  - A unified and time-tested platform to manage and process structured data, text, vector, JSON, geospatial, time-series data, and more. See supported data types and functions.
  - Improved RAG accuracy by combining vectors with rich metadata and full-text search, and by performing high-precision, high-efficiency filtered search at any ratio[^1].
- Unmatched performance and scalability
  - MyScaleDB leverages cutting-edge OLAP database architecture and advanced vector algorithms for lightning-fast vector operations.
  - Scale your applications effortlessly and cost-effectively as your data grows.

[^1]: See why metadata filtering is crucial for improving RAG accuracy here.

MyScale Cloud provides fully managed MyScaleDB with premium features on billion-scale data[^2]. Compared with specialized vector databases that use custom APIs, MyScale is more powerful, performant, and cost-effective while remaining simpler to use. This makes it suitable for a large community of programmers. Additionally, compared with integrated vector databases such as PostgreSQL with pgvector or Elasticsearch with vector extensions, MyScale consumes fewer resources and achieves better accuracy and speed for joint structured and vector queries, such as filtered searches.

[^2]: The MSTG (Multi-scale Tree Graph) algorithm is provided through MyScale Cloud, achieving high data density with disk-based storage and better indexing & search performance on billion-scale vector data.

Why MyScaleDB

- Fully SQL compatible
- Unified structured and vectorized data management
- Millisecond search on billion-scale vectors
- Highly reliable & linearly scalable
- Powerful text-search and text/vector hybrid search functions
- Complex SQL vector queries
- LLM observability with MyScale Telemetry

MyScale unifies three systems into one in a highly efficient manner: a SQL database/data warehouse, a vector database, and a full-text search engine. This not only saves infrastructure and maintenance costs but also enables joint data queries and analytics.

See our documentation and blogs for more about MyScale’s unique features and advantages. Our open-source benchmark provides a detailed comparison with other vector database products.

Why build MyScaleDB on top of ClickHouse?

ClickHouse is a popular open-source analytical database that excels at big data processing and analytics due to its columnar storage with advanced compression, skip indexing, and SIMD processing. Unlike transactional databases such as PostgreSQL and MySQL, which use row storage and are mainly optimized for transactional processing, ClickHouse has significantly faster analytical and data scanning speeds.

One of the key operations in combining structured and vector search is filtered search, which involves filtering by other attributes first and then performing vector search on the remaining data. Columnar storage and pre-filtering are crucial for ensuring high accuracy and high performance in filtered search, which is why we chose to build MyScaleDB on top of ClickHouse.
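The effect of filtering before the vector search can be illustrated with a small, self-contained sketch. It uses brute-force L2 distance over a handful of in-memory rows, not MyScaleDB's vector index or SQL interface; the row layout, the function names, and the `category` attribute are all hypothetical and chosen only for illustration.

```python
import math


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pre_filtered_search(rows, query, predicate, k):
    """Filter on metadata first, then rank only the surviving rows by distance."""
    candidates = [r for r in rows if predicate(r)]
    return sorted(candidates, key=lambda r: l2(r["vector"], query))[:k]


def post_filtered_search(rows, query, predicate, k):
    """Rank all rows first, keep the top k, and only then apply the filter."""
    top_k = sorted(rows, key=lambda r: l2(r["vector"], query))[:k]
    return [r for r in top_k if predicate(r)]


# Hypothetical data: the two rows closest to the query do not match the filter.
rows = [
    {"id": 1, "category": "news", "vector": [0.0, 0.1]},
    {"id": 2, "category": "news", "vector": [0.1, 0.0]},
    {"id": 3, "category": "blog", "vector": [0.9, 0.9]},
    {"id": 4, "category": "blog", "vector": [1.0, 1.0]},
]
query = [0.0, 0.0]


def is_blog(row):
    return row["category"] == "blog"


pre = [r["id"] for r in pre_filtered_search(rows, query, is_blog, k=2)]
post = [r["id"] for r in post_filtered_search(rows, query, is_blog, k=2)]
print(pre)   # [3, 4]
print(post)  # []
```

With pre-filtering, the search returns the two nearest rows that satisfy the filter. With post-filtering, the top-k slots are used up by rows that the filter later discards, so matching rows are lost. This is the accuracy problem that pre-filtering avoids when the filter is selective.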

While we have modified ClickHouse's execution and storage engine in many ways to ensure fast and cost-effective SQL vector queries, many of the features (#37893, #38048, #37859, #56728, #58223) related to general SQL processing have been contributed back to the ClickHouse open source community.

Quick Start

MyScale Cloud

The simplest way to use MyScaleDB is to create an instance on the MyScale Cloud service. You can start with a free pod that supports 5 million 768-dimensional vectors. Sign up here and check out the MyScaleDB QuickStart for more instructions.

Self-Hosted

Using MyScaleDB Docker Image

To quickly get a MyScaleDB instance up and running, simply pull and run the Docker image (version 1.8.0 in this example):

docker run --name myscaledb --net=host myscale/myscaledb:1.8.0

Note: MyScaleDB's default configuration only allows access from localhost. When starting the container with docker run, you need to specify --net=host so that the services running in the container can be reached from the current node.

This will start a MyScaleDB instance with default user default and no password. You can then connect to the database using clickhouse-client:

docker exec -it myscaledb clickhouse-client
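As an alternative to clickhouse-client, a query can also be sent over HTTP. The sketch below assumes that MyScaleDB keeps ClickHouse's standard HTTP interface enabled on port 8123 and that the default user has no password, as described above. The helper names `build_query_url` and `run_query` are hypothetical and not part of any MyScaleDB client library.

```python
import urllib.parse
import urllib.request


def build_query_url(query, host="localhost", port=8123):
    """Build a ClickHouse-style HTTP URL that carries the SQL in the `query` parameter."""
    return f"http://{host}:{port}/?" + urllib.parse.urlencode({"query": query})


def run_query(query, host="localhost", port=8123, timeout=5.0):
    """Send a read-only query as the default user and return the response body as text."""
    url = build_query_url(query, host, port)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the container from the docker run command running on the same machine, calling `run_query("SELECT 1")` should return the query result as text. Only `build_query_url` can be exercised without a running server.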

Using Docker Compose

Use the following recommended directory structure, which also shows where to place the docker-compose.yaml file:

> tree myscaledb
myscaledb
├── docker-compose.yaml
└── volumes
    └── config
        └── users.d
            └── custom_users_config.xml

3 directories, 2 files

Define the configuration for your deployment. We recommend starting with the following configuration in your docker-compose.yaml file, which you can adjust based on your specific requirements:

version: '3.7'

services:
  myscaledb:
    image: myscale/myscaledb:1.8.0
    tty: true
    ports:
      - '8123:8123'
      - '9000:9000'
      - '8998:8998'
      - '9363:9363'
      - '9116:9116'
    networks:
      myscaledb_network:
        ipv4_address: 10.0.0.2
    volumes:
      - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/data:/var/lib/clickhouse
      - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/log:/var/log/clickhouse-server
      - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/config/users.d/custom_users_config.xml:/etc/clickhouse-server/users.d/custom_users_config.xml
    deploy:
      resources:
        limits:
          cpus: "16.00"
          memory: 32Gb
networks:
  myscaledb_network:
    driver: bridge
    ipam:
      driver: default
      config:
        - subnet: 10.0.0.0/24

custom_users_config.xml:

<clickhouse>
  <users>
      <default>
          <password></password>
          <networks>
              <ip>::1</ip>
              <ip>127.0.0.1</ip>
              <ip>10.0.0.0/24</ip>
          </networks>
          <profile>default</profile>
          <quota>default</quota>
          <access_management>1</access_management>
      </default>
  </users>
</clickhouse>
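To see which client addresses the networks entries above admit, the following sketch checks an address against the same three entries using Python's standard ipaddress module. It only mimics the allowlist semantics for illustration and is not the server's actual matching code; `ALLOWED_NETWORKS` and `is_allowed` are hypothetical names.

```python
import ipaddress

# Entries copied from the <networks> element of custom_users_config.xml.
ALLOWED_NETWORKS = ["::1", "127.0.0.1", "10.0.0.0/24"]


def is_allowed(client_ip, allowed=ALLOWED_NETWORKS):
    """Return True if client_ip falls inside any of the allowed addresses or subnets."""
    addr = ipaddress.ip_address(client_ip)
    # A bare address such as 127.0.0.1 is treated as a single-host network.
    return any(addr in ipaddress.ip_network(entry) for entry in allowed)


print(is_allowed("127.0.0.1"))    # True: local IPv4 loopback
print(is_allowed("10.0.0.17"))    # True: inside the 10.0.0.0/24 Docker subnet
print(is_allowed("192.168.1.5"))  # False: another node on the LAN
```

Under this configuration, a client on another node (such as 192.168.1.5 in the example) would not be able to log in as the default user.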

Note: The custom_users_config configuration lets you use the default user to access the database from the node where the database service is deployed with Docker Compose. To access the database service from other nodes, we recommend creating a user that can connect from other IP addresses. For detailed settings, see MyScaleDB Create User.

You can also customize the MyScaleDB configuration file. Copy the /etc/clickhouse-server directory from your myscaledb container to your local drive, modify the configuration, and add a directory mapping to the docker-compose.yaml file to make the configuration take effect.
