<div align="center"> <picture> <source media="(prefers-color-scheme: dark)" srcset="https://mdn.alipayobjects.com/huamei_ytl0i7/afts/img/A*pKqtRILxGioAAAAAQLAAAAgAejCYAQ/original" width="420"> <source media="(prefers-color-scheme: light)" srcset="https://mdn.alipayobjects.com/huamei_ytl0i7/afts/img/A*6BO4Q6D78GQAAAAAQFAAAAgAejCYAQ/original" width="420"> <img alt="Illustration" src="light-mode.png"> </picture>

🔷 The AI-Native Search Database

Unifies vector, text, structured and semi-structured data in a single engine, enabling hybrid search and in-database AI workflows.

</div>
<div align="center"> <p> <a href="https://oceanbase.ai"> <img alt="Documentation" height="20" src="https://img.shields.io/badge/OceanBase.ai-4285F4?style=for-the-badge&logo=read-the-docs&logoColor=white" /> </a> <a href="https://www.linkedin.com/company/oceanbase" target="_blank"> <img src="https://custom-icon-badges.demolab.com/badge/LinkedIn-0A66C2?logo=linkedin-white&logoColor=fff" alt="follow on LinkedIn"> </a> <a href="https://www.youtube.com/@OceanBaseDB"> <img alt="Static Badge" src="https://img.shields.io/badge/YouTube-red?logo=youtube"> </a> <a href="https://deepwiki.com/oceanbase/seekdb"> <img alt="Ask DeepWiki" src="https://deepwiki.com/badge.svg" /> </a> <a href="https://discord.gg/74cF8vbNEs"> <img alt="Join Discord" src="https://img.shields.io/badge/Discord-Join%20Chat-5865F2?logo=discord&style=flat-square" /> </a> <a href="https://pepy.tech/projects/pylibseekdb"> <img height="20" alt="Downloads" src="https://static.pepy.tech/badge/pylibseekdb" /> </a> <a href="https://github.com/oceanbase/seekdb/blob/master/LICENSE"> <img alt="License" src="https://img.shields.io/badge/License-Apache_2.0-blue.svg" /> </a> </p> </div> <div align="center">

English | 中文版 (Chinese)


</div>

🚀 What is OceanBase seekdb?

OceanBase seekdb is an AI-native search database that unifies relational, vector, text, JSON and GIS in a single engine, enabling hybrid search and in-database AI workflows.


🔥 Why OceanBase seekdb?

| Feature | seekdb | OceanBase | Chroma | Milvus | MySQL 9.0 | PostgreSQL<br/>+pgvector | DuckDB | Elasticsearch |
| ------------------------ |:--------------------:|:-------------:|:----------:|:----------:|:-----------------------:|:----------------------------:|:----------:|:-----------------------------------:|
| Embedded | ✅ | ❌ | ✅ | ✅ | ❌<sup>[1]</sup> | ❌ | ✅ | ❌ |
| Single-Node | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Distributed | ❌ | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ |
| MySQL Compatible | ✅ | ✅ | ❌ | ❌ | ✅ | ❌ | ✅ | ❌ |
| Vector Search | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ |
| Full-Text Search | ✅ | ✅ | ✅ | ⚠️ | ✅ | ✅ | ✅ | ✅ |
| Hybrid Search | ✅ | ✅ | ✅ | ✅ | ❌ | ⚠️ | ❌ | ✅ |
| OLTP | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ |
| OLAP | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ | ⚠️ |
| License | Apache 2.0 | MulanPubL 2.0 | Apache 2.0 | Apache 2.0 | GPL 2.0 | PostgreSQL License | MIT | AGPLv3<br/>+SSPLv1<br/>+Elastic 2.0 |

[1] The embedded capability was removed in MySQL 8.0.

  • ✅ Supported
  • ❌ Not Supported
  • ⚠️ Limited

✨ Key Features

Build fast + Hybrid search + Multi model

  1. Build fast: Go from prototype to production in minutes. Create AI apps using Python and run VectorDBBench on a 1-core, 2 GB (1C2G) machine.
  2. Hybrid Search: Combine vector search, full-text search and relational query in a single statement.
  3. Multi-Model: Support relational, vector, text, JSON and GIS in a single engine.
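
To make the idea of hybrid search concrete, here is a conceptual sketch in plain Python. It is not seekdb's API or its ranking algorithm: seekdb performs this inside the engine in a single statement, whereas the sketch filters rows by a structured field, ranks the survivors by vector similarity and by keyword overlap, and fuses the two rankings with reciprocal rank fusion (a fusion method chosen here purely for illustration). The function names, the `ROWS` data, and the two-dimensional toy vectors are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(query, text):
    """Number of distinct query words that also appear in the text."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def rrf(rankings, k=60):
    """Reciprocal rank fusion: sum 1 / (k + rank) over all rankings."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def hybrid_search(rows, query_text, query_vec, category, top_k=2):
    # Relational part: keep only rows matching the structured filter.
    candidates = [r for r in rows if r["category"] == category]
    # Vector part: rank candidates by similarity to the query vector.
    by_vector = [r["id"] for r in
                 sorted(candidates, key=lambda r: -cosine(query_vec, r["vec"]))]
    # Full-text part: rank candidates by keyword overlap.
    by_text = [r["id"] for r in
               sorted(candidates, key=lambda r: -keyword_score(query_text, r["text"]))]
    return rrf([by_vector, by_text])[:top_k]


# Toy data: the vectors are invented, not produced by a real embedding model.
ROWS = [
    {"id": "id1", "category": "AI", "vec": [0.9, 0.1],
     "text": "machine learning is a subset of artificial intelligence"},
    {"id": "id2", "category": "Programming", "vec": [0.1, 0.9],
     "text": "python is a popular programming language"},
    {"id": "id4", "category": "AI", "vec": [0.7, 0.3],
     "text": "neural networks are inspired by the human brain"},
]

print(hybrid_search(ROWS, "machine learning", [1.0, 0.0], "AI"))
```

In this toy run the row in the "Programming" category is excluded by the filter before any ranking happens, which is the main point of combining a relational predicate with the two search signals.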

AI inside + SQL inside

  1. AI Inside: Run embedding, reranking, LLM inference and prompt management inside the database, supporting a complete document-in/data-out RAG workflow.
  2. SQL Inside: Powered by the proven OceanBase engine, delivering real-time writes and queries with full ACID compliance, and seamless MySQL ecosystem compatibility.

🎬 Quick Start

Installation

Choose your platform:

<details> <summary><b>🐍 Python (Recommended for AI/ML)</b></summary>
pip install -U pyseekdb
</details> <details> <summary><b>🐳 Docker (Quick Testing)</b></summary>
docker run -d \
  --name seekdb \
  -p 2881:2881 \
  -p 2886:2886 \
  -v ./data:/var/lib/oceanbase \
  oceanbase/seekdb:latest

See the documentation for this Docker image for details.
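
After starting the container, a client can only connect once the mapped port accepts connections. The following is a minimal sketch of a readiness check using only the Python standard library. `wait_for_port` is a hypothetical helper, not part of pyseekdb or the Docker image, and an open TCP port does not by itself guarantee that the database has finished initializing.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example (assumes the container above is running with -p 2881:2881):
# if wait_for_port("127.0.0.1", 2881):
#     print("port 2881 is accepting connections")
```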

</details> <details> <summary><b>📦 Binary (Standalone)</b></summary>
# Linux
rpm -ivh seekdb-1.x.x.x-xxxxxxx.el8.x86_64.rpm

Please replace the version number with the actual RPM package version.

</details>

🎯 AI Search Example

Build a semantic search system in 5 minutes:

<details> <summary><b>🗄️ 🐍 Python SDK</b></summary>
# Install the SDK first (run in a shell): pip install -U pyseekdb
"""
This example demonstrates the most common operations with embedding functions:
1. Create a client connection
2. Create a collection with embedding function
3. Add data using documents (embeddings auto-generated)
4. Query using query texts (embeddings auto-generated)
5. Print query results

This is a minimal example to get you started quickly with embedding functions.
"""

import pyseekdb
from pyseekdb import DefaultEmbeddingFunction

# ==================== Step 1: Create Client Connection ====================
# You can use embedded mode, server mode, or OceanBase mode
# For this example, we'll use embedded mode (server and OceanBase alternatives are commented out below)

# Embedded mode (local seekdb)
client = pyseekdb.Client(
    path="./seekdb.db",
    database="test"
)
# Alternative: Server mode (connecting to a remote seekdb server)
# client = pyseekdb.Client(
#     host="127.0.0.1",
#     port=2881,
#     database="test",
#     user="root",
#     password=""
# )

# Alternative: Remote server mode (OceanBase Server)
# client = pyseekdb.Client(
#     host="127.0.0.1",
#     port=2881,
#     tenant="test",  # OceanBase default tenant
#     database="test",
#     user="root",
#     password=""
# )

# ==================== Step 2: Create a Collection with Embedding Function ====================
# A collection is like a table that stores documents with vector embeddings
collection_name = "my_simple_collection"

# Create collection with default embedding function
# The embedding function will automatically convert documents to embeddings
collection = client.create_collection(
    name=collection_name,
    #embedding_function=DefaultEmbeddingFunction()  # Uses default model (384 dimensions)
)

print(f"Created collection '{collection_name}' with dimension: {collection.dimension}")
print(f"Embedding function: {collection.embedding_function}")

# ==================== Step 3: Add Data to Collection ====================
# With embedding function, you can add documents directly without providing embeddings
# The embedding function will automatically generate embeddings from documents

documents = [
    "Machine learning is a subset of artificial intelligence",
    "Python is a popular programming language",
    "Vector databases enable semantic search",
    "Neural networks are inspired by the human brain",
    "Natural language processing helps computers understand text"
]

ids = ["id1", "id2", "id3", "id4", "id5"]

# Add data with documents only - embeddings will be auto-generated by embedding function
collection.add(
    ids=ids,
    documents=documents,  # embeddings will be automatically generated
    metadatas=[
        {"category": "AI", "index": 0},
        {"category": "Programming", "index": 1},
        {"category": "Database", "index": 2},
        {"category": "AI", "index": 3},
        {"category": "NLP", "index": 4}
    ]
)

print(f"\nAdded {len(documents)} documents to collection")
print("Note: Embeddings were automatically generated from documents using the embedding function")

# ==================== Step 4: Query the Collection ====================
# With embedding function, you can query using text directly
# The embedding function will automatically convert query text to query vector

# Query using text - query vector will be auto-generated by embedding function
query_text = "artificial intelligence and machine learning"

results = collection.query(
    query_texts=query_text,  # Query text - will be embedded automatically
    n_results=3  # Return top 3 most similar documents
)

print(f"\nQuery: '{query_text}'")
print(f"Query results: {len(results['ids'][0])} items found")

# ==================== Step 5
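
Step 5 prints the query results. Below is a minimal sketch of such a printing helper. It assumes the result is a dict whose values hold one inner list per query; the source confirms only that `results['ids'][0]` is the list of ids for the first query. The keys `documents`, `metadatas`, and `distances`, the helper `format_results`, and the `sample` data are assumptions for illustration, not confirmed pyseekdb output.

```python
def format_results(results, query_index=0):
    """Return one printable line per hit for a single query."""
    ids = results["ids"][query_index]

    def column(key):
        # Optional keys: fall back to None for each hit when a key is absent.
        values = results.get(key)
        return values[query_index] if values else [None] * len(ids)

    lines = []
    hits = zip(ids, column("documents"), column("metadatas"), column("distances"))
    for rank, (doc_id, doc, meta, dist) in enumerate(hits, start=1):
        parts = [f"{rank}. id={doc_id}"]
        if dist is not None:
            parts.append(f"distance={dist:.4f}")
        if doc is not None:
            parts.append(f"document={doc!r}")
        if meta is not None:
            parts.append(f"metadata={meta}")
        lines.append(" | ".join(parts))
    return lines


# Hand-written stand-in for a query result; the distance values are made up.
sample = {
    "ids": [["id1", "id4"]],
    "documents": [[
        "Machine learning is a subset of artificial intelligence",
        "Neural networks are inspired by the human brain",
    ]],
    "metadatas": [[{"category": "AI", "index": 0}, {"category": "AI", "index": 3}]],
    "distances": [[0.1234, 0.5678]],
}

for line in format_results(sample):
    print(line)
```

With a live collection, the same helper would be called as `format_results(results)` on the value returned by `collection.query(...)`, provided the result has the shape assumed above.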