# SEIMEI

Search-Enhanced Interface for Multi-Expertise Integration (SEIMEI): Realtime-Knowledge-Update AI System with Intelligent Search
<a id="readme-top"></a>
<!-- PROJECT SHIELDS --> <!-- *** I'm using markdown "reference style" links for readability. *** Reference links are enclosed in brackets [ ] instead of parentheses ( ). *** See the bottom of this document for the declaration of the reference variables *** for contributors-url, forks-url, etc. This is an optional, concise syntax you may use. *** https://www.markdownguide.org/basic-syntax/#reference-style-links -->[![Paper][paper-shield]][paper-url] [![Document][document-shield]][document-url] [![Contributors][contributors-shield]][contributors-url] [![Forks][forks-shield]][forks-url] [![Stargazers][stars-shield]][stars-url] [![Issues][issues-shield]][issues-url] [![MIT License][license-shield]][license-url] [![LinkedIn][linkedin-shield]][linkedin-url]
<!-- PROJECT LOGO --> <br /> <div align="center"> <a href="https://kyotoai.org"> <img src="images/seimei_architecture.png" alt="Logo" width="640" height="360"> </a> <h3 align="center">SEIMEI</h3> <p align="center"> <strong>S</strong>earch-<strong>E</strong>nhanced <strong>I</strong>nterface for <strong>M</strong>ulti-<strong>E</strong>xpertise <strong>I</strong>ntegration </p> <p align="center"> Unlike conventional RL that only optimizes knowledge inside the LLM, SEIMEI jointly optimizes external knowledge, enabling AI to truly absorb domain-specific and tacit expertise. Build much more personalized AI trained only for you with dramatically lower cost and higher adaptability!! <br /> <a href="https://github.com/kyotoai/SEIMEI/tree/main/demo"><strong>Explore the docs »</strong></a> <br /> <br /> <a href="https://github.com/kyotoai/SEIMEI/tree/main/demo">View Demo</a> · <a href="https://github.com/kyotoai/SEIMEI/issues/new?labels=bug&template=bug-report---.md">Report Bug</a> · <a href="https://github.com/kyotoai/SEIMEI/issues/new?labels=enhancement&template=feature-request---.md">Request Feature</a> </p> </div> <!-- TABLE OF CONTENTS --> <details> <summary>Table of Contents</summary> <ol> <li> <a href="#about-the-project">About The Project</a> </li> <li> <a href="#quick-start">Quick Start</a> <ul> <li><a href="#installation">Installation</a></li> <li><a href="#set-api-key">Set API key</a></li> <li><a href="#run-seimei">Run SEIMEI</a></li> </ul> </li> <li> <a href="#usage-a-integrate-your-own-knowledge">Usage A. Integrate your own knowledge</a> <ul> <li><a href="#1-prepare-your-knowledge-file">1. Prepare your knowledge file</a></li> <li><a href="#2-run-seimei-with-knowledge-loading">2. Run SEIMEI with knowledge loading</a></li> <li><a href="#3-automatic-knowledge-accumulation-optional">3. Automatic knowledge accumulation (optional)</a></li> </ul> </li> <li> <a href="#usage-b-train-reward-model-to-optimize-knowledge">Usage B. 
Train Reward Model To Optimize Knowledge</a> <ul> <li><a href="#1-inferences-sampling">1. Inferences sampling</a></li> <li><a href="#2-data-conversion">2. Data conversion</a></li> <li><a href="#3-train-reward-model">3. Train reward model</a></li> <li><a href="#4-evaluate-your-model">4. Evaluate your model</a></li> </ul> </li> <li> <a href="#usage-c-cli-chat">Usage C. CLI Chat</a> </li> <li><a href="#contributing">Contributing</a></li> <li><a href="#license">License</a></li> <li><a href="#contact">Contact</a></li> <li><a href="#acknowledgments">Acknowledgments</a></li> </ol> </details>

<!-- ABOUT THE PROJECT -->
## About The Project
### Search The Best Knowledge For Accurate Thought
<!-- [![Product Name Screen Shot][product-screenshot]](https://example.com) --> <br /> <div align="center"> <img src="images/seimei_idea.png" alt="seimei" width="640" height="400"> </div> <br /> Here's an example of how SEIMEI works: each agent interacts with the LLM and the documents and makes inferences, and the search engine automatically integrates these inferences into an answer to the question. <p align="right">(<a href="#readme-top">back to top</a>)</p> <!-- ### The Most Intelligent Search Model <div align="center"> <img src="images/Comparison.png" alt="seimei" width="400" height="360"> </div> Reward model performs better than semantic embedding model(so called vector search). The graph above shows the result of training reward model (3b) and e5-mistral-7b model to search best knowledge. While the vector search model cannot really retrieve best knowledge (because problems and knowledge texts are not similar as sentences), our proprietary search model can learn what knowledge are needed to solve a question and retrieve the best ones. <a href="https://github.com/kyotoai/SEIMEI/tree/main/demo"><strong>See more details »</strong></a> <p align="right">(<a href="#readme-top">back to top</a>)</p> ### Improves Strong Models <div align="center"> <img src="images/Improvement.png" alt="seimei" width="500" height="360"> </div> We achieved an improvement of bigcodebench/deepseek-r1 by our search engine!!
<a href="https://github.com/kyotoai/SEIMEI/tree/main/demo"><strong>See more details »</strong></a> <p align="right">(<a href="#readme-top">back to top</a>)</p> ### Built With * [![vLLM][vllm.ai]][vllm-url] * [![Hugging Face][huggingface.co]][huggingface-url] * [OpenAI](https://platform.openai.com/docs/overview) * [![Next][Next.js]][Next-url] * [![React][React.js]][React-url] * [![Vue][Vue.js]][Vue-url] * [![Angular][Angular.io]][Angular-url] * [![Svelte][Svelte.dev]][Svelte-url] * [![Laravel][Laravel.com]][Laravel-url] * [![Bootstrap][Bootstrap.com]][Bootstrap-url] * [![JQuery][JQuery.com]][JQuery-url] <p align="right">(<a href="#readme-top">back to top</a>)</p> -->

<!-- Quick Start -->
## Quick Start
### Installation

You can install SEIMEI by cloning the repository and installing it with pip:

```sh
git clone https://github.com/kyotoai/SEIMEI.git
pip install -e SEIMEI/
```
### Set API key

- Get a KyotoAI API key from https://kyotoai.net
- Export it in your shell:

```sh
export KYOTOAI_API_KEY="(your_kyotoai_api_key)"
```
### Run SEIMEI

#### In the CLI app

Open the seimei terminal app inside your project directory:

```sh
seimei
```

and start asking questions.
#### In Python code

```python
import asyncio

from seimei import seimei


async def demo_code_act():
    # Configure the orchestrator with an LLM backend and a per-question token budget.
    orchestrator = seimei(
        llm_config={"model": "gpt-5-nano"},
        max_tokens_per_question=30000,
    )
    result = await orchestrator(
        messages=[
            {"role": "user", "content": "Analyze the current directory and change."},
        ],
    )


asyncio.run(demo_code_act())
```
<p align="right">(<a href="#readme-top">back to top</a>)</p>
<!-- USAGE EXAMPLES -->
## Usage A. Integrate your own knowledge
### Overview

- **Prepare a knowledge file.**
  Create a CSV with reusable hints for each agent (`think`, `code_act`, `answer`, `web_search`, or `*` for all agents).
  This becomes your portable memory layer that can be reused across runs.
- **Run SEIMEI with knowledge loading rules.**
  Pass `knowledge_load_config` to load CSV/JSON/JSONL files and to inject inline, step-specific hints.
  This lets you control both what knowledge is injected and when it is used.
- **(Optional) Accumulate new knowledge automatically.**
  Enable `knowledge_generate_config` to append run retrospectives to your CSV after each run.
  The newly generated rows are returned in the response and are immediately reusable.
#### 1. Prepare your knowledge file

Create `seimei_knowledge/knowledge.csv` (minimum columns: `agent`, `knowledge`):

```csv
agent,knowledge,tags,step,id
code_act,"Prefer rg before grep when scanning large repos","[\"search\",\"shell\"]",,101
think,"Before choosing next action, summarize the last 2 agent findings in one sentence","[\"planning\"]",">=2",102
answer,"End with a short numbered next-step list when uncertainty remains","[\"response\"]",,103
*,"Always verify file paths before proposing edits","[\"safety\"]",,104
```
- `agent`: target agent name (`*` means all agents).
- `knowledge`: guidance text injected into that agent.
- `tags` (optional): JSON list or comma-separated string.
- `step` (optional): step constraint such as `2`, `>=2`, `<4`, or `>=1,<=3`.
- `id` (optional): stable identifier for tracking and updates.
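The `step` constraint above can be read as a comma-separated list of comparisons that must all hold for the current step. As an illustration only (this is not SEIMEI's internal parser), a constraint like `>=1,<=3` could be checked as:

```python
import operator
import re

# Comparison prefixes mapped to operators; a bare number means equality.
_OPS = {">=": operator.ge, "<=": operator.le, ">": operator.gt, "<": operator.lt}


def step_matches(constraint: str, step: int) -> bool:
    """Return True if `step` satisfies every clause of a constraint like ">=1,<=3"."""
    for clause in constraint.split(","):
        m = re.match(r"(>=|<=|>|<)?\s*(\d+)$", clause.strip())
        if not m:
            raise ValueError(f"Bad step clause: {clause!r}")
        op = _OPS.get(m.group(1), operator.eq)
        if not op(step, int(m.group(2))):
            return False
    return True
```

For example, `step_matches(">=2", 1)` is `False`, so a row with `step` set to `>=2` would only be injected from the second step onward.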
You can also bootstrap entries with the built-in generator:
```sh
python3 -m seimei.knowledge.generate_from_generators \
  --count 25 \
  --output seimei_knowledge/knowledge.csv
```
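If you prefer to seed or extend the file by hand, Python's standard `csv` module is enough. A minimal sketch using the column names from the example above (the row content here is invented for illustration):

```python
import csv
from pathlib import Path

path = Path("knowledge.csv")  # use seimei_knowledge/knowledge.csv in a real project
fieldnames = ["agent", "knowledge", "tags", "step", "id"]

# Hypothetical new entry; tags are stored as a JSON-style list string.
new_rows = [
    {"agent": "code_act", "knowledge": "Run tests before committing",
     "tags": '["safety"]', "step": "", "id": 105},
]

# Append rows, writing the header only when the file does not exist yet.
write_header = not path.exists()
with path.open("a", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    if write_header:
        writer.writeheader()
    writer.writerows(new_rows)
```

Appending (rather than rewriting) keeps previously accumulated rows intact, which matters once automatic knowledge generation starts adding to the same file.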
#### 2. Run SEIMEI with knowledge loading
```python
import asyncio

from seimei import seimei


async def main():
    orchestrator = seimei(
        llm_config={"model": "gpt-5-nano"},
        allow_code_exec=True,
        max_tokens_per_question=30000,
    )
    result = await orchestrator(
        messages=[
            {"role": "user", "content": "Inspect this repo and suggest a safe cleanup plan."},
        ],
        knowledge_load_config=[
            # Load the reusable knowledge base prepared in step 1.
            {"load_knowledge_path": "seimei_knowledge/knowledge.csv"},
            # Inline hint for the code_act agent, active only on steps 1-2.
            {
                "step": [1, 2],
                "agent": "code_act",
                "text": "Run read-only commands first (pwd, ls, rg) before any edits.",
                "tags": ["safety", "planning"],
            },
            # Inline hint for the think and answer agents, active on step 3.
            {
                "step": 3,
                "agent": ["think", "answer"],
                "text": "Explicitly list unresolved uncertainties before finalizing.",
                "tags": ["quality"],
            },
        ],
    )
    print(result["output"])


asyncio.run(main())
```
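Conceptually, each entry in `knowledge_load_config` is filtered by agent and step before its text is injected. A simplified selection function, written here only to illustrate the semantics (it is not SEIMEI's actual implementation), might look like:

```python
def select_knowledge(entries, agent, step):
    """Return the hint texts that apply to `agent` at `step` (illustrative only)."""
    selected = []
    for e in entries:
        # Agent filter: a string or list of names; "*" matches every agent.
        agents = e.get("agent", "*")
        agents = [agents] if isinstance(agents, str) else agents
        if "*" not in agents and agent not in agents:
            continue
        # Step filter: an int or list of ints; absent means every step.
        steps = e.get("step")
        if steps is not None:
            steps = [steps] if isinstance(steps, int) else steps
            if step not in steps:
                continue
        selected.append(e["text"])
    return selected
```

With the two inline entries from the snippet above, `select_knowledge(entries, "code_act", 1)` would return only the read-only-commands hint, while at step 3 the `think` and `answer` agents would receive the uncertainty hint instead.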
#### 3. Automatic knowledge accumulation (optional)
Provide `knowledge_generate_config` when calling the orchestrator to append run retrospectives to a CSV knowledge base:
result = await orchestrator(
messages=[{"role": "user", "content": "Find clever ways to speed up our ETL pipeline."}],
knowledge_generate_config={
