# UniScientist

<div align="center"> <picture> <img src="./assets/uniscientist.png" width="30%"> </picture> </div>

<p align="center"><b>Advancing Universal Scientific Research Intelligence via Evolving Polymathic Synthesis</b></p>
## News
- [2026-03-14] UniScientist-30B-A3B is now available for download on HuggingFace and ModelScope.
- [2026-03-11] We release the full inference trajectories of UniScientist on the FrontierScience-Research benchmark. Check the `trajectory/` folder for details.
UniScientist advances universal scientific research intelligence through a unified paradigm. By reassigning LLMs as cross-disciplinary generators and human experts as high-precision verifiers, it produces research-grade data spanning 50+ scientific disciplines with structured, rubric-based supervision. A 30B-parameter model trained on this data achieves highly competitive performance across five research benchmarks. We recommend reading the blog post first for an overall picture.
<div align="center"> <picture> <img src="./assets/benchmark.png" width="85%"> </picture> </div>

## Overview
UniScientist formalizes open-ended scientific research as Active Evidence Integration and Model Abduction, and proposes the Evolving Polymathic Synthesis paradigm for synthesizing high-quality research problems and evaluation rubrics at scale.
The approach comprises three key components:
- Evolving Polymathic Synthesis — A human-LLM collaborative data paradigm that generates research-grade scientific problems across 50+ disciplines, each accompanied by co-evolved rubrics refined through completeness, consistency, and distinguishability checks.
- Agentic Research Loop — The model conducts scientific research by iteratively acquiring evidence, deriving formally justified results, and updating hypotheses via abductive inference, using tools including `web_search`, `google_scholar`, `page_fetching`, and `code_interpreter`.
- Report Aggregation — Given multiple candidate research reports, the model learns to synthesize a consolidated report integrating the best elements, enabling research quality to self-evolve over time.
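The agentic research loop above can be sketched as a tool-dispatch cycle. The following is a minimal illustration with stubbed tools, not the repository's actual implementation (see `inference_local_qwen.py`): the `research_loop` function, its deterministic tool-selection policy, and the stub tools are all our own hypothetical simplifications.

```python
# Minimal sketch of an agentic research loop with stubbed tools.
# A real agent would let the LLM choose the next tool and arguments;
# here the policy is deterministic purely for illustration.

def web_search(query: str) -> str:
    # Stub standing in for the real Serper-backed search tool.
    return f"[stub] search results for: {query}"

def code_interpreter(code: str) -> str:
    # Stub standing in for the real Python code interpreter tool.
    return f"[stub] executed: {code}"

TOOLS = {"web_search": web_search, "code_interpreter": code_interpreter}

def research_loop(problem: str, max_turns: int = 3) -> str:
    """Iteratively gather evidence, then emit a final report."""
    evidence = []
    for turn in range(max_turns):
        # Alternate tools deterministically; a real loop would pick
        # the tool abductively based on the current hypothesis.
        name = "web_search" if turn % 2 == 0 else "code_interpreter"
        evidence.append(TOOLS[name](problem))
    return f"Report on '{problem}' based on {len(evidence)} pieces of evidence."
```

The essential shape — acquire evidence, update state, repeat until a report is emitted — is what the trajectories in `trajectory/` record turn by turn.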
## Main Results
UniScientist-30B-A3B achieves top-tier performance across all five benchmarks, matching or surpassing much larger proprietary models:
- FrontierScience-Research: 28.3 (surpassing Claude Opus 4.5 at 17.5, GPT-5.2 at 25.2), reaching 33.3 with test-time scaling (Aggr@8)
- FrontierScience-Olympiad: 66.0 without tools, 71.0 with tools + Aggr@8 (matching Claude Opus 4.5)
- DeepResearch Bench: 46.0 (vs. Perplexity Deep Research 42.3, OpenAI Deep Research 47.0)
- DeepResearch Bench II: 48.0 (surpassing OpenAI Deep Research 45.4, Gemini-3-Pro Deep Research 44.6)
- ResearchRubrics: 59.9 (comparable to OpenAI Deep Research 59.7, Gemini Deep Research 61.5)
## Repository Structure
```
UniScientist/
├── local_deploy.sh                  # Step 1: Deploy local LLM via vLLM
├── inference_local_qwen.sh          # Step 2: Run agentic inference (repeat for multiple rollouts)
├── inference_local_qwen.py          # Agentic inference engine
├── inference_local_aggregate.py     # Step 3: Aggregate multiple rollouts into a final report
├── tools/
│   ├── tool_search.py               # Google web search (via Serper API)
│   ├── tool_scholar.py              # Google Scholar search (via Serper API)
│   ├── tool_visit.py                # Webpage reader (via Jina Reader API) with LLM summarization
│   └── tool_code.py                 # Python code interpreter
├── trajectory/
│   ├── uniscientist_research_traj.jsonl            # Single-rollout trajectories (with tools)
│   ├── uniscientist_research_no_tool_traj.jsonl    # Single-rollout trajectories (without tools)
│   └── uniscientist_research_aggregate8_traj.jsonl # Aggregated trajectories (Aggr@8)
├── data/                            # Place your input data here (see Data Format below)
├── requirements.txt
└── .gitignore
```
## Quick Start

### Install

```bash
pip install -r requirements.txt
```
### Step 1: Deploy the Local LLM

Edit `local_deploy.sh` to set `MODEL_PATH` to your model weights, then:

```bash
bash local_deploy.sh
```

This starts a vLLM OpenAI-compatible server on port 8000. Wait until the server is ready before proceeding.
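One way to wait for readiness is to poll the server's OpenAI-compatible `/v1/models` route before launching inference. This helper is our own convenience sketch, not part of the repository; the URL assumes the default port 8000 from `local_deploy.sh`, and the timeout values are arbitrary.

```python
# Poll the vLLM server until it answers, so inference does not start
# against a half-initialized endpoint. /v1/models is a lightweight
# OpenAI-compatible route that responds once the model is loaded.
import time
import urllib.error
import urllib.request

def wait_for_server(url: str = "http://localhost:8000/v1/models",
                    timeout_s: float = 600.0, poll_s: float = 5.0) -> bool:
    """Return True once the server responds with HTTP 200, else False on timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server not up yet; retry after a short pause
        time.sleep(poll_s)
    return False
```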
### Step 2: Run Agentic Inference

Edit `inference_local_qwen.sh` to fill in your API keys and configuration, then run it multiple times to collect diverse rollouts:

```bash
# Run N times to collect N rollouts
bash inference_local_qwen.sh
bash inference_local_qwen.sh
bash inference_local_qwen.sh
```

Each run produces (or appends to) a `.jsonl` output file named `<STORED_MODEL_NAME>_<BENCHMARK>.jsonl`.
### Step 3: Aggregate Results

Merge multiple rollout results into a single comprehensive report:

```bash
python inference_local_aggregate.py \
    --model-name "UniScientist-30B-A3B" \
    --data-paths rollout_1.jsonl rollout_2.jsonl rollout_3.jsonl \
    --benchmark research \
    --rollout-num 1 \
    --llm-max-concurrency 32
```
## Configuration

### API Keys

The following API keys are required for the tool suite:

| Key | Service | Description |
|-----|---------|-------------|
| `SERPER_KEY_ID` | Serper | Google web search & Google Scholar |
| `JINA_API_KEYS` | Jina Reader | Webpage content reading |
| `OPENROUTER_API_KEY` | OpenRouter | LLM-based webpage summarization |
### Data Format

Place your input data in the `data/` directory as `.jsonl` files, one JSON object per line:

```json
{"problem": "Your research question here", "answer": "Ground truth answer / rubrics (optional)"}
```
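To generate a conforming input file programmatically, a small writer like the one below can help. The field names follow the format above; the `write_problems` helper and its validation rule (non-empty `problem`, optional `answer`) are our own convention, not part of the repository.

```python
# Write and sanity-check a data/*.jsonl input file: one JSON object
# per line with a required "problem" and an optional "answer" field.
import json
from pathlib import Path

def write_problems(path: str, records: list) -> int:
    """Write records as one JSON object per line; return the line count."""
    lines = []
    for rec in records:
        if not rec.get("problem"):
            raise ValueError("each record needs a non-empty 'problem'")
        # "answer" (ground-truth answer / rubrics) is optional
        lines.append(json.dumps({"problem": rec["problem"],
                                 "answer": rec.get("answer", "")}))
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)
```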
### Aggregation Arguments

| Argument | Required | Default | Description |
|----------|----------|---------|-------------|
| `--model-name` | Yes | – | Model identifier for naming the output file |
| `--data-paths` | Yes | – | One or more rollout `.jsonl` files to aggregate |
| `--benchmark` | No | `research` | Benchmark name for naming the output file |
| `--rollout-num` | No | `1` | Number of aggregation passes per question cluster |
| `--local-base-url` | No | `http://localhost:8000/v1` | vLLM server endpoint |
| `--output-path` | No | auto-generated | Custom output file path |
| `--llm-max-concurrency` | No | `32` | Max concurrent LLM requests |
## Citation

If you find UniScientist useful in your research, please cite:

```bibtex
@misc{unipat2026uniscientist,
  title  = {UniScientist: Advancing Universal Scientific Research Intelligence},
  author = {Baixuan Li and Jialong Wu and Yida Zhao and Wendong Xu and Xuanzhong Chen and Huifeng Yin and Liang Chen and Wentao Zhang and Kuan Li},
  year   = {2026},
  url    = {https://unipat.ai/blog/UniScientist}
}
```
## Contact
We are continuously expanding the Universal Scientific Research Dataset to cover additional disciplines and research paradigms. We welcome collaborations with research teams interested in advancing scientific research intelligence. Reach out at contact@unipat.ai.