# UniScientist

<div align="center"> <picture> <img src="./assets/uniscientist.png" width="30%"> </picture> </div>

<p align="center"><b>Advancing Universal Scientific Research Intelligence via Evolving Polymathic Synthesis</b></p>
## News
- [2026-03-14] UniScientist-30B-A3B is now available for download on HuggingFace and ModelScope.
- [2026-03-11] We release the full inference trajectories of UniScientist on the FrontierScience-Research benchmark. Check the `trajectory/` folder for details.
UniScientist advances universal scientific research intelligence through a unified paradigm. By reassigning LLMs as cross-disciplinary generators and human experts as high-precision verifiers, it produces research-grade data spanning 50+ scientific disciplines with structured, rubric-based supervision. A 30B-parameter model trained on this data achieves highly competitive performance across five research benchmarks. We recommend reading the blog post first for an overall picture.
<div align="center"> <picture> <img src="./assets/benchmark.png" width="85%"> </picture> </div>

## Overview
UniScientist formalizes open-ended scientific research as Active Evidence Integration and Model Abduction, and proposes the Evolving Polymathic Synthesis paradigm for synthesizing high-quality research problems and evaluation rubrics at scale.
The approach comprises three key components:
- Evolving Polymathic Synthesis — A human-LLM collaborative data paradigm that generates research-grade scientific problems across 50+ disciplines, each accompanied by co-evolved rubrics refined through completeness, consistency, and distinguishability checks.
- Agentic Research Loop — The model conducts scientific research by iteratively acquiring evidence, deriving formally justified results, and updating hypotheses via abductive inference, using tools including `web_search`, `google_scholar`, `page_fetching`, and `code_interpreter`.
- Report Aggregation — Given multiple candidate research reports, the model learns to synthesize a consolidated report integrating the best elements, enabling research quality to self-evolve over time.
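The agentic research loop above can be sketched as a tool-dispatch cycle. The following is a minimal illustration with stubbed tools, not the repository's actual implementation (see `inference_local_qwen.py`): the `research_loop` function, its deterministic tool-selection policy, and the stub tools are all our own hypothetical simplifications.

```python
# Minimal sketch of an agentic research loop with stubbed tools.
# A real agent would let the LLM choose the next tool and arguments;
# here the policy is deterministic purely for illustration.

def web_search(query: str) -> str:
    # Stub standing in for the real Serper-backed search tool.
    return f"[stub] search results for: {query}"

def code_interpreter(code: str) -> str:
    # Stub standing in for the real Python code interpreter tool.
    return f"[stub] executed: {code}"

TOOLS = {"web_search": web_search, "code_interpreter": code_interpreter}

def research_loop(problem: str, max_turns: int = 3) -> str:
    """Iteratively gather evidence, then emit a final report."""
    evidence = []
    for turn in range(max_turns):
        # Alternate tools deterministically; a real loop would pick
        # the tool abductively based on the current hypothesis.
        name = "web_search" if turn % 2 == 0 else "code_interpreter"
        evidence.append(TOOLS[name](problem))
    return f"Report on '{problem}' based on {len(evidence)} pieces of evidence."
```

The essential shape — acquire evidence, update state, repeat until a report is emitted — is what the trajectories in `trajectory/` record turn by turn.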
## Main Results
UniScientist-30B-A3B achieves top-tier performance across all five benchmarks, matching or surpassing much larger proprietary models:
- FrontierScience-Research: 28.3 (surpassing Claude Opus 4.5 at 17.5, GPT-5.2 at 25.2), reaching 33.3 with test-time scaling (Aggr@8)
- FrontierScience-Olympiad: 66.0 without tools, 71.0 with tools + Aggr@8 (matching Claude Opus 4.5)
- DeepResearch Bench: 46.0 (vs. Perplexity Deep Research 42.3, OpenAI Deep Research 47.0)
- DeepResearch Bench II: 48.0 (surpassing OpenAI Deep Research 45.4, Gemini-3-Pro Deep Research 44.6)
- ResearchRubrics: 59.9 (comparable to OpenAI Deep Research 59.7, Gemini Deep Research 61.5)
## Repository Structure
```
UniScientist/
├── local_deploy.sh                  # Step 1: Deploy local LLM via vLLM
├── inference_local_qwen.sh          # Step 2: Run agentic inference (repeat for multiple rollouts)
├── inference_local_qwen.py          # Agentic inference engine
├── inference_local_aggregate.py     # Step 3: Aggregate multiple rollouts into a final report
├── tools/
│   ├── tool_search.py               # Google web search (via Serper API)
│   ├── tool_scholar.py              # Google Scholar search (via Serper API)
│   ├── tool_visit.py                # Webpage reader (via Jina Reader API) with LLM summarization
│   └── tool_code.py                 # Python code interpreter
├── trajectory/
│   ├── uniscientist_research_traj.jsonl            # Single-rollout trajectories (with tools)
│   ├── uniscientist_research_no_tool_traj.jsonl    # Single-rollout trajectories (without tools)
│   └── uniscientist_research_aggregate8_traj.jsonl # Aggregated trajectories (Aggr@8)
├── data/                            # Place your input data here (see Data Format below)
├── requirements.txt
└── .gitignore
```
## Quick Start

### Install

```bash
pip install -r requirements.txt
```
### Step 1: Deploy the Local LLM

Edit `local_deploy.sh` to set `MODEL_PATH` to your model weights, then:

```bash
bash local_deploy.sh
```

This starts a vLLM OpenAI-compatible server on port 8000. Wait until the server is ready before proceeding.
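One way to wait for readiness is to poll the server's OpenAI-compatible `/v1/models` route before launching inference. This helper is our own convenience sketch, not part of the repository; the URL assumes the default port 8000 from `local_deploy.sh`, and the timeout values are arbitrary.

```python
# Poll the vLLM server until it answers, so inference does not start
# against a half-initialized endpoint. /v1/models is a lightweight
# OpenAI-compatible route that responds once the model is loaded.
import time
import urllib.error
import urllib.request

def wait_for_server(url: str = "http://localhost:8000/v1/models",
                    timeout_s: float = 600.0, poll_s: float = 5.0) -> bool:
    """Return True once the server responds with HTTP 200, else False on timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server not up yet; retry after a short pause
        time.sleep(poll_s)
    return False
```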
### Step 2: Run Agentic Inference

Edit `inference_local_qwen.sh` to fill in your API keys and configuration, then run it multiple times to collect diverse rollouts:

```bash
# Run N times to collect N rollouts
bash inference_local_qwen.sh
bash inference_local_qwen.sh
bash inference_local_qwen.sh
```

Each run produces (or appends to) a `.jsonl` output file named `<STORED_MODEL_NAME>_<BENCHMARK>.jsonl`.
### Step 3: Aggregate Results

Merge multiple rollout results into a single comprehensive report:

```bash
python inference_local_aggregate.py \
    --model-name "UniScientist-30B-A3B" \
    --data-paths rollout_1.jsonl rollout_2.jsonl rollout_3.jsonl \
    --benchmark research \
    --rollout-num 1 \
    --llm-max-concurrency 32
```
## Configuration

### API Keys

The following API keys are required for the tool suite:

| Key | Service | Description |
|-----|---------|-------------|
| `SERPER_KEY_ID` | Serper | Google web search & Google Scholar |
| `JINA_API_KEYS` | Jina Reader | Webpage content reading |
| `OPENROUTER_API_KEY` | OpenRouter | LLM-based webpage summarization |
### Data Format

Place your input data in the `data/` directory as `.jsonl` files, one JSON object per line:

```json
{"problem": "Your research question here", "answer": "Ground truth answer / rubrics (optional)"}
```
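To generate a conforming input file programmatically, a small writer like the one below can help. The field names follow the format above; the `write_problems` helper and its validation rule (non-empty `problem`, optional `answer`) are our own convention, not part of the repository.

```python
# Write and sanity-check a data/*.jsonl input file: one JSON object
# per line with a required "problem" and an optional "answer" field.
import json
from pathlib import Path

def write_problems(path: str, records: list) -> int:
    """Write records as one JSON object per line; return the line count."""
    lines = []
    for rec in records:
        if not rec.get("problem"):
            raise ValueError("each record needs a non-empty 'problem'")
        # "answer" (ground-truth answer / rubrics) is optional
        lines.append(json.dumps({"problem": rec["problem"],
                                 "answer": rec.get("answer", "")}))
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)
```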
### Aggregation Arguments

| Argument | Required | Default | Description |
|----------|----------|---------|-------------|
| `--model-name` | Yes | – | Model identifier for naming the output file |
| `--data-paths` | Yes | – | One or more rollout `.jsonl` files to aggregate |
| `--benchmark` | No | `research` | Benchmark name for naming the output file |
| `--rollout-num` | No | `1` | Number of aggregation passes per question cluster |
| `--local-base-url` | No | `http://localhost:8000/v1` | vLLM server endpoint |
| `--output-path` | No | auto-generated | Custom output file path |
| `--llm-max-concurrency` | No | `32` | Max concurrent LLM requests |
## Citation

If you find UniScientist useful in your research, please cite:

```bibtex
@misc{unipat2026uniscientist,
  title  = {UniScientist: Advancing Universal Scientific Research Intelligence},
  author = {Baixuan Li and Jialong Wu and Yida Zhao and Wendong Xu and Xuanzhong Chen and Huifeng Yin and Liang Chen and Wentao Zhang and Kuan Li},
  year   = {2026},
  url    = {https://unipat.ai/blog/UniScientist}
}
```
## Contact
We are continuously expanding the Universal Scientific Research Dataset to cover additional disciplines and research paradigms. We welcome collaborations with research teams interested in advancing scientific research intelligence. Reach out at contact@unipat.ai.