# Tongyi DeepResearch

**Tongyi DeepResearch: the Leading Open-source Deep Research Agent**
👏 You are welcome to try Tongyi DeepResearch via our <img src="./assets/tongyi.png" width="14px" style="display:inline;"> ModelScope online demo, our 🤗 Hugging Face online demo, or the <img src="./WebAgent/assets/aliyun.png" width="14px" style="display:inline;"> Bailian service!
> [!NOTE]
> This demo is for quick exploration only. Response times may vary, and requests may fail intermittently, due to model latency and tool QPS limits. For a stable experience we recommend local deployment; for a production-ready service, visit <img src="./WebAgent/assets/aliyun.png" width="14px" style="display:inline;"> Bailian and follow the guided setup.
## Introduction
We present <img src="./assets/tongyi.png" width="14px" style="display:inline;"> Tongyi DeepResearch, an agentic large language model featuring 30.5 billion total parameters, with only 3.3 billion activated per token. Developed by Tongyi Lab, the model is specifically designed for long-horizon, deep information-seeking tasks. Tongyi DeepResearch demonstrates state-of-the-art performance across a range of agentic search benchmarks, including Humanity's Last Exam, BrowseComp, BrowseComp-ZH, WebWalkerQA, xbench-DeepSearch, FRAMES, and SimpleQA.
Tongyi DeepResearch builds upon our previous work on the <img src="./assets/tongyi.png" width="14px" style="display:inline;"> WebAgent project.
More details can be found in our 📰 <a href="https://tongyi-agent.github.io/blog/introducing-tongyi-deep-research/">Tech Blog</a>.
<p align="center"> <img width="100%" src="./assets/performance.png"> </p>

## Features
- ⚙️ Fully automated synthetic data generation pipeline: We design a highly scalable data synthesis pipeline, which is fully automatic and empowers agentic pre-training, supervised fine-tuning, and reinforcement learning.
- 🔄 Large-scale continual pre-training on agentic data: Leveraging diverse, high-quality agentic interaction data to extend model capabilities, maintain freshness, and strengthen reasoning performance.
- 🔁 End-to-end reinforcement learning: We employ a strictly on-policy RL approach based on a customized Group Relative Policy Optimization framework, with token-level policy gradients, leave-one-out advantage estimation, and selective filtering of negative samples to stabilize training in a non-stationary environment.
- 🤖 Agent Inference Paradigm Compatibility: At inference, Tongyi DeepResearch is compatible with two inference paradigms: ReAct, for rigorously evaluating the model's core intrinsic abilities, and an IterResearch-based 'Heavy' mode, which uses a test-time scaling strategy to unlock the model's maximum performance ceiling.
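The ReAct paradigm named above alternates model reasoning ("thoughts") with tool calls ("actions") and their results ("observations") until the model emits a final answer. The following is a generic, self-contained sketch of such a loop with stubbed model and tool functions; it is purely illustrative and is not the repository's inference code, which lives in the `inference` folder:

```python
# Minimal ReAct-style loop (illustrative only). `fake_model` and
# `fake_search` are stand-ins for the LLM and a web-search tool.

def fake_model(history):
    """Return a (thought, action, argument) tuple given the trajectory so far."""
    if not any(action == "search" for _, action, _ in history):
        return ("I should look this up.", "search", "capital of France")
    return ("I have enough information.", "finish", "Paris")

def fake_search(query):
    """Stand-in for a web-search tool."""
    return "France's capital is Paris."

TOOLS = {"search": fake_search}

def react_loop(question, max_steps=5):
    history = []
    for _ in range(max_steps):
        thought, action, arg = fake_model(history)
        if action == "finish":
            return arg  # final answer
        observation = TOOLS[action](arg)
        history.append((thought, action, observation))
    return None  # step budget exhausted

print(react_loop("What is the capital of France?"))  # → Paris
```

In the real system, `fake_model` is the Tongyi DeepResearch model generating thought/action tokens, and the tool set includes search, page reading, file parsing, and a Python sandbox.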
## Model Download
You can directly download the model by following the links below.
| Model | Download Links | Model Size | Context Length |
| :---: | :---: | :---: | :---: |
| Tongyi-DeepResearch-30B-A3B | 🤗 HuggingFace<br>🤖 ModelScope | 30B-A3B | 128K |
## News

- [2025/09/20] 🚀 Tongyi-DeepResearch-30B-A3B is now on OpenRouter! Follow the Quick Start guide.
- [2025/09/17] 🔥 We have released Tongyi-DeepResearch-30B-A3B.
## Deep Research Benchmark Results

<p align="center"> <img width="100%" src="./assets/benchmark.png"> </p>

## Quick Start
This guide provides instructions for setting up the environment and running inference scripts located in the inference folder.
### 1. Environment Setup

- Recommended Python version: 3.10.0 (using other versions may cause dependency issues).
- It is strongly advised to create an isolated environment using `conda` or `virtualenv`.

```bash
# Example with Conda
conda create -n react_infer_env python=3.10.0
conda activate react_infer_env
```
### 2. Installation

Install the required dependencies:

```bash
pip install -r requirements.txt
```
### 3. Environment Configuration and Evaluation Data Preparation

#### Environment Configuration

Configure your API keys and settings by copying the example environment file:

```bash
# Copy the example environment file
cp .env.example .env
```

Edit the `.env` file and provide your actual API keys and configuration values:
- `SERPER_KEY_ID`: Get your key from Serper.dev for web search and Google Scholar
- `JINA_API_KEYS`: Get your key from Jina.ai for web page reading
- `API_KEY` / `API_BASE`: OpenAI-compatible API endpoint and key used for page summarization
- `DASHSCOPE_API_KEY`: Get your key from DashScope for file parsing
- `SANDBOX_FUSION_ENDPOINT`: Python interpreter sandbox endpoints (see SandboxFusion)
- `MODEL_PATH`: Path to your model weights
- `DATASET`: Name of your evaluation dataset
- `OUTPUT_PATH`: Directory for saving results
Note: The `.env` file is gitignored, so your secrets will not be committed to the repository.
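Before launching a long run, it can save time to confirm that every variable you rely on is actually set. This standalone sketch (not part of the repository) checks `os.environ` for the names listed above; trim the list to the tools you actually enable:

```python
import os

# Variable names taken from the list above; adjust to your enabled tools.
REQUIRED = [
    "SERPER_KEY_ID", "JINA_API_KEYS", "API_KEY", "API_BASE",
    "DASHSCOPE_API_KEY", "SANDBOX_FUSION_ENDPOINT",
    "MODEL_PATH", "DATASET", "OUTPUT_PATH",
]

def missing_env_vars(required=REQUIRED):
    """Return the names of required variables that are unset or empty."""
    return [name for name in required if not os.environ.get(name)]

# Example usage:
# missing = missing_env_vars()
# if missing:
#     print("Missing environment variables:", ", ".join(missing))
```

Note that the inference script reads these from the `.env` file, so run this check in the same shell session after sourcing or loading that file.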
#### Prepare Evaluation Data

The system supports two input file formats: JSON and JSONL.

**Option 1: JSONL Format (recommended)**

- Create your data file with a `.jsonl` extension (e.g., `my_questions.jsonl`).
- Each line must be a valid JSON object with `question` and `answer` keys:

```jsonl
{"question": "What is the capital of France?", "answer": "Paris"}
{"question": "Explain quantum computing", "answer": ""}
```

**Option 2: JSON Format**

- Create your data file with a `.json` extension (e.g., `my_questions.json`).
- The file must contain a JSON array of objects, each with `question` and `answer` keys:

```json
[
  { "question": "What is the capital of France?", "answer": "Paris" },
  { "question": "Explain quantum computing", "answer": "" }
]
```
Important Note: The `answer` field contains the ground-truth (reference) answer used for evaluation. The system generates its own responses to the questions; during benchmark evaluation, the reference answers are used to automatically judge the quality of the generated responses.
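A malformed data file is easiest to catch before the run starts. The following standalone sketch (not part of the repository) loads either supported format and verifies that every item carries `question` and `answer` keys:

```python
import json

def load_eval_data(path):
    """Load evaluation items from a .json or .jsonl file and check the schema."""
    if path.endswith(".jsonl"):
        # JSONL: one JSON object per non-empty line.
        with open(path, encoding="utf-8") as f:
            items = [json.loads(line) for line in f if line.strip()]
    elif path.endswith(".json"):
        # JSON: a single array of objects.
        with open(path, encoding="utf-8") as f:
            items = json.load(f)
    else:
        raise ValueError("Expected a .json or .jsonl file")
    for i, item in enumerate(items):
        if not {"question", "answer"} <= set(item):
            raise ValueError(f"Item {i} is missing 'question' or 'answer'")
    return items
```

Running it on `eval_data/my_questions.jsonl` should return the parsed items, or raise with the index of the first bad record.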
File References for Document Processing:

- If using the file parser tool, prepend the filename to the `question` field.
- Place referenced files in the `eval_data/file_corpus/` directory.
- Example:

```json
{"question": "(Uploaded 1 file: ['report.pdf'])\n\nWhat are the key findings?", "answer": "..."}
```
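If you generate data files programmatically, a small helper can build the header shown in the example. The single-file format follows that example verbatim; the plural wording for multiple files is an assumption, so verify it against the file parser tool before relying on it:

```python
def question_with_files(question, filenames):
    """Prepend an uploaded-file header in the format of the example above."""
    n = len(filenames)
    # str(filenames) renders a Python-style list, e.g. ['report.pdf'],
    # matching the example; multi-file plural form is an assumption.
    header = f"(Uploaded {n} file{'s' if n != 1 else ''}: {filenames})"
    return f"{header}\n\n{question}"

# question_with_files("What are the key findings?", ["report.pdf"])
# → "(Uploaded 1 file: ['report.pdf'])\n\nWhat are the key findings?"
```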
File Organization:

```
project_root/
├── eval_data/
│   ├── my_questions.jsonl   # Your evaluation data
│   └── file_corpus/         # Referenced documents
│       ├── report.pdf
│       └── data.xlsx
```
### 4. Configure the Inference Script

- Open `run_react_infer.sh` and modify the following variables as instructed in the comments:
  - `MODEL_PATH` - path to the local or remote model weights.
  - `DATASET` - full path to your evaluation file, e.g. `eval_data/my_questions.jsonl` or `/path/to/my_questions.json`.
  - `OUTPUT_PATH` - path for saving the prediction results, e.g. `./outputs`.
- Depending on the tools you enable (retrieval, calculator, web search, etc.), provide the required `API_KEY`, `BASE_URL`, or other credentials. Each key is explained inline in the bash script.
### 5. Run the Inference Script

```bash
bash run_react_infer.sh
```

With these steps, you can fully prepare the environment, configure the dataset, and run the model. For more details, consult the inline comments in `run_react_infer.sh`.