
WebThinker

[NeurIPS 2025] 🌐 WebThinker: Empowering Large Reasoning Models with Deep Research Capability

Install / Use

/learn @RUC-NLPIR/WebThinker

<h1 align="center"> 🌐 WebThinker: Empowering Large Reasoning Models with Deep Research Capability</h1> <div align="center">


</div> <p align="center"> 🤗 <a href="https://huggingface.co/lixiaoxi45/WebThinker-QwQ-32B" target="_blank">WebThinker-QwQ-32B</a> | 🤗 <a href="https://huggingface.co/lixiaoxi45/WebThinker-R1-7B" target="_blank">WebThinker-R1-7B</a> | 🤗 <a href="https://huggingface.co/lixiaoxi45/WebThinker-R1-14B" target="_blank">WebThinker-R1-14B</a> | 🤗 <a href="https://huggingface.co/lixiaoxi45/WebThinker-R1-32B" target="_blank">WebThinker-R1-32B</a> </p> <h5 align="center"> If you like our project, please give us a star ⭐ on GitHub for the latest updates.</h5>

📣 Latest News

🔥 Deep Research Agent Family

<details open><summary>Feel free to try our deep research agent series: </summary><p>

DeepAgent: A General Reasoning Agent with Scalable Toolsets (New!) <br> Authors: Xiaoxi Li, Wenxiang Jiao, Jiarui Jin, Guanting Dong, Jiajie Jin, Yinuo Wang, Hao Wang, Yutao Zhu, Ji-Rong Wen, Yuan Lu, Zhicheng Dou <br> TLDR: An end-to-end deep reasoning agent that performs autonomous thinking, tool discovery, and action execution with a brain-inspired memory folding mechanism.

WebThinker: Empowering Large Reasoning Models with Deep Research Capability (NeurIPS 2025) <br> Authors: Xiaoxi Li*, Jiajie Jin*, Guanting Dong*, Hongjin Qian, Yutao Zhu, Yongkang Wu, Ji-Rong Wen, Zhicheng Dou <br> TLDR: A deep research agent that empowers large reasoning models with autonomous search, web browsing, and research report drafting capabilities.

Search-o1: Agentic Search-Enhanced Large Reasoning Models (EMNLP 2025) <br> Authors: Xiaoxi Li, Guanting Dong, Jiajie Jin, Yuyao Zhang, Yujia Zhou, Yutao Zhu, Peitian Zhang, Zhicheng Dou <br> TLDR: An agentic search-enhanced framework that integrates autonomous knowledge retrieval with large reasoning models through Agentic RAG and reasoning-in-documents modules.

</p></details>

🎬 Demo

<div align="center"> <video src="https://github.com/user-attachments/assets/a38e82ec-5aed-4efe-a8b8-e9ee2d97e9b9" /> </div>

💡 Overview

WebThinker is a deep research framework fully powered by large reasoning models (LRMs). It enables LRMs to autonomously search, deeply explore web pages, and draft research reports, all within their thinking process.

Unlike existing open-source deep search agents that typically employ retrieval-augmented generation (RAG) with predefined workflows, WebThinker allows the reasoning model itself to perform actions during thinking, achieving end-to-end task execution in a single generation.
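
A minimal sketch of that single-generation loop, in which the model emits an action tag mid-reasoning and the harness injects results before generation resumes. The `<search>` tag and the `generate`/`run_search` callables are hypothetical stand-ins for illustration, not WebThinker's actual interface:

```python
import re

SEARCH_RE = re.compile(r"<search>(.*?)</search>", re.DOTALL)

def reasoning_loop(generate, run_search, prompt, max_turns=8):
    """Let the model emit <search>query</search> actions mid-thought;
    inject results and resume generation until it stops searching."""
    transcript = prompt
    for _ in range(max_turns):
        chunk = generate(transcript)      # model continues its reasoning
        transcript += chunk
        match = SEARCH_RE.search(chunk)
        if match is None:                 # no action requested: reasoning finished
            return transcript
        results = run_search(match.group(1).strip())
        transcript += f"\n<result>{results}</result>\n"
    return transcript
```

The key point is that search results are appended to the same transcript the model is generating, so the whole task (reason, act, observe, conclude) stays inside one generation.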

📊 Overall Performance

<p align="center"> <img src="figures/performance.png" width="100%" /> </p>

As shown above, WebThinker consistently outperforms competing approaches on both knowledge-intensive complex reasoning benchmarks (GPQA, GAIA, WebWalkerQA, HLE) and open-ended reasoning tasks for report generation. Our WebThinker-32B, with QwQ-32B as the backbone reasoning model, achieves superior performance across all tasks.

✨ The WebThinker Framework

Model Comparison

WebThinker enables reasoning models to autonomously conduct web searches and navigate web pages to acquire external knowledge during their reasoning process. This approach significantly reduces the time and costs associated with information gathering for researchers in knowledge-intensive fields. Furthermore, WebThinker allows LRMs to draft section content while thinking and searching, producing comprehensive, customized reports that directly address users' research questions.

Key Features:

  • We introduce a Deep Web Explorer that empowers LRMs to search, navigate pages by clicking interactive elements (like links or buttons), and extract relevant information. Based on initial search results, the LRM can initiate follow-up searches and traverse deeper links until it collects all relevant information.
  • For scientific reporting, our Autonomous Think-Search-and-Draft strategy integrates real-time knowledge seeking with report creation. We equip LRMs with three specialized tools: (1) drafting content for specific chapters, (2) checking the current report, and (3) editing the report, ensuring reports remain comprehensive, coherent, and adaptive to new insights.
  • We're developing RL-based training strategies to optimize end-to-end task performance by leveraging large-scale reasoning trajectories from complex tasks. Using accuracy of reasoning, tool usage, and final outputs, we construct preference pairs for online DPO training, enabling the model to progressively improve its research capabilities.
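
The three report tools can be pictured as a small tool-dispatch layer over a shared report workspace. The class and method names below are illustrative assumptions, not WebThinker's real interface:

```python
class ReportWorkspace:
    """Toy workspace for the three report tools: draft a section,
    check the current report, and edit a section."""

    def __init__(self):
        self.sections = {}  # title -> text, kept in insertion order

    def draft_section(self, title, text):
        self.sections[title] = text

    def check_report(self):
        # Render the current draft so the model can review it mid-reasoning
        return "\n\n".join(f"## {t}\n{body}" for t, body in self.sections.items())

    def edit_section(self, title, new_text):
        if title not in self.sections:
            raise KeyError(f"no section named {title!r}")
        self.sections[title] = new_text

def dispatch(workspace, tool_call):
    """Route a parsed tool call like {'tool': 'draft', 'args': {...}}."""
    tools = {
        "draft": workspace.draft_section,
        "check": workspace.check_report,
        "edit": workspace.edit_section,
    }
    return tools[tool_call["tool"]](**tool_call.get("args", {}))
```

Because the workspace persists across tool calls, the model can interleave searching with drafting and later revise earlier sections as new evidence arrives.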

🔧 Installation

Environment Setup

```bash
# Create conda environment
conda create -n webthinker python=3.9
conda activate webthinker

# Install requirements
cd WebThinker-main
pip install -r requirements.txt
```

πŸƒ Quick Start

Pre-preparation

Model Serving

Before running WebThinker, ensure your reasoning model and auxiliary model are served using vLLM. In our experiments, we use QwQ-32B as the reasoning model and Qwen-32B-Instruct as the auxiliary model. You can also explore other instruction-tuned models as your auxiliary model, which will be used for webpage reading, report writing/editing, evaluation, etc. For detailed instructions on model serving, see here.
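
As a rough sketch, the two endpoints might be launched with vLLM's OpenAI-compatible server as below. The auxiliary model id, ports, and tensor-parallel sizes here are assumed placeholders; match them to your own checkpoints and hardware:

```shell
# Reasoning model (port and GPU count are example settings)
vllm serve Qwen/QwQ-32B \
    --port 8000 \
    --tensor-parallel-size 4

# Auxiliary instruct model on a second endpoint
# (model id below is an assumed example, not prescribed by the repo)
vllm serve Qwen/Qwen2.5-32B-Instruct \
    --port 8001 \
    --tensor-parallel-size 4
```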

Web Parser Client

For better web crawling performance, we recommend setting up a web parser client in scripts/search/bing_search.py using Crawl4AI. This will help handle JavaScript-rendered content and provide more reliable webpage extraction.
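
Where Crawl4AI is unavailable, static pages can still be reduced to plain text with a minimal stdlib extractor. This is an illustrative fallback sketch only, not the parser shipped in scripts/search/bing_search.py:

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text from static HTML, skipping <script> and <style>.
    JavaScript-rendered content still needs a real browser-based crawler."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())

def extract_text(html):
    parser = TextExtractor()
    parser.feed(html)
    return " ".join(parser.chunks)
```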

Now you can run different inference modes using the provided scripts. Below are examples of how to execute each mode:

Problem Solving
