AI-Q
The AI-Q NVIDIA Blueprint is an open reference example for building intelligent AI agents that connect to your enterprise data, reason using state-of-the-art models, and deliver trusted business insights.
Install / Use
/learn @NVIDIA-AI-Blueprints/Aiq README
🏆 BENCHMARK NOTE 🏆
To reproduce the nvidia-aiq DeepResearch Bench and DeepResearch Bench II leaderboard results, use the `drb1` and `drb2` branches, respectively.
Table of Contents
- Overview
- Software Components
- Target Audience
- Prerequisites
- Architecture
- Getting Started
- Configuration Files
- Ways to Run the Agents
- Evaluating the Workflow
- Development
- Roadmap
- Security Considerations
- License
Overview
The NVIDIA AI-Q Blueprint is an enterprise-grade research agent built on the NVIDIA NeMo Agent Toolkit and LangChain Deep Agents. It provides both quick, cited answers and in-depth, report-style research in one system, with benchmarks and evaluation harnesses so you can measure quality and improve over time.
<p align="center"> <img src="./docs/assets/AIQ-arch-light.png" alt="AI-Q Architecture" width="800"> </p>

Key features:
- Orchestration node — One node classifies intent (meta vs. research), produces meta responses (for example, greetings, capabilities), and sets research depth (shallow vs. deep).
- Shallow research — Bounded, faster researcher with tool-calling and source citation.
- Deep research — Long-running multi-step planning and research to generate a long-form citation-backed report.
- Workflow configuration — YAML configs define agents, tools, LLMs, and routing behavior so you can tune workflows without code changes.
- Modular workflows — All agents (orchestration node, shallow researcher, deep researcher, clarifier) are composable; each can run standalone or as part of the full pipeline.
- Evaluation harnesses — Built-in benchmarks (for example, FreshQA, DeepResearch) and evaluation scripts to measure quality and iterate on prompts and agent architecture.
- Frontend options — Run through the CLI, web UI, or async jobs; refer to Getting Started and Ways to Run the Agents for details.
- Deployment options — Deployment assets for both Docker Compose and Helm.
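The workflow-configuration idea above can be pictured with a small YAML fragment. This is a hypothetical sketch: the section names, field names, and values below are illustrative assumptions, not the exact schema — the real configuration files ship in this repository and should be treated as the source of truth.

```yaml
# Illustrative only — consult the repository's shipped configs for the
# actual schema used by the NeMo Agent Toolkit.
llms:
  researcher_llm:
    model_name: nvidia/nemotron-3-nano-30b-a3b   # research subagent
    temperature: 0.2
  classifier_llm:
    model_name: nvidia/nemotron-3-nano-30b-a3b   # intent classifier

workflow:
  orchestration:
    # Route "meta" queries (greetings, capability questions) to a direct
    # response; route research queries by depth.
    depth_routing:
      shallow: shallow_researcher
      deep: deep_researcher
```

Because agents, tools, LLMs, and routing live in YAML, you can swap a model or change routing behavior by editing the config rather than the code.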
Software Components
The following are used by this project in the default configuration:
- NVIDIA NeMo Agent Toolkit
- NVIDIA nemotron-3-nano-30b-a3b (agents, researcher)
- NVIDIA nemotron-3-super-120b-a12b (optional, compatible but Build API has limited availability due to high demand)
- NVIDIA nemotron-3-nano-30b-a3b (intent classifier)
- GPT-OSS-120B (agents)
- NVIDIA nemotron-mini-4b-instruct (document summary, if used)
- NVIDIA llama-nemotron-embed-vl-1b-v2 (embedding model for llamaindex knowledge layer implementation, if used)
- NVIDIA nemotron-nano-12b-v2-vl (vision-language model for llamaindex knowledge layer implementation, if used)
- Tavily Search API for web search
- Serper Search API for paper search (Google Scholar)
Target Audience
This project is for:
- AI researchers and developers: People building or extending agentic research workflows
- Enterprise teams: Organizations needing tool-augmented, citation-backed research
- NeMo Agent Toolkit users: Developers looking to understand advanced multi-agent patterns
Prerequisites
- Python 3.11–3.13
- uv package manager
- NVIDIA API key from NVIDIA AI (for NIM models)
- Node.js 22+ and npm (optional, for web UI mode)
Dependency Note: This release is pinned to NeMo Agent Toolkit (NAT) v1.4.0 (nvidia-nat==1.4.0). NAT v1.5 or later is not yet supported by AI-Q and upgrading may introduce breaking changes. The pin will be lifted in a future AI-Q release once compatibility has been validated.
Optional requirements:
- Tavily API key (for web search functionality)
- Serper API key (for academic paper search functionality)
Note: Configure at least one data source (Tavily web search, Serper search tool, or knowledge layer) to enable research functionality.
If these optional API keys are not provided, the agent continues to operate without the corresponding search capabilities. Refer to Obtain API Keys for details.
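Before running the agents, the API keys are typically supplied as environment variables. The variable names below are the conventional ones for these services, but they are assumptions here — confirm the exact names in the Obtain API Keys documentation.

```shell
# Required: NVIDIA API key for NIM model access.
export NVIDIA_API_KEY="nvapi-your-key-here"

# Optional: enables web search via Tavily.
export TAVILY_API_KEY="tvly-your-key-here"

# Optional: enables academic paper search via Serper (Google Scholar).
export SERPER_API_KEY="your-serper-key-here"
```

If either optional key is omitted, the corresponding search tool is simply unavailable; the agent still runs with whatever data sources remain configured.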
Hardware Requirements
When using NVIDIA API Catalog (the default), inference runs on NVIDIA-hosted infrastructure and there are no local GPU requirements. The hardware references below apply only when self-hosting models via NVIDIA NIM.
| Component | Default Model | Self-Hosted Hardware Reference |
|-----------|---------------|-------------------------------|
| LLM (research subagent) | nvidia/nemotron-3-nano-30b-a3b (default) or nvidia/nemotron-3-super-120b-a12b (optional) | Nemotron 3 Nano support matrix, Nemotron 3 Super support matrix |
| LLM (intent classifier) | nvidia/nemotron-3-nano-30b-a3b | Nemotron 3 Nano support matrix |
| LLM (deep research orchestrator, planner) | openai/gpt-oss-120b | GPT OSS support matrix |
| Document summary (optional) | nvidia/nemotron-mini-4b-instruct | Nemotron Mini 4B |
| Text embedding | nvidia/llama-nemotron-embed-vl-1b-v2 | NeMo Retriever embedding support matrix |
| VLM (image/chart extraction, optional) | nvidia/nemotron-nano-12b-v2-vl | Vision language model support matrix |
| Knowledge layer (Foundational RAG, optional) | -- | RAG Blueprint support matrix |
For detailed installation instructions, refer to Installation -- Hardware Requirements.
Architecture
AI-Q uses a LangGraph-based state machine with the following key components:
- Orchestration node: Classifies intent (meta vs. research), produces meta responses when needed, and sets depth (shallow vs. deep) in one step
- Shallow research agent: Bounded tool-augmented research optimized for speed
- Deep research agent: Multi-phase research with planning, iteration, and citation management
Each agent can be run individually or as part of the orchestrated workflow. For detailed architecture documentation, refer to Architecture.
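To make the routing described above concrete, here is a plain-Python sketch of the state-machine shape — orchestration classifies intent and depth, then hands off to a shallow or deep path. This is not the actual AI-Q or LangGraph code; the state fields, heuristic, and function names are illustrative assumptions (the real system uses an LLM classifier inside a LangGraph graph).

```python
from dataclasses import dataclass


@dataclass
class ResearchState:
    """Minimal stand-in for the graph state passed between nodes."""
    query: str
    intent: str = ""   # "meta" or "research"
    depth: str = ""    # "shallow" or "deep"
    answer: str = ""


def orchestrate(state: ResearchState) -> ResearchState:
    """One node: classify intent, answer meta queries, set research depth."""
    q = state.query.lower().rstrip("?!")
    if q in {"hi", "hello", "what can you do"}:
        # Meta query: respond directly, no research needed.
        state.intent = "meta"
        state.answer = "I am a research assistant with shallow and deep modes."
    else:
        # Toy heuristic standing in for the LLM depth classifier.
        state.intent = "research"
        state.depth = "deep" if "report" in q else "shallow"
    return state


def run(query: str) -> str:
    """Route through orchestration, then the chosen research path."""
    state = orchestrate(ResearchState(query=query))
    if state.intent == "meta":
        return state.answer
    if state.depth == "shallow":
        return f"[shallow research] {state.query}"
    return f"[deep research] {state.query}"
```

In the real blueprint each of these paths is itself a composable agent, which is why the shallow researcher, deep researcher, and orchestration node can also be run standalone.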
Getting Started
Clone the Repository
git clone https://github.com/NVIDIA-AI-Blueprints/aiq.git && cd aiq
Automated Setup
Run the setup script to initialize the environment:
./scripts/setup.sh
This script:
- Creates a Python virtual environment with uv
- Installs all Python dependencies (core, frontends, benchmarks, data sources)
- Installs UI dependencies (if Node.js is available)
Manual Installation
For selective installation, install packages individually:
# Create and activate virtual environment
uv venv --python 3.13 .venv
source .venv/bin/activate
# Install core with development dependencies
uv pip install -e "
