AI-Q
The AI-Q NVIDIA Blueprint is an open reference example for building intelligent AI agents that connect to your enterprise data, reason using state-of-the-art models, and deliver trusted business insights.
Install / Use
/learn @NVIDIA-AI-Blueprints/Aiq README
🏆 BENCHMARK NOTE 🏆
To reproduce the nvidia-aiq DeepResearch Bench and DeepResearch Bench II leaderboard results, use the `drb1` and `drb2` branches, respectively.
Table of Contents
- Overview
- Software Components
- Target Audience
- Prerequisites
- Architecture
- Getting Started
- Configuration Files
- Ways to Run the Agents
- Evaluating the Workflow
- Development
- Roadmap
- Security Considerations
- License
Overview
The NVIDIA AI-Q Blueprint is an enterprise-grade research agent built on the NVIDIA NeMo Agent Toolkit and LangChain Deep Agents. It provides both quick, cited answers and in-depth, report-style research in one system, with benchmarks and evaluation harnesses so you can measure quality and improve over time.
<p align="center"> <img src="./docs/assets/AIQ-arch-light.png" alt="AI-Q Architecture" width="800"> </p>

Key features:
- Orchestration node — One node classifies intent (meta vs. research), produces meta responses (for example, greetings, capabilities), and sets research depth (shallow vs. deep).
- Shallow research — Bounded, faster researcher with tool-calling and source citation.
- Deep research — Long-running multi-step planning and research to generate a long-form citation-backed report.
- Workflow configuration — YAML configs define agents, tools, LLMs, and routing behavior so you can tune workflows without code changes.
- Modular workflows — All agents (orchestration node, shallow researcher, deep researcher, clarifier) are composable; each can run standalone or as part of the full pipeline.
- Evaluation harnesses — Built-in benchmarks (for example, FreshQA, DeepResearch) and evaluation scripts to measure quality and iterate on prompts and agent architecture.
- Frontend options — Run through the CLI, web UI, or async jobs; refer to Getting Started and Ways to Run the Agents for details.
- Deployment options — Deployment assets for both Docker Compose and Helm.
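The workflow-configuration idea above can be pictured with a small YAML fragment. This is a hypothetical sketch: the section names, field names, and values below are illustrative assumptions, not the exact schema — the real configuration files ship in this repository and should be treated as the source of truth.

```yaml
# Illustrative only — consult the repository's shipped configs for the
# actual schema used by the NeMo Agent Toolkit.
llms:
  researcher_llm:
    model_name: nvidia/nemotron-3-nano-30b-a3b   # research subagent
    temperature: 0.2
  classifier_llm:
    model_name: nvidia/nemotron-3-nano-30b-a3b   # intent classifier

workflow:
  orchestration:
    # Route "meta" queries (greetings, capability questions) to a direct
    # response; route research queries by depth.
    depth_routing:
      shallow: shallow_researcher
      deep: deep_researcher
```

Because agents, tools, LLMs, and routing live in YAML, you can swap a model or change routing behavior by editing the config rather than the code.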
Software Components
The following are used by this project in the default configuration:
- NVIDIA NeMo Agent Toolkit
- NVIDIA nemotron-3-nano-30b-a3b (agents, researcher)
- NVIDIA nemotron-3-super-120b-a12b (optional, compatible but Build API has limited availability due to high demand)
- NVIDIA nemotron-3-nano-30b-a3b (intent classifier)
- GPT-OSS-120B (agents)
- NVIDIA nemotron-mini-4b-instruct (document summary, if used)
- NVIDIA llama-nemotron-embed-vl-1b-v2 (embedding model for llamaindex knowledge layer implementation, if used)
- NVIDIA nemotron-nano-12b-v2-vl (vision-language model for llamaindex knowledge layer implementation, if used)
- Tavily Search API for web search
- Serper Search API for paper search (Google Scholar)
Target Audience
This project is for:
- AI researchers and developers: People building or extending agentic research workflows
- Enterprise teams: Organizations needing tool-augmented, citation-backed research
- NeMo Agent Toolkit users: Developers looking to understand advanced multi-agent patterns
Prerequisites
- Python 3.11–3.13
- uv package manager
- NVIDIA API key from NVIDIA AI (for NIM models)
- Node.js 22+ and npm (optional, for web UI mode)
Dependency Note: This release is pinned to NeMo Agent Toolkit (NAT) v1.4.0 (nvidia-nat==1.4.0). NAT v1.5 or later is not yet supported by AI-Q and upgrading may introduce breaking changes. The pin will be lifted in a future AI-Q release once compatibility has been validated.
Optional requirements:
- Tavily API key (for web search functionality)
- Serper API key (for academic paper search functionality)
Note: Configure at least one data source (Tavily web search, Serper search tool, or knowledge layer) to enable research functionality.
If these optional API keys are not provided, the agent continues to operate without the corresponding search capabilities. Refer to Obtain API Keys for details.
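Before running the agents, the API keys are typically supplied as environment variables. The variable names below are the conventional ones for these services, but they are assumptions here — confirm the exact names in the Obtain API Keys documentation.

```shell
# Required: NVIDIA API key for NIM model access.
export NVIDIA_API_KEY="nvapi-your-key-here"

# Optional: enables web search via Tavily.
export TAVILY_API_KEY="tvly-your-key-here"

# Optional: enables academic paper search via Serper (Google Scholar).
export SERPER_API_KEY="your-serper-key-here"
```

If either optional key is omitted, the corresponding search tool is simply unavailable; the agent still runs with whatever data sources remain configured.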
Hardware Requirements
When using NVIDIA API Catalog (the default), inference runs on NVIDIA-hosted infrastructure and there are no local GPU requirements. The hardware references below apply only when self-hosting models via NVIDIA NIM.
| Component | Default Model | Self-Hosted Hardware Reference |
|-----------|---------------|-------------------------------|
| LLM (research subagent) | nvidia/nemotron-3-nano-30b-a3b (default) or nvidia/nemotron-3-super-120b-a12b (optional) | Nemotron 3 Nano support matrix, Nemotron 3 Super support matrix |
| LLM (intent classifier) | nvidia/nemotron-3-nano-30b-a3b | Nemotron 3 Nano support matrix |
| LLM (deep research orchestrator, planner) | openai/gpt-oss-120b | GPT OSS support matrix |
| Document summary (optional) | nvidia/nemotron-mini-4b-instruct | Nemotron Mini 4B |
| Text embedding | nvidia/llama-nemotron-embed-vl-1b-v2 | NeMo Retriever embedding support matrix |
| VLM (image/chart extraction, optional) | nvidia/nemotron-nano-12b-v2-vl | Vision language model support matrix |
| Knowledge layer (Foundational RAG, optional) | -- | RAG Blueprint support matrix |
For detailed installation instructions, refer to Installation -- Hardware Requirements.
Architecture
AI-Q uses a LangGraph-based state machine with the following key components:
- Orchestration node: Classifies intent (meta vs. research), produces meta responses when needed, and sets depth (shallow vs. deep) in one step
- Shallow research agent: Bounded tool-augmented research optimized for speed
- Deep research agent: Multi-phase research with planning, iteration, and citation management
Each agent can be run individually or as part of the orchestrated workflow. For detailed architecture documentation, refer to Architecture.
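To make the routing described above concrete, here is a plain-Python sketch of the state-machine shape — orchestration classifies intent and depth, then hands off to a shallow or deep path. This is not the actual AI-Q or LangGraph code; the state fields, heuristic, and function names are illustrative assumptions (the real system uses an LLM classifier inside a LangGraph graph).

```python
from dataclasses import dataclass


@dataclass
class ResearchState:
    """Minimal stand-in for the graph state passed between nodes."""
    query: str
    intent: str = ""   # "meta" or "research"
    depth: str = ""    # "shallow" or "deep"
    answer: str = ""


def orchestrate(state: ResearchState) -> ResearchState:
    """One node: classify intent, answer meta queries, set research depth."""
    q = state.query.lower().rstrip("?!")
    if q in {"hi", "hello", "what can you do"}:
        # Meta query: respond directly, no research needed.
        state.intent = "meta"
        state.answer = "I am a research assistant with shallow and deep modes."
    else:
        # Toy heuristic standing in for the LLM depth classifier.
        state.intent = "research"
        state.depth = "deep" if "report" in q else "shallow"
    return state


def run(query: str) -> str:
    """Route through orchestration, then the chosen research path."""
    state = orchestrate(ResearchState(query=query))
    if state.intent == "meta":
        return state.answer
    if state.depth == "shallow":
        return f"[shallow research] {state.query}"
    return f"[deep research] {state.query}"
```

In the real blueprint each of these paths is itself a composable agent, which is why the shallow researcher, deep researcher, and orchestration node can also be run standalone.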
Getting Started
Clone the Repository
git clone https://github.com/NVIDIA-AI-Blueprints/aiq.git && cd aiq
Automated Setup
Run the setup script to initialize the environment:
./scripts/setup.sh
This script:
- Creates a Python virtual environment with uv
- Installs all Python dependencies (core, frontends, benchmarks, data sources)
- Installs UI dependencies (if Node.js is available)
Manual Installation
For selective installation, install packages individually:
# Create and activate virtual environment
uv venv --python 3.13 .venv
source .venv/bin/activate
# Install core with development dependencies
uv pip install -e "
