NVIDIA Nemotron Developer Repository

Open and efficient models for agentic AI. Training recipes, deployment guides, and use-case examples for the Nemotron family.

Python 3.10+ · License: Apache 2.0 · Contributions Welcome · Docs

<div align="center">

Watch: Nemotron Overview

</div>

🎉 Nemotron 3 Ultra was announced at GTC San Jose 2026. To learn more, see the usage guide!


Why Nemotron?

| | |
|---|---|
| Open Models | Fully transparent training data, techniques, and weights for community innovation |
| Compute Efficiency | Model pruning and optimization enabling higher throughput via TensorRT-LLM |
| High Accuracy | Built on frontier open models with human-aligned reasoning for agentic workflows |
| Flexible Deployment | Deploy anywhere: edge, single GPU, or data center with NIM microservices |


Repository Overview

```
nemotron/
├── src/nemotron/recipes/    Training recipes (complete, reproducible pipelines)
├── usage-cookbook/          Usage cookbooks (deployment and model usage guides)
└── use-case-examples/       Examples of leveraging Nemotron in agentic workflows
```

Which section should I use?

| | Training Recipes | Usage Cookbooks | Use Case Examples |
|---|---|---|---|
| Purpose | Reproduce full training pipelines from raw data to model | Deploy and use trained models | Build end-to-end applications |
| Format | Python packages with configs, scripts, and evaluation | Jupyter notebooks with step-by-step guides | Jupyter notebooks and scripts |
| When to use | You want to train, fine-tune, or understand how a model was built | You have a model and want to deploy or run inference | You want to build an application (RAG, agents, tool use) |
| Location | `src/nemotron/recipes/` | `usage-cookbook/` | `use-case-examples/` |


What is Nemotron?

NVIDIA Nemotron is a family of open, high-efficiency multimodal models purpose-built for agentic AI.

Model Tiers:

  • Nano — Optimized for edge and PC deployments
  • Super — Single GPU deployment with highest throughput
  • Ultra — Multi-GPU datacenter applications
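The tier-to-deployment mapping above can be sketched as a small helper. This is purely illustrative; the function name and deployment keys are hypothetical and not part of the repository's API.

```python
def pick_tier(deployment: str) -> str:
    """Map a deployment target to a Nemotron model tier (per the tiers above).

    Illustrative only: the keys and helper name are assumptions, not repo API.
    """
    tiers = {
        "edge": "Nano",          # edge and PC deployments
        "pc": "Nano",
        "single_gpu": "Super",   # single-GPU deployment, highest throughput
        "datacenter": "Ultra",   # multi-GPU datacenter applications
    }
    if deployment not in tiers:
        raise ValueError(f"unknown deployment target: {deployment!r}")
    return tiers[deployment]
```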

Nemotron models excel at coding, math, scientific reasoning, tool calling, instruction following, and visual reasoning. Deploy across edge, single GPU, or data center environments with support for NeMo, TensorRT-LLM, vLLM, SGLang, and NIM microservices.


Training Recipes

The Nemotron repository provides reproducible training pipelines from raw data to deployment-ready models. These implementations reflect how large language models are actually trained: careful experimentation, validation gates, and systematic optimization.

Why Complete Pipelines?

Training a production model involves interconnected components. Isolated examples miss how stages interact. Complete pipelines show:

  • How data quality affects downstream performance across pretraining, SFT, and RL
  • Which training techniques actually work together, not just in theory
  • Where validation gates prevent failures and maintain reproducibility
  • How to balance competing objectives across stages

Because these are complete systems, you can extract specific techniques with confidence. Each component has been proven to work in context.
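The staged-pipeline idea with validation gates can be sketched in a few lines. This is a generic sketch, not the recipes' actual interfaces; the stage names and gate thresholds below are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Stage:
    name: str
    run: Callable[[dict], dict]     # transforms pipeline state (data, checkpoints, metrics)
    gate: Callable[[dict], bool]    # validation gate: pipeline halts if this fails

def run_pipeline(stages: list[Stage], state: dict) -> dict:
    """Run stages in order, enforcing each stage's validation gate."""
    for stage in stages:
        state = stage.run(state)
        if not stage.gate(state):
            raise RuntimeError(f"validation gate failed after {stage.name!r}")
    return state

# Hypothetical three-stage flow mirroring Pretrain -> SFT -> RL.
stages = [
    Stage("pretrain", lambda s: {**s, "loss": 2.1},   lambda s: s["loss"] < 3.0),
    Stage("sft",      lambda s: {**s, "loss": 1.4},   lambda s: s["loss"] < 2.0),
    Stage("rl",       lambda s: {**s, "reward": 0.8}, lambda s: s["reward"] > 0.5),
]
final = run_pipeline(stages, {})
```

The point of the gates is exactly what the list above describes: a regression at any stage stops the run instead of silently propagating into later, more expensive stages.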

Each Recipe Includes

  • 🎨 Synthetic Data Generation - Scripts to generate synthetic datasets using NVIDIA-NeMo/DataDesigner
  • 🗂️ Data Curation - Scripts to prepare training data using NVIDIA NeMo Curator for scalable data processing, filtering, and quality enhancement
  • 🔁 Training - Complete training loops with hyperparameters
  • 📊 Evaluation - Benchmark evaluation on standard suites using NVIDIA NeMo Evaluator
  • 📖 Documentation - Detailed explanations of each stage

Available Recipes

| Model | Description | Stages | Guide |
|-------|-------------|--------|-------|
| Nemotron 3 Super | 120.6B total / 12.7B active Hybrid Mamba Latent MoE Transformer for frontier reasoning, coding, and agentic tasks | Pretrain → SFT → RL | Training Guide |
| Nemotron 3 Nano | 31.6B total / 3.6B active MoE Hybrid Mamba-Transformer for agentic reasoning | Pretrain → SFT → RL | Training Guide |

Nemotron 3 Super

A complete training recipe for the frontier Hybrid Mamba Latent Mixture-of-Experts Transformer model with state-of-the-art reasoning, coding, and agentic capabilities.

Open-Source Data Only: These recipes train exclusively on the open-sourced subset of training data. Results will differ from the tech report benchmarks, which used additional proprietary data. Use these recipes as reference implementations to apply the methodology with your own data.

Model Specifications:

  • 120B total / 12B active parameters
  • Multi-stage RL pipeline: 3× RLVR + 2× SWE-RL + RLHF across 21 reward environments
  • Asynchronous GRPO with decoupled training and inference
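GRPO's core step, group-relative advantage estimation, is compact enough to show inline. This is a generic sketch of the standard GRPO formulation, not the recipe's implementation: each prompt's sampled completions form a group, and each reward is normalized against its own group's mean and standard deviation instead of a learned value function.

```python
import statistics

def grpo_advantages(group_rewards: list[float]) -> list[float]:
    """Group-relative advantages: normalize each completion's reward against
    the mean and population std of its own group (no learned critic)."""
    mu = statistics.mean(group_rewards)
    sigma = statistics.pstdev(group_rewards)
    if sigma == 0.0:
        return [0.0 for _ in group_rewards]  # all rewards equal: no learning signal
    return [(r - mu) / sigma for r in group_rewards]

# Four completions of one prompt, scored by a verifiable reward (RLVR-style).
advs = grpo_advantages([1.0, 0.0, 0.0, 1.0])
```

Because the advantages are centered within each group, they sum to zero: correct completions are pushed up exactly as much as incorrect ones are pushed down.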

What You Can Extract:

  • Large-scale pretraining with data curriculum
  • Multi-domain SFT pipeline
  • Multi-environment RLVR with 21 simultaneous reward environments
  • SWE-RL with container-isolated sandbox execution
  • GenRM-based RLHF with principle-following rewards
  • Asynchronous GRPO at 1K GPU scale

Resources:

Nemotron 3 Nano

A complete training recipe for the open, efficient Mixture-of-Experts hybrid Mamba-Transformer model optimized for agentic reasoning.

Open-Source Data Only: These recipes train exclusively on the open-sourced subset of training data. Results will differ from the tech report benchmarks, which used additional proprietary data. Use these recipes as reference implementations to apply the methodology with your own data.

Model Specifications:

  • 31.6B total parameters, 3.6B active per forward pass
  • 25 trillion pretraining tokens with curriculum learning
  • Up to 1M context length
  • 3.3x higher inference throughput than similarly sized models

What You Can Extract:

  • Curriculum-based pretraining with two-phase data mixture
  • Long-context extension via CPT methodology
  • Multi-domain SFT with 12+ data sources
  • InfinityByte cross-domain code synthesis
  • Tool-calling fine-tuning and budget-controlled reasoning
  • Multi-environment RLVR with GRPO
  • GenRM reward modeling with circular comparison
  • DPO for tool hallucination reduction
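The DPO objective mentioned for tool-hallucination reduction can be written down compactly. This is the standard DPO loss as a sketch with illustrative numbers; the recipe's actual preference-data construction is documented in the training guide.

```python
import math

def dpo_loss(pi_chosen: float, pi_rejected: float,
             ref_chosen: float, ref_rejected: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss: -log sigmoid(beta * (policy log-ratio margin over reference)).

    Inputs are sequence log-probabilities under the trained policy (pi_*)
    and the frozen reference model (ref_*).
    """
    margin = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# With no preference margin the loss is log 2; it shrinks as the policy
# favors the chosen (non-hallucinating) response more than the reference does.
```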

Resources:


Usage Cookbooks

Practical deployment and model usage guides for Nemotron models.

| Model | Best For | Key Features | Resources |
|-------|----------|--------------|-----------|
| Nemotron 3 Super 120B A12B | Production deployments needing strong reasoning | 1M context, NVFP4 on a single B200, RAG & tool calling | Cookbooks |
| Nemotron 3 Nano 30B A3B | Resource-constrained environments | 1M context, sparse MoE hybrid Mamba-2, controllable reasoning | Cookbooks |
| NVIDIA-Nemotron-Nano-12B-v2-VL | Document intelligence and video understanding | 12B VLM, video reasoning, Efficient Video Sampling | Cookbooks |
| Llama-3.1-Nemotron-Safety-Guard-8B-v3 | Multilingual content moderation | 9 languages, 23 safety categories | Cookbooks |
| Nemotron-Parse | Document parsing for RAG and AI agents | Table extraction, semantic segmentation | Cookbooks |
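These models are typically served behind an OpenAI-compatible chat endpoint (vLLM, SGLang, and NIM all expose one). A minimal sketch of building a request body follows; the model name and the `/think` / `/no_think` reasoning toggle are assumptions for illustration, so check the relevant cookbook and model card for the exact identifiers and controls.

```python
import json

def build_chat_request(model: str, user_msg: str, reasoning: bool = True) -> str:
    """Build an OpenAI-compatible /v1/chat/completions request body as JSON.

    The system-prompt reasoning toggle below is an assumption for illustration;
    consult the specific model's cookbook for its actual controls.
    """
    body = {
        "model": model,
        "messages": [
            {"role": "system", "content": "/think" if reasoning else "/no_think"},
            {"role": "user", "content": user_msg},
        ],
        "temperature": 0.6,
        "max_tokens": 1024,
    }
    return json.dumps(body)

# Hypothetical model identifier; substitute the one your endpoint serves.
payload = json.loads(
    build_chat_request("nvidia/nemotron-3-nano", "Summarize this PDF.", reasoning=False)
)
```

The same body works whether the endpoint is a local vLLM server or a hosted NIM microservice; only the base URL and API key differ.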


Use Case Examples

End-to-end examples demonstrating practical applications live in the `use-case-examples/` directory.
