NeMo Guardrails

LATEST RELEASE / DEVELOPMENT VERSION: The main branch tracks the latest released beta version: 0.21.0. For the latest development version, check out the develop branch.

✨✨✨

📌 The official NeMo Guardrails documentation has moved to docs.nvidia.com/nemo/guardrails.

✨✨✨

NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational applications. Guardrails (or "rails" for short) are specific ways of controlling the output of a large language model, such as not talking about politics, responding in a particular way to specific user requests, following a predefined dialog path, using a particular language style, extracting structured data, and more.

This paper introduces NeMo Guardrails and contains a technical overview of the system and the current evaluation.

Requirements

Python 3.10, 3.11, 3.12 or 3.13.

NeMo Guardrails uses annoy, which is a C++ library with Python bindings. To install NeMo Guardrails, you need a C++ compiler and dev tools installed. Check out the Installation Guide for platform-specific instructions.

Installation

To install using pip:

> pip install nemoguardrails

For more detailed instructions, see the Installation Guide.

Overview

<!-- start-documentation-reuse -->

NeMo Guardrails enables developers building LLM-based applications to easily add programmable guardrails between the application code and the LLM.

<div align="center"> <img src="https://github.com/NVIDIA-NeMo/Guardrails/raw/develop/docs/_static/images/programmable_guardrails.png" width="75%" alt="Programmable Guardrails"> </div>

Key benefits of adding programmable guardrails include:

  • Building Trustworthy, Safe, and Secure LLM-based Applications: you can define rails to guide and safeguard conversations; you can choose to define the behavior of your LLM-based application on specific topics and prevent it from engaging in discussions on unwanted topics.

  • Connecting models, chains, and other services securely: you can connect an LLM to other services (a.k.a. tools) seamlessly and securely.

  • Controllable dialog: you can steer the LLM to follow pre-defined conversational paths, allowing you to design the interaction following conversation design best practices and enforce standard operating procedures (e.g., authentication, support).

<!-- end-documentation-reuse -->

Protecting against LLM Vulnerabilities

NeMo Guardrails provides several mechanisms for protecting an LLM-powered chat application against common LLM vulnerabilities, such as jailbreaks and prompt injections. Below is a sample overview of the protection offered by different guardrails configurations for the example ABC Bot included in this repository. For more details, please refer to the LLM Vulnerability Scanning page.

<div align="center"> <img src="https://github.com/NVIDIA-NeMo/Guardrails/raw/develop/docs/_static/images/abc-llm-vulnerability-scan-results.png" width="500"> </div>

Use Cases

You can use programmable guardrails in different types of use cases:

  1. Question Answering over a set of documents (a.k.a. Retrieval Augmented Generation): Enforce fact-checking and output moderation.
  2. Domain-specific Assistants (a.k.a. chatbots): Ensure the assistant stays on topic and follows the designed conversational flows.
  3. LLM Endpoints: Add guardrails to your custom LLM for safer customer interaction.
  4. LangChain Chains: If you use LangChain for any use case, you can add a guardrails layer around your chains.

Usage

To add programmable guardrails to your application you can use the Python API or a guardrails server (see the Server Guide for more details). Using the Python API is similar to using the LLM directly. Calling the guardrails layer instead of the LLM requires only minimal changes to the code base, and it involves two simple steps:

  1. Loading a guardrails configuration and creating an LLMRails instance.
  2. Making the calls to the LLM using the generate/generate_async methods.

from nemoguardrails import LLMRails, RailsConfig

# Load a guardrails configuration from the specified path.
config = RailsConfig.from_path("PATH/TO/CONFIG")
rails = LLMRails(config)

completion = rails.generate(
    messages=[{"role": "user", "content": "Hello world!"}]
)

Sample output:

{"role": "assistant", "content": "Hi! How can I help you?"}

The input and output format for the generate method is similar to the Chat Completions API from OpenAI.
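As a minimal sketch of what that message format implies for a multi-turn conversation, the snippet below keeps a history list and appends each reply to it. `CannedRails` and `chat_turn` are hypothetical names introduced here for illustration; `CannedRails` is only a stand-in so the snippet runs without a configured LLM, and in a real application you would pass an `LLMRails` instance instead. It assumes the returned message can be appended to the history as is, following the Chat Completions convention.

```python
class CannedRails:
    """Hypothetical stand-in for LLMRails; returns a numbered canned reply."""

    def generate(self, messages):
        # Each completed turn adds two messages (user + assistant).
        turn = len(messages) // 2 + 1
        return {"role": "assistant", "content": f"Reply #{turn}"}


def chat_turn(rails, history, user_text):
    """Append the user message, call the rails, then append and return the reply."""
    history.append({"role": "user", "content": user_text})
    reply = rails.generate(messages=history)
    history.append(reply)
    return reply["content"]


history = []
rails = CannedRails()
print(chat_turn(rails, history, "Hello world!"))
print(chat_turn(rails, history, "Tell me more."))
print([m["role"] for m in history])
```

Keeping the history in the caller makes the conversation state explicit: every call sends the full list of messages, just as with a Chat Completions style endpoint.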

Async API

NeMo Guardrails is an async-first toolkit as the core mechanics are implemented using the Python async model. The public methods have both a sync and an async version. For example: LLMRails.generate and LLMRails.generate_async.
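One practical consequence of the async variant is that several requests can be awaited concurrently. The sketch below illustrates this with `asyncio.gather`. `EchoRails`, `ask`, and `ask_many` are hypothetical names introduced for illustration; `EchoRails` stands in for an `LLMRails` instance so the snippet runs without an LLM, and it assumes `generate_async` accepts the same `messages` argument as `generate`.

```python
import asyncio


class EchoRails:
    """Hypothetical stand-in for LLMRails so the sketch runs without an LLM."""

    async def generate_async(self, messages):
        last_user = messages[-1]["content"]
        return {"role": "assistant", "content": f"You said: {last_user}"}


async def ask(rails, text):
    """Send one user message through the rails and return the reply message."""
    return await rails.generate_async(
        messages=[{"role": "user", "content": text}]
    )


async def ask_many(rails, texts):
    """Issue several requests concurrently; results keep the input order."""
    return await asyncio.gather(*(ask(rails, t) for t in texts))


replies = asyncio.run(ask_many(EchoRails(), ["Hello world!", "Hi again"]))
for reply in replies:
    print(reply["content"])
```

Inside code that already runs in an event loop (for example, an async web handler), you would `await ask(...)` directly rather than calling `asyncio.run`.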

Supported LLMs

You can use NeMo Guardrails with multiple LLMs like OpenAI GPT-3.5, GPT-4, LLaMa-2, Falcon, Vicuna, or Mosaic. For more details, check out the Supported LLM Models section in the Configuration Guide.

Types of Guardrails

NeMo Guardrails supports five main types of guardrails:

<div align="center"> <img src="https://github.com/NVIDIA-NeMo/Guardrails/raw/develop/docs/_static/images/programmable_guardrails_flow.png" width="75%" alt="Programmable Guardrails Flow"> </div>

  1. Input rails: applied to the input from the user; an input rail can reject the input, stopping any additional processing, or alter the input (e.g., to mask potentially sensitive data, to rephrase).

  2. Dialog rails: influence how the LLM is prompted; dialog rails operate on canonical form messages (for details, see the Colang Guide) and determine whether an action should be executed, whether the LLM should be invoked to generate the next step or a response, whether a predefined response should be used instead, etc.

  3. Retrieval rails: applied to the retrieved chunks in the case of a RAG (Retrieval Augmented Generation) scenario; a retrieval rail can reject a chunk, preventing it from being used to prompt the LLM, or alter the relevant chunks (e.g., to mask potentially sensitive data).

  4. Execution rails: applied to the input/output of the custom actions (a.k.a. tools) that need to be called by the LLM.

  5. Output rails: applied to the output generated by the LLM; an output rail can reject the output, preventing it from being returned to the user, or alter it (e.g., removing sensitive data).
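To make the reject-or-alter behavior of input and output rails concrete, here is a conceptual sketch in plain Python. It is not how NeMo Guardrails implements rails (those are defined in a guardrails configuration); every name in it (`block_politics`, `mask_emails`, `run_rails`, `guarded_generate`, `fake_llm`, `REFUSAL`) is hypothetical and chosen only for illustration. A rail is modeled as a function that returns the (possibly altered) text, or `None` to reject it.

```python
import re
from typing import Callable, List, Optional

Rail = Callable[[str], Optional[str]]

REFUSAL = "I'm sorry, I can't respond to that."
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def block_politics(text: str) -> Optional[str]:
    """Rejecting rail: return None when the text mentions politics."""
    return None if "politics" in text.lower() else text


def mask_emails(text: str) -> Optional[str]:
    """Altering rail: replace e-mail addresses with a placeholder."""
    return EMAIL.sub("<EMAIL>", text)


def run_rails(rails: List[Rail], text: str) -> Optional[str]:
    """Apply rails in order; stop at the first rejection (None)."""
    for rail in rails:
        result = rail(text)
        if result is None:
            return None
        text = result
    return text


def guarded_generate(llm, user_input, input_rails, output_rails):
    """Run input rails, call the model, then run output rails."""
    checked = run_rails(input_rails, user_input)
    if checked is None:
        return REFUSAL  # a rejected input never reaches the LLM
    answer = run_rails(output_rails, llm(checked))
    return REFUSAL if answer is None else answer


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call."""
    return f"Echo: {prompt}"


masked = guarded_generate(
    fake_llm, "Mail me at bob@example.com", [block_politics, mask_emails], [mask_emails]
)
blocked = guarded_generate(fake_llm, "Let's talk politics", [block_politics], [])
print(masked)
print(blocked)
```

Dialog, retrieval, and execution rails sit at other points of the same flow (prompt construction, retrieved chunks, and action calls respectively) and are left out of this sketch.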

Guardrails Configuration

A guardrails configuration defines the LLM(s) to be used and one or more guardrails. A guardrails configuration can include any number of input/dialog/output/retrieval/execution rails. A configuration without any configured rails will essentially forward the requests to the LLM.

The standard structure for a guardrails configuration folder looks like this:

.
├── config
│   ├── actions.py
│   ├── config.py
│   ├── config.yml
│   ├── rails.co
│   ├── ...

The config.yml contains all the general configuration options, such as LLM models, active rails, and custom configuration data. The config.py file contains any custom initialization code, and actions.py contains any custom Python actions. For a complete overview, see the Configuration Guide.
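As a small convenience sketch, the layout above can be scaffolded and checked with the standard library. `scaffold`, `missing_files`, and `CONFIG_FILES` are hypothetical helpers introduced here, not part of NeMo Guardrails. The files are created empty, as placeholders only; they still need real content before the folder is a usable configuration. The description for `rails.co` is an addition (the text above does not describe it; `.co` is the Colang file extension).

```python
import tempfile
from pathlib import Path

# File names come from the layout above; descriptions paraphrase the text,
# except rails.co, which the text does not describe.
CONFIG_FILES = {
    "config.yml": "general options: LLM models, active rails, custom data",
    "config.py": "custom initialization code",
    "actions.py": "custom Python actions",
    "rails.co": "Colang rail definitions",
}


def scaffold(root: Path) -> Path:
    """Create an empty config/ skeleton under root and return its path."""
    config_dir = root / "config"
    config_dir.mkdir(parents=True, exist_ok=True)
    for name in CONFIG_FILES:
        (config_dir / name).touch()
    return config_dir


def missing_files(config_dir: Path) -> list[str]:
    """List expected files that are absent from config_dir."""
    return sorted(n for n in CONFIG_FILES if not (config_dir / n).is_file())


with tempfile.TemporaryDirectory() as tmp:
    config_dir = scaffold(Path(tmp))
    created = sorted(p.name for p in config_dir.iterdir())
    missing = missing_files(config_dir)
    print(created)
    print(missing)
```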

Below is an example config.yml:

`
