Superagent

Superagent protects your AI applications against prompt injections, data leaks, and harmful outputs. Embed safety directly into your app and prove compliance to your customers.

Generate Convert Improve

Install / Use

/learn @superagent-ai/Superagent

About this skill

Quality Score

0/100

README

<img src="logo.png" width="80" alt="Superagent" /> <h1 align="center">Superagent SDK</h1> Make your AI apps safe. <a href="https://superagent.sh">Website</a> · <a href="https://docs.superagent.sh">Docs</a> · <a href="https://discord.gg/spZ7MnqFT4">Discord</a> · <a href="https://huggingface.co/superagent-ai">HuggingFace</a> <img src="https://img.shields.io/badge/Y%20Combinator-Backed-orange" alt="Y Combinator" /> <img src="https://img.shields.io/github/stars/superagent-ai/superagent?style=social" alt="GitHub stars" /> <img src="https://img.shields.io/badge/license-MIT-blue" alt="MIT License" />

An open-source SDK for AI agent safety. Block prompt injections, redact PII and secrets, scan repositories for threats, and run red team scenarios against your agent.

Features

Guard

Detect and block prompt injections, malicious instructions, and unsafe tool calls at runtime.

TypeScript:

import { createClient } from "safety-agent";

const client = createClient();

const result = await client.guard({
  input: userMessage
});

if (result.classification === "block") {
  console.log("Blocked:", result.violation_types);
}

Python:

from safety_agent import create_client

client = create_client()

result = await client.guard(input=user_message)

if result.classification == "block":
    print("Blocked:", result.violation_types)

Redact

Remove PII, PHI, and secrets from text automatically.

TypeScript:

const result = await client.redact({
  input: "My email is john@example.com and SSN is 123-45-6789",
  model: "openai/gpt-4o-mini"
});

console.log(result.redacted);
// "My email is <EMAIL_REDACTED> and SSN is <SSN_REDACTED>"

Python:

result = await client.redact(
    input="My email is john@example.com and SSN is 123-45-6789",
    model="openai/gpt-4o-mini"
)

print(result.redacted)
# "My email is <EMAIL_REDACTED> and SSN is <SSN_REDACTED>"

Scan

Analyze repositories for AI agent-targeted attacks such as repo poisoning and malicious instructions.

TypeScript:

const result = await client.scan({
  repo: "https://github.com/user/repo"
});

console.log(result.result);  // Security report
console.log(`Cost: $${result.usage.cost.toFixed(4)}`);

Python:

result = await client.scan(repo="https://github.com/user/repo")

print(result.result)  # Security report
print(f"Cost: ${result.usage.cost:.4f}")

Test

Run red team scenarios against your production agent. (Coming soon)

const result = await client.test({
  endpoint: "https://your-agent.com/chat",
  scenarios: ["prompt_injection", "data_exfiltration"]
});

console.log(result.findings);  // Vulnerabilities discovered

Get Started

TypeScript:

npm install safety-agent

Python:

uv add safety-agent

Set your API key:

export SUPERAGENT_API_KEY=your-key

Integration Options

| Option | Description | Link | |--------|-------------|------| | TypeScript SDK | Embed guard, redact, and scan directly in your app | sdk/typescript | | Python SDK | Embed guard, redact, and scan directly in Python apps | sdk/python | | CLI | Command-line tool for testing and automation | cli | | MCP Server | Use with Claude Code and Claude Desktop | mcp |

Why Superagent SDK?

Works with any model — OpenAI, Anthropic, Google, Groq, Bedrock, and more
Open-weight models — Run Guard on your infrastructure with 50-100ms latency
Low latency — Optimized for runtime use
Open source — MIT license with full transparency

Open-Weight Models

Run Guard on your own infrastructure. No API calls, no data leaving your environment.

| Model | Parameters | Use Case | |-------|------------|----------| | superagent-guard-0.6b | 0.6B | Fast inference, edge deployment | | superagent-guard-1.7b | 1.7B | Balanced speed and accuracy | | superagent-guard-4b | 4B | Maximum accuracy |

GGUF versions for CPU: 0.6b-gguf · 1.7b-gguf · 4b-gguf

Resources

License

MIT

Related Skills

healthcheck

330.7k

Host security hardening and risk-tolerance configuration for OpenClaw deployments

prose

330.7k

OpenProse VM skill pack. Activate on any `prose` command, .prose files, or OpenProse mentions; orchestrates multi-agent workflows.

Writing Hookify Rules

81.4k

This skill should be used when the user asks to "create a hookify rule", "write a hook rule", "configure hookify", "add a hookify rule", or needs guidance on hookify rule syntax and patterns.

Agent Development

81.4k

This skill should be used when the user asks to "create an agent", "add an agent", "write a subagent", "agent frontmatter", "when to use description", "agent examples", "agent tools", "agent colors", "autonomous agent", or needs guidance on agent structure, system prompts, triggering conditions, or agent development best practices for Claude Code plugins.

superagent-ai

View profile

View on GitHub

GitHub Stars6.5k

CategoryLegal

Updated4h ago

Forks959

superagent-ai/superagent

Languages

TypeScript

Security Score

100/100

Audited on Mar 23, 2026

No findings