Cloakpipe

Privacy middleware for LLM & RAG pipelines - consistent pseudonymization, encrypted vault, SSE streaming rehydration.


Install / Use

/learn @rohansx/Cloakpipe

README

🔒 CloakPipe

Privacy proxy for LLM traffic. Detect, mask, and unmask PII in real-time.

Rust-native · <5ms latency · 30+ entity types · OpenAI-compatible · Local-first

Website · Docs · Cloud Dashboard · Discord


What is CloakPipe?

CloakPipe is a high-performance privacy proxy that sits between your application and any LLM API. It detects PII (personally identifiable information) in your prompts, replaces it with safe tokens, forwards the sanitized request to the LLM, and restores the original values in the response.

The LLM never sees your real data. Your users see natural responses.

Your App  ──▶  CloakPipe  ──▶  OpenAI / Anthropic / Any LLM
                  │
          Detect → Mask → Proxy → Unmask
                  │
           Encrypted Vault
          (AES-256-GCM)

Quick Start

Docker (recommended)

# Start CloakPipe
docker run -p 3100:3100 ghcr.io/cloakpipe/cloakpipe:latest

# Point your OpenAI SDK at CloakPipe
export OPENAI_BASE_URL=http://localhost:3100/v1

# Done. All LLM calls now go through CloakPipe.

Binary

# Install via cargo
cargo install cloakpipe

# Or download the latest release
curl -fsSL https://cloakpipe.co/install.sh | sh

# Start the proxy
cloakpipe serve --port 3100

Verify it works

curl http://localhost:3100/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {"role": "user", "content": "Summarize the case for Rajesh Singh, Aadhaar 2345 6789 0123, treated at Apollo Hospital Mumbai."}
    ]
  }'

# CloakPipe logs:
# ✓ Detected 3 entities: PERSON, AADHAAR, ORGANIZATION
# ✓ Masked: Rajesh Singh → PERSON_042, 2345 6789 0123 → AADHAAR_017, Apollo Hospital Mumbai → ORG_003
# ✓ Proxied to api.openai.com (sanitized)
# ✓ Unmasked response: PERSON_042 → Rajesh Singh (restored)

Before & After

What your app sends:

Summarize the medical history of Dr. Rajesh Singh (Aadhaar: 2345 6789 0123), treated at Apollo Hospital Mumbai for cardiac issues since March 2024.

What the LLM sees:

Summarize the medical history of PERSON_042 (Aadhaar: AADHAAR_017), treated at ORG_003 for cardiac issues since DATE_012.

What your user gets back:

Dr. Rajesh Singh has been under cardiac care at Apollo Hospital Mumbai since March 2024. The treatment history includes...

The LLM generates a coherent response using the tokens, and CloakPipe restores the original values before returning the response to your app. The model never sees the real data.
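The round trip above can be sketched in a few lines of Python. This is an illustration only, not CloakPipe's implementation: `mask`, `unmask`, and the hard-coded `ENTITIES` map are hypothetical names, and detection is assumed to have already happened.

```python
import re

# Hypothetical entity -> token map, as if detection had already run.
ENTITIES = {
    "Dr. Rajesh Singh": "PERSON_042",
    "2345 6789 0123": "AADHAAR_017",
    "Apollo Hospital Mumbai": "ORG_003",
    "March 2024": "DATE_012",
}

# Tokens in this sketch look like LABEL_NNN (an assumption based on the examples above).
TOKEN_RE = re.compile(r"\b[A-Z]+_\d{3}\b")


def mask(text: str, entities: dict) -> tuple:
    """Replace each entity with its token; return the masked text and a token -> entity map."""
    reverse = {}
    # Longest entities first, so a short entity never overwrites part of a longer one.
    for original in sorted(entities, key=len, reverse=True):
        token = entities[original]
        text = text.replace(original, token)
        reverse[token] = original
    return text, reverse


def unmask(text: str, reverse: dict) -> str:
    """Swap every known token back to its original value; unknown tokens are left alone."""
    return TOKEN_RE.sub(lambda m: reverse.get(m.group(0), m.group(0)), text)


prompt = (
    "Summarize the medical history of Dr. Rajesh Singh (Aadhaar: 2345 6789 0123), "
    "treated at Apollo Hospital Mumbai for cardiac issues since March 2024."
)
masked, reverse = mask(prompt, ENTITIES)
print(masked)

# Stand-in for the LLM reply, which only ever contains tokens.
llm_reply = "PERSON_042 has been under cardiac care at ORG_003 since DATE_012."
print(unmask(llm_reply, reverse))
```

The reverse map is the only place the real values live, which is why the README describes storing it in an encrypted vault.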

Why CloakPipe?

| | CloakPipe | Presidio | Protecto | LLMGuard |
|---|---|---|---|---|
| Language | Rust | Python | Python | Python |
| Latency | <5ms | 50–200ms | 50–200ms | 50–200ms |
| Mode | Drop-in proxy | Library | Cloud SaaS | Library |
| Reversible masking | ✅ Encrypted vault | ❌ Permanent redaction | ✅ Cloud vault | ❌ Permanent |
| India PII | ✅ Aadhaar, PAN, UPI | ❌ | Partial | ❌ |
| Self-hosted | ✅ Single binary | ✅ | Partial | ✅ |
| MCP support | ✅ (via Cloud) | ❌ | ❌ | ❌ |
| Price | Free (open source) | Free | $$$$ | Free |
| Dependencies | 0 (single binary) | Python + spaCy | Python + cloud | Python + PyTorch |

How It Works

Detection Pipeline

CloakPipe uses a three-layer detection system for speed and accuracy:

Input Text
    │
    ▼
┌─────────────────────────────────────┐
│  Layer 1: Regex Pre-Filter          │  <1ms
│  Aadhaar, PAN, email, phone,       │
│  credit card, SSN, IP address       │
│  Catches ~60% of PII instantly      │
├─────────────────────────────────────┤
│  Layer 2: ONNX NER Model           │  ~3ms
│  GLiNER2 transformer-based NER     │
│  Context-aware: names, orgs,       │
│  medical terms, addresses           │
├─────────────────────────────────────┤
│  Layer 3: Fuzzy Entity Resolution   │  <1ms
│  Jaro-Winkler similarity matching  │
│  Links "Dr. R. Singh" and          │
│  "Rajesh Singh" as same entity      │
└─────────────────────────────────────┘
    │
    ▼
Masked Output (total: <5ms)
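To make Layer 3 more concrete, here is a minimal pure-Python sketch of Jaro-Winkler similarity with a threshold check. The `normalize` and `same_entity` helpers, the honorific list, and the 0.9 threshold are assumptions for illustration. This README does not describe how CloakPipe resolves initials such as "Dr. R. Singh", so the sketch only covers near-identical spellings.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only if they are within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order in the two strings.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1) -> float:
    """Jaro similarity boosted by the length of the common prefix (up to 4 characters)."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return j + prefix * p * (1 - j)


HONORIFICS = ("dr.", "mr.", "mrs.", "ms.")  # hypothetical list


def normalize(name: str) -> str:
    """Lowercase the name and drop honorifics before comparing."""
    return " ".join(w for w in name.lower().split() if w not in HONORIFICS)


def same_entity(a: str, b: str, threshold: float = 0.9) -> bool:
    """Treat two mentions as one entity when their similarity clears the threshold."""
    return jaro_winkler(normalize(a), normalize(b)) >= threshold


print(round(jaro_winkler("MARTHA", "MARHTA"), 4))    # 0.9611
print(same_entity("Dr. Rajesh Singh", "Rajesh Sing"))  # True
print(same_entity("Rajesh Singh", "Priya Sharma"))     # False
```

Linking two mentions matters because both should then receive the same token, so the LLM sees one consistent entity.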

Tokenization

Tokens are deterministic within a session — the same entity always maps to the same token. This means the LLM maintains coherence across the conversation.

Tokens are deliberately not stable across sessions: the same entity maps to a different token in a new session, which prevents cross-session correlation.
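One way to get both properties is to derive tokens from a keyed hash with a fresh random key per session. The sketch below assumes that approach. `SessionTokenizer` is a hypothetical name, and the hex-suffix token format differs from the `PERSON_042` style shown earlier. The README does not say how CloakPipe derives its tokens.

```python
import hashlib
import hmac
import secrets
from typing import Optional


class SessionTokenizer:
    """Maps (label, value) pairs to tokens that are stable only within one session."""

    def __init__(self, session_key: Optional[bytes] = None):
        # A fresh random key per session means tokens cannot be correlated across sessions.
        self._key = session_key or secrets.token_bytes(32)

    def token_for(self, label: str, value: str) -> str:
        message = f"{label}:{value}".encode("utf-8")
        digest = hmac.new(self._key, message, hashlib.sha256).hexdigest()
        # Truncated for readability; a real system would also need to handle collisions.
        return f"{label}_{digest[:8]}"


session_a = SessionTokenizer()
session_b = SessionTokenizer()

# Same session, same entity: identical token every time.
print(session_a.token_for("PERSON", "Rajesh Singh") == session_a.token_for("PERSON", "Rajesh Singh"))

# New session: the same entity gets an unrelated token.
print(session_a.token_for("PERSON", "Rajesh Singh"), session_b.token_for("PERSON", "Rajesh Singh"))
```

A counter-based scheme (`PERSON_001`, `PERSON_002`, ...) would also be stable within a session, but it needs extra randomization to avoid producing the same tokens in every session.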

Encrypted Vault

All entity ↔ token mappings are stored in a local vault encrypted with AES-256-GCM. The vault never leaves your infrastructure. There is no cloud dependency.

Supported Entity Types

Standard PII

| Entity | Example | Detection |
|---|---|---|
| Person Name | John Smith, Dr. Priya Sharma | NER |
| Email Address | user@example.com | Regex |
| Phone Number | +1-555-0123, +91 98765 43210 | Regex |
| Credit Card | 4532-1234-5678-9012 | Regex + Luhn |
| SSN | 123-45-6789 | Regex |
| Date of Birth | 15/03/1990, March 15, 1990 | NER |
| Address | 123 MG Road, Pune 411001 | NER |
| IP Address | 192.168.1.1, 2001:db8::1 | Regex |
| Organization | Apollo Hospital, HDFC Bank | NER |
| Medical Term | diabetes, cardiac arrest | NER |
| Bank Account | IFSC + account number | Regex |
| Passport Number | J1234567 | Regex |
| License Plate | MH 12 AB 1234 | Regex |
| URL | https://internal.company.com | Regex |
| API Key | sk-live_xxx, AKIA... | Regex |
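The "Regex + Luhn" entry for credit cards refers to the Luhn checksum, which rejects digit runs that merely look like card numbers. A minimal sketch follows. The function name `luhn_valid` is an assumption, and the sample number is a widely used test card number, not one from this README.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]  # ignore spaces and dashes
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


print(luhn_valid("4111 1111 1111 1111"))  # True
print(luhn_valid("4111 1111 1111 1112"))  # False
```

Running the checksum after the regex match reduces false positives such as order numbers or long numeric IDs.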

India-Specific PII 🇮🇳

| Entity | Format | Example |
|---|---|---|
| Aadhaar Number | 12 digits (XXXX XXXX XXXX) | 2345 6789 0123 |
| PAN Card | ABCDE1234F | BNZPM2501F |
| UPI ID | name@bank | rajesh@okicici |
| Indian Phone | +91 XXXXX XXXXX | +91 98765 43210 |
| GSTIN | 15-char alphanumeric | 27AAPFU0939F1ZV |
| Indian Passport | Letter + 7 digits | J1234567 |
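As a rough illustration, three of these formats can be matched with plain regular expressions. The patterns below are assumptions derived only from the Format column above, not CloakPipe's actual rules. They check shape only and do not validate checksums.

```python
import re

# Hypothetical format-only patterns based on the table above.
INDIA_PATTERNS = {
    "AADHAAR": re.compile(r"\b\d{4}\s?\d{4}\s?\d{4}\b"),  # 12 digits, optionally grouped 4-4-4
    "PAN": re.compile(r"\b[A-Z]{5}\d{4}[A-Z]\b"),         # 5 letters, 4 digits, 1 letter
    "GSTIN": re.compile(r"\b[0-9A-Z]{15}\b"),             # 15 uppercase alphanumerics
}


def find_india_pii(text: str) -> list:
    """Return (label, matched_text) pairs in the order they appear in `text`."""
    hits = []
    for label, pattern in INDIA_PATTERNS.items():
        for m in pattern.finditer(text):
            hits.append((m.start(), label, m.group(0)))
    hits.sort()
    return [(label, value) for _, label, value in hits]


sample = "PAN BNZPM2501F, Aadhaar 2345 6789 0123, GSTIN 27AAPFU0939F1ZV"
for label, value in find_india_pii(sample):
    print(label, value)
```

The word boundaries keep the PAN pattern from firing on the PAN-shaped substring inside a GSTIN.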

Neither of the open-source tools in the comparison above (Presidio, LLMGuard) handles Indian PII natively.

Integration Examples

OpenAI Python SDK

from openai import OpenAI

# Just change the base URL. That's it.
client = OpenAI(
    base_url="http://localhost:3100/v1",  # CloakPipe proxy
    api_key="sk-your-openai-key"          # Your real API key
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content": "Analyze the account for Priya Sharma, PAN BNZPM2501F"}
    ]
)

# CloakPipe detected PAN and person name, masked them,
# sent sanitized prompt to OpenAI, and unmasked the response.
print(response.choices[0].message.content)

LangChain

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4",
    openai_api_base="http://localhost:3100/v1",  # CloakPipe proxy
    openai_api_key="sk-your-key"
)

response = llm.invoke("Summarize patient records for Aadhaar 2345 6789 0123")

Anthropic SDK

from anthropic import Anthropic

client = Anthropic(
    base_url="http://localhost:3100/v1/anthropic",  # CloakPipe proxy
    api_key="sk-ant-your-key"
)

message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Review the loan application for Amit Patel, PAN ABCDE1234F"}
    ]
)

curl

# Works with any LLM API that uses the OpenAI format
curl http://localhost:3100/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Your prompt with PII here"}]
  }'

Vercel AI SDK

import { createOpenAI } from '@ai-sdk/openai';
import { generateText } from 'ai';

// The base URL is a provider-level setting, so create a provider instance with it
const openai = createOpenAI({
  baseURL: 'http://localhost:3100/v1',  // CloakPipe proxy
});

const result = await generateText({
  model: openai('gpt-4'),
  prompt: 'Analyze the customer data for Rajesh, Aadhaar 2345 6789 0123',
});

CLI

# Scan text for PII (no proxy, just detection)
cloakpipe scan "Dr. Rajesh Singh, Aadhaar 2345 6789 0123"
# Output:
# ✓ PERSON: "Dr. Rajesh Singh" (confidence: 0.97)
# ✓ AADHAAR: "2345 6789 0123" (confidence: 1.00)

# Mask text (replace PII with tokens)
cloakpipe mask "Contact Priya at priya@example.com or +91 98765 43210"
# Output: "Contact PERSON_001 at EMAIL_001 or PHONE_001"

# Start the proxy server
cloakpipe serve --port 3100

# Start with a specific policy
cloakpipe serve --port 3100 --policy policies/dpdp.yaml

# Check proxy health
cloakpipe health

Configuration

Environment Variables

# Proxy settings
CLOAKPIPE_PORT=3100                    # Proxy port (default: 3100)
CLOAKPIPE_HOST=0.0.0.0                # Bind address (default: 0.0.0.0)
CLOAKPIPE_LOG_LEVEL=info               # Log level: debug, info, warn, error

# LLM provider
CLOAKPIPE_UPSTREAM_URL=https://api.openai.com  # Default upstream LLM API
CLOAKPIPE_TIMEOUT=30                   # Request timeout in seconds

# Detection
CLOAKPIPE_POLICY=policies/dpdp.yaml   # Policy file path
CLOAKPIPE_MIN_CONFIDENCE=0.8          # Minimum NER confidence threshold (0.0–1.0)

# Vault
CLOAKPIPE_VAULT_PATH=./vault.db       # Encrypted vault file path
CLOAKPIPE_VAULT_KEY=                   # 256-bit encryption key (auto-generated if empty)

# Cloud (optional, for dashboard users)
CLOAKPIPE_CLOUD_TOKEN=                 # Cloud dashboard token (app.cloakpipe.co)

Policy Files

CloakPipe uses YAML policy files to configure detection behavior per compliance framework:

# policies/dpdp.yaml — India Digital Personal Data Protection Act
