Cloakpipe
Privacy middleware for LLM & RAG pipelines - consistent pseudonymization, encrypted vault, SSE streaming rehydration.
🔒 CloakPipe
Privacy proxy for LLM traffic. Detect, mask, and unmask PII in real-time.
Rust-native · <5ms latency · 30+ entity types · OpenAI-compatible · Local-first
What is CloakPipe?
CloakPipe is a high-performance privacy proxy that sits between your application and any LLM API. It detects PII (personally identifiable information) in your prompts, replaces it with safe tokens, forwards the sanitized request to the LLM, and restores the original values in the response.
The LLM never sees your real data. Your users see natural responses.
Your App ──▶ CloakPipe ──▶ OpenAI / Anthropic / Any LLM
                 │
  Detect → Mask → Proxy → Unmask
                 │
          Encrypted Vault
           (AES-256-GCM)
Quick Start
Docker (recommended)
# Start CloakPipe
docker run -p 3100:3100 ghcr.io/cloakpipe/cloakpipe:latest
# Point your OpenAI SDK at CloakPipe
export OPENAI_BASE_URL=http://localhost:3100/v1
# Done. All LLM calls now go through CloakPipe.
Binary
# Install via cargo
cargo install cloakpipe
# Or download the latest release
curl -fsSL https://cloakpipe.co/install.sh | sh
# Start the proxy
cloakpipe serve --port 3100
Verify it works
curl http://localhost:3100/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-d '{
"model": "gpt-4",
"messages": [
{"role": "user", "content": "Summarize the case for Rajesh Singh, Aadhaar 2345 6789 0123, treated at Apollo Hospital Mumbai."}
]
}'
# CloakPipe logs:
# ✓ Detected 3 entities: PERSON, AADHAAR, ORGANIZATION
# ✓ Masked: Rajesh Singh → PERSON_042, 2345 6789 0123 → AADHAAR_017, Apollo Hospital Mumbai → ORG_003
# ✓ Proxied to api.openai.com (sanitized)
# ✓ Unmasked response: PERSON_042 → Rajesh Singh (restored)
Before & After
What your app sends:
Summarize the medical history of Dr. Rajesh Singh (Aadhaar: 2345 6789 0123), treated at Apollo Hospital Mumbai for cardiac issues since March 2024.
What the LLM sees:
Summarize the medical history of PERSON_042 (Aadhaar: AADHAAR_017), treated at ORG_003 for cardiac issues since DATE_012.
What your user gets back:
Dr. Rajesh Singh has been under cardiac care at Apollo Hospital Mumbai since March 2024. The treatment history includes...
The LLM generates a coherent response using the tokens. CloakPipe restores the original values before returning to your app. The model never saw the real data.
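The round trip above can be sketched in a few lines of Python. This is only an illustration of the mask and unmask steps, not CloakPipe's implementation: the `VAULT` dict, the `mask` and `unmask` helpers, and the token pattern are all hypothetical, and a plain dict stands in for the encrypted vault.

```python
import re

# Hypothetical token -> original mapping that mirrors the example above.
# A plain dict stands in for the encrypted vault.
VAULT = {
    "PERSON_042": "Dr. Rajesh Singh",
    "AADHAAR_017": "2345 6789 0123",
    "ORG_003": "Apollo Hospital Mumbai",
    "DATE_012": "March 2024",
}

# Assumed token shape: uppercase type, underscore, three digits.
TOKEN_RE = re.compile(r"\b[A-Z]+_\d{3}\b")


def mask(text: str, vault: dict[str, str]) -> str:
    """Replace each known original value with its token."""
    # Longest values first, so a short value cannot clobber part of a longer one.
    for token, value in sorted(vault.items(), key=lambda kv: -len(kv[1])):
        text = text.replace(value, token)
    return text


def unmask(text: str, vault: dict[str, str]) -> str:
    """Restore original values; tokens missing from the vault are left untouched."""
    return TOKEN_RE.sub(lambda m: vault.get(m.group(0), m.group(0)), text)


prompt = (
    "Summarize the medical history of Dr. Rajesh Singh (Aadhaar: 2345 6789 0123), "
    "treated at Apollo Hospital Mumbai for cardiac issues since March 2024."
)
sanitized = mask(prompt, VAULT)   # what the LLM would see
restored = unmask(sanitized, VAULT)  # what the user would get back
```

Leaving unknown tokens untouched is a deliberate choice in this sketch: if the model invents a token-shaped string, it passes through unchanged instead of raising an error.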
Why CloakPipe?
| | CloakPipe | Presidio | Protecto | LLMGuard |
|---|---|---|---|---|
| Language | Rust | Python | Python | Python |
| Latency | <5ms | 50–200ms | 50–200ms | 50–200ms |
| Mode | Drop-in proxy | Library | Cloud SaaS | Library |
| Reversible masking | ✅ Encrypted vault | ❌ Permanent redaction | ✅ Cloud vault | ❌ Permanent |
| India PII | ✅ Aadhaar, PAN, UPI | ❌ | Partial | ❌ |
| Self-hosted | ✅ Single binary | ✅ | Partial | ✅ |
| MCP support | ✅ (via Cloud) | ❌ | ❌ | ❌ |
| Price | Free (open source) | Free | $$$$ | Free |
| Dependencies | 0 (single binary) | Python + spaCy | Python + cloud | Python + PyTorch |
How It Works
Detection Pipeline
CloakPipe uses a three-layer detection system for speed and accuracy:
              Input Text
                   │
                   ▼
┌─────────────────────────────────────┐
│ Layer 1: Regex Pre-Filter │ <1ms
│ Aadhaar, PAN, email, phone, │
│ credit card, SSN, IP address │
│ Catches ~60% of PII instantly │
├─────────────────────────────────────┤
│ Layer 2: ONNX NER Model │ ~3ms
│ GLiNER2 transformer-based NER │
│ Context-aware: names, orgs, │
│ medical terms, addresses │
├─────────────────────────────────────┤
│ Layer 3: Fuzzy Entity Resolution │ <1ms
│ Jaro-Winkler similarity matching │
│ Links "Dr. R. Singh" and │
│ "Rajesh Singh" as same entity │
└─────────────────────────────────────┘
                   │
                   ▼
      Masked Output (total: <5ms)
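To make Layer 3 more concrete, here is a textbook Jaro-Winkler similarity in plain Python. It is a sketch of the metric named in the diagram, not CloakPipe's Rust code; the function names and the prefix weight `p=0.1` (the conventional default) are assumptions, and how CloakPipe normalizes names or picks a match threshold is not described in this README.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters count as matching only if they are within this distance.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Transpositions: matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to four characters."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return j + prefix * p * (1 - j)
```

In practice a resolver would likely lowercase the input and strip titles such as "Dr." before comparing, then treat two mentions as the same entity when the score clears a tuned threshold; both steps are left out here because the README does not specify them.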
Tokenization
Within a session, tokens are deterministic: the same entity always maps to the same token, so the LLM stays coherent across the conversation.
Across sessions, tokens are not stable: the same entity maps to a different token in each new session, which prevents cross-session correlation.
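One way to get both properties is a per-session lookup table with randomly drawn token numbers. The sketch below is an assumption-heavy illustration: the `SessionTokenizer` class and its methods are hypothetical, and the README does not say how CloakPipe actually assigns token numbers.

```python
import secrets


class SessionTokenizer:
    """Entity -> token mapping that is stable inside one session only."""

    def __init__(self) -> None:
        # (entity_type, value) -> token, and the reverse for unmasking.
        self._forward: dict[tuple[str, str], str] = {}
        self._reverse: dict[str, str] = {}

    def tokenize(self, entity_type: str, value: str) -> str:
        key = (entity_type, value)
        if key not in self._forward:
            # Random three-digit suffix, redrawn until it is unused in this
            # session. Assumes fewer than 1000 entities per type per session.
            while True:
                token = f"{entity_type}_{secrets.randbelow(1000):03d}"
                if token not in self._reverse:
                    break
            self._forward[key] = token
            self._reverse[token] = value
        return self._forward[key]

    def detokenize(self, token: str) -> str:
        return self._reverse[token]


session_a = SessionTokenizer()
session_b = SessionTokenizer()  # a new session starts with an empty table

first = session_a.tokenize("PERSON", "Rajesh Singh")
again = session_a.tokenize("PERSON", "Rajesh Singh")   # same token as `first`
other = session_b.tokenize("PERSON", "Rajesh Singh")   # drawn independently
```

Because each session draws its numbers independently, a token seen in one session says nothing about which entity carries that token in another. With only three digits an accidental repeat across sessions is possible, so a real design would likely use a larger token space.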
Encrypted Vault
All entity ↔ token mappings are stored in a local vault encrypted with AES-256-GCM. The vault never leaves your infrastructure. There is no cloud dependency.
Supported Entity Types
Standard PII
| Entity | Example | Detection |
|---|---|---|
| Person Name | John Smith, Dr. Priya Sharma | NER |
| Email Address | user@example.com | Regex |
| Phone Number | +1-555-0123, +91 98765 43210 | Regex |
| Credit Card | 4532-1234-5678-9012 | Regex + Luhn |
| SSN | 123-45-6789 | Regex |
| Date of Birth | 15/03/1990, March 15, 1990 | NER |
| Address | 123 MG Road, Pune 411001 | NER |
| IP Address | 192.168.1.1, 2001:db8::1 | Regex |
| Organization | Apollo Hospital, HDFC Bank | NER |
| Medical Term | diabetes, cardiac arrest | NER |
| Bank Account | IFSC + account number | Regex |
| Passport Number | J1234567 | Regex |
| License Plate | MH 12 AB 1234 | Regex |
| URL | https://internal.company.com | Regex |
| API Key | sk-live_xxx, AKIA... | Regex |
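The "Regex + Luhn" entry for credit cards means a pattern match is only accepted if the digits also pass the Luhn checksum, which filters out most random digit runs. A minimal sketch of that idea follows; the pattern `CARD_RE` and the helpers `luhn_valid` and `find_cards` are hypothetical and are not taken from CloakPipe.

```python
import re

# 13 to 19 digits, optionally separated by single spaces or hyphens (assumed shape).
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_cards(text: str) -> list[str]:
    """Return card-shaped substrings that also pass the Luhn check."""
    return [m.group(0) for m in CARD_RE.finditer(text) if luhn_valid(m.group(0))]
```

For example, `find_cards("Card 4111 1111 1111 1111 on file")` returns the well-known test number, while a 12-digit Aadhaar number is too short for the pattern. The sample number in the table above illustrates the format only; it does not itself pass the checksum.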
India-Specific PII 🇮🇳
| Entity | Format | Example |
|---|---|---|
| Aadhaar Number | 12 digits (XXXX XXXX XXXX) | 2345 6789 0123 |
| PAN Card | ABCDE1234F | BNZPM2501F |
| UPI ID | name@bank | rajesh@okicici |
| Indian Phone | +91 XXXXX XXXXX | +91 98765 43210 |
| GSTIN | 15-char alphanumeric | 27AAPFU0939F1ZV |
| Indian Passport | Letter + 7 digits | J1234567 |
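The formats in this table translate fairly directly into regular expressions. The patterns below are a rough sketch written from the table's format column (plus the standard GSTIN layout of state code, PAN, entity character, `Z`, check character); they are assumptions, not CloakPipe's actual rules, and they check shape only, with no checksum validation.

```python
import re

# Hypothetical shape-only patterns derived from the table above.
INDIA_PII_PATTERNS = {
    "AADHAAR": re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}\b"),
    "PAN": re.compile(r"\b[A-Z]{5}\d{4}[A-Z]\b"),
    # The lookahead rejects ordinary emails such as user@example.com.
    "UPI": re.compile(r"\b[\w.\-]+@[A-Za-z]+(?!\.?\w)"),
    "PHONE_IN": re.compile(r"(?<!\w)\+91[ -]?\d{5}[ -]?\d{5}\b"),
    "GSTIN": re.compile(r"\b\d{2}[A-Z]{5}\d{4}[A-Z][1-9A-Z]Z[0-9A-Z]\b"),
    "PASSPORT_IN": re.compile(r"\b[A-Z]\d{7}\b"),
}


def find_indian_pii(text: str) -> dict[str, list[str]]:
    """Return every match per entity type (types with no match map to [])."""
    return {
        name: [m.group(0) for m in pattern.finditer(text)]
        for name, pattern in INDIA_PII_PATTERNS.items()
    }


sample = (
    "PAN BNZPM2501F, Aadhaar 2345 6789 0123, UPI rajesh@okicici, "
    "GSTIN 27AAPFU0939F1ZV, phone +91 98765 43210, passport J1234567"
)
found = find_indian_pii(sample)
```

Shape-only matching produces false positives (for instance, the Aadhaar pattern also matches the first 12 digits of a spaced 16-digit card number), which is one reason a pre-filter like this would be combined with checksums and the later detection layers.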
No other open-source LLM privacy tool handles Indian PII natively.
Integration Examples
OpenAI Python SDK
from openai import OpenAI

# Just change the base URL. That's it.
client = OpenAI(
    base_url="http://localhost:3100/v1",  # CloakPipe proxy
    api_key="sk-your-openai-key",         # Your real API key
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content": "Analyze the account for Priya Sharma, PAN BNZPM2501F"}
    ],
)

# CloakPipe detected PAN and person name, masked them,
# sent sanitized prompt to OpenAI, and unmasked the response.
print(response.choices[0].message.content)
LangChain
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4",
    openai_api_base="http://localhost:3100/v1",  # CloakPipe proxy
    openai_api_key="sk-your-key",
)

response = llm.invoke("Summarize patient records for Aadhaar 2345 6789 0123")
Anthropic SDK
from anthropic import Anthropic

client = Anthropic(
    base_url="http://localhost:3100/v1/anthropic",  # CloakPipe proxy
    api_key="sk-ant-your-key",
)

message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Review the loan application for Amit Patel, PAN ABCDE1234F"}
    ],
)
curl
# Works with any LLM API that uses the OpenAI format
curl http://localhost:3100/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-d '{
"model": "gpt-4",
"messages": [{"role": "user", "content": "Your prompt with PII here"}]
}'
Vercel AI SDK
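import { createOpenAI } from '@ai-sdk/openai';
import { generateText } from 'ai';

// The base URL is a provider-level setting, so create a provider
// instance that points at CloakPipe instead of passing it per model.
const openai = createOpenAI({
  baseURL: 'http://localhost:3100/v1', // CloakPipe proxy
});

const result = await generateText({
  model: openai('gpt-4'),
  prompt: 'Analyze the customer data for Rajesh, Aadhaar 2345 6789 0123',
});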
CLI
# Scan text for PII (no proxy, just detection)
cloakpipe scan "Dr. Rajesh Singh, Aadhaar 2345 6789 0123"
# Output:
# ✓ PERSON: "Dr. Rajesh Singh" (confidence: 0.97)
# ✓ AADHAAR: "2345 6789 0123" (confidence: 1.00)
# Mask text (replace PII with tokens)
cloakpipe mask "Contact Priya at priya@example.com or +91 98765 43210"
# Output: "Contact PERSON_001 at EMAIL_001 or PHONE_001"
# Start the proxy server
cloakpipe serve --port 3100
# Start with a specific policy
cloakpipe serve --port 3100 --policy policies/dpdp.yaml
# Check proxy health
cloakpipe health
Configuration
Environment Variables
# Proxy settings
CLOAKPIPE_PORT=3100 # Proxy port (default: 3100)
CLOAKPIPE_HOST=0.0.0.0 # Bind address (default: 0.0.0.0)
CLOAKPIPE_LOG_LEVEL=info # Log level: debug, info, warn, error
# LLM provider
CLOAKPIPE_UPSTREAM_URL=https://api.openai.com # Default upstream LLM API
CLOAKPIPE_TIMEOUT=30 # Request timeout in seconds
# Detection
CLOAKPIPE_POLICY=policies/dpdp.yaml # Policy file path
CLOAKPIPE_MIN_CONFIDENCE=0.8 # Minimum NER confidence threshold (0.0–1.0)
# Vault
CLOAKPIPE_VAULT_PATH=./vault.db # Encrypted vault file path
CLOAKPIPE_VAULT_KEY= # 256-bit encryption key (auto-generated if empty)
# Cloud (optional, for dashboard users)
CLOAKPIPE_CLOUD_TOKEN= # Cloud dashboard token (app.cloakpipe.co)
Policy Files
CloakPipe uses YAML policy files to configure detection behavior per compliance framework:
# policies/dpdp.yaml — India Digital Personal Data Protection Act
