Tokenfirewall

Scalable LLM cost enforcement middleware for Node.js with budget protection and multi-provider support

Install / Use

/learn @Ruthwik000/Tokenfirewall
About this skill

Supported Platforms

Claude Code
Claude Desktop
Gemini CLI

README

TokenFirewall

Enterprise-grade LLM cost enforcement middleware for Node.js with automatic budget protection, intelligent model routing, and comprehensive multi-provider support.

Overview

TokenFirewall is production-ready middleware that automatically tracks and enforces budget limits for Large Language Model (LLM) API calls. It provides transparent cost monitoring, prevents budget overruns, routes requests intelligently with automatic failover, and supports multiple providers through a unified interface.

Key Features

  • Never Exceed Your Budget - Automatically blocks API calls when spending limits are reached, preventing surprise bills
  • Zero Code Changes Required - Drop-in middleware that works with any LLM API without modifying your existing code
  • Automatic Failover - Intelligent router switches to backup models when primary fails, keeping your app running
  • Real-time Cost Tracking - See exactly how much each API call costs based on actual token usage
  • Multi-Provider Support - Works with OpenAI, Anthropic, Gemini, Grok, Kimi, and any custom LLM provider
  • Custom Model Support - Register your own models with custom pricing and context limits at runtime
  • Production Ready - Battle-tested with comprehensive error handling and edge case coverage
  • TypeScript Native - Full type safety with included definitions

What's New in v2.0.0

  • Intelligent Router - Automatic failover to backup models when API calls fail
  • 40+ Latest Models - GPT-5, Claude 4.5, Gemini 3, with accurate 2026 pricing
  • Dynamic Registration - Add custom models and pricing at runtime
  • Production Hardened - Comprehensive validation, error handling, and edge case coverage

Installation

npm install tokenfirewall

Requirements:

  • Node.js >= 16.0.0
  • TypeScript >= 5.0.0 (for TypeScript projects)

Quick Start

const { createBudgetGuard, patchGlobalFetch } = require("tokenfirewall");

// Step 1: Set up budget protection
createBudgetGuard({
  monthlyLimit: 100,  // $100 USD
  mode: "block"       // Throw error when exceeded
});

// Step 2: Patch global fetch
patchGlobalFetch();

// Step 3: Use any LLM API normally
const response = await fetch("https://api.openai.com/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: "Hello!" }]
  })
});

// Costs are automatically tracked and logged

Core Concepts

Budget Guard

The Budget Guard tracks spending and enforces limits in two modes:

  • Block Mode (mode: "block"): Throws an error when budget is exceeded, preventing the API call
  • Warn Mode (mode: "warn"): Logs a warning but allows the API call to proceed
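The practical difference between the two modes can be seen in a minimal, self-contained sketch. The names below are illustrative only, not TokenFirewall's internals:

```javascript
// Illustrative sketch of block vs. warn semantics (not the library's code).
function makeGuard({ monthlyLimit, mode = "block" }) {
  let spent = 0;
  return {
    charge(cost) {
      spent += cost;
      if (spent > monthlyLimit) {
        if (mode === "block") throw new Error("Budget exceeded");
        console.warn("Budget exceeded");
      }
    },
    spent: () => spent,
  };
}
```

In real use you do not call a guard directly; in block mode the patched fetch call throws, so wrap it in try/catch if you want to degrade gracefully.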

Automatic Interception

TokenFirewall intercepts HTTP requests at the fetch level, automatically:

  1. Detecting LLM API responses
  2. Extracting token usage information
  3. Calculating costs based on provider pricing
  4. Tracking against your budget
  5. Logging usage details
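Conceptually, the steps above amount to wrapping fetch, inspecting a cloned response, and pricing the reported token usage. The sketch below is illustrative only, not TokenFirewall's actual implementation; the price-table shape and callback are assumptions:

```javascript
// Illustrative fetch-level interception: clone the response, read usage,
// price it per 1M tokens, and report the cost.
function wrapFetch(realFetch, priceTable, onCost) {
  return async (url, init) => {
    const res = await realFetch(url, init);
    try {
      const body = await res.clone().json();
      if (body && body.usage && body.model && priceTable[body.model]) {
        const p = priceTable[body.model]; // USD per 1M tokens
        const cost =
          (body.usage.prompt_tokens * p.input +
            body.usage.completion_tokens * p.output) / 1e6;
        onCost(cost);
      }
    } catch (_) {
      // Non-JSON or non-LLM responses pass through untouched.
    }
    return res;
  };
}
```

Cloning the response is what keeps interception transparent: your code still reads the original body as usual.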

Provider Adapters

Each LLM provider has a dedicated adapter that:

  • Detects provider-specific response formats
  • Normalizes token usage data
  • Applies correct pricing models
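A hypothetical adapter for an OpenAI-style response shape might look like the following. This is an illustrative sketch, not the package's internal adapter API:

```javascript
// Illustrative adapter: detect a provider's response format and normalize
// its token usage into a common structure.
const openAIStyleAdapter = {
  matches: (body) => body && body.object === "chat.completion",
  extractUsage: (body) => ({
    inputTokens: body.usage.prompt_tokens,
    outputTokens: body.usage.completion_tokens,
  }),
};

function normalizeUsage(body, adapters) {
  const adapter = adapters.find((a) => a.matches(body));
  return adapter ? adapter.extractUsage(body) : null;
}
```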

API Reference

Budget Management

createBudgetGuard(options)

Creates and configures a budget guard instance.

Parameters:

interface BudgetGuardOptions {
  monthlyLimit: number;           // Maximum spending limit in USD
  mode?: "block" | "warn";        // Enforcement mode (default: "block")
}

Example:

const { createBudgetGuard } = require("tokenfirewall");

// Block mode - strict enforcement
createBudgetGuard({
  monthlyLimit: 100,
  mode: "block"
});

// Warn mode - soft limits
createBudgetGuard({
  monthlyLimit: 500,
  mode: "warn"
});

getBudgetStatus()

Retrieves the current budget status and usage statistics.

Returns:

interface BudgetStatus {
  totalSpent: number;      // Total amount spent in USD
  limit: number;           // Monthly limit in USD
  remaining: number;       // Remaining budget in USD
  percentageUsed: number;  // Percentage of budget used (0-100)
}

Example:

const { getBudgetStatus } = require("tokenfirewall");

const status = getBudgetStatus();
if (status) {
  console.log(`Spent: $${status.totalSpent.toFixed(2)}`);
  console.log(`Remaining: $${status.remaining.toFixed(2)}`);
  console.log(`Usage: ${status.percentageUsed.toFixed(1)}%`);
}

resetBudget()

Resets the budget tracking to zero.

const { resetBudget } = require("tokenfirewall");

// Reset at the start of each month
resetBudget();

exportBudgetState() / importBudgetState(state)

Save and restore budget state for persistence.

const { exportBudgetState, importBudgetState } = require("tokenfirewall");
const fs = require("fs");

// Export state
const state = exportBudgetState();
fs.writeFileSync("budget.json", JSON.stringify(state));

// Import state
const savedState = JSON.parse(fs.readFileSync("budget.json", "utf8"));
importBudgetState(savedState);

Interception

patchGlobalFetch()

Patches the global fetch function to intercept and track LLM API calls.

const { patchGlobalFetch } = require("tokenfirewall");

patchGlobalFetch();

// All subsequent fetch calls are intercepted

Model Discovery

listModels(options)

Lists available models from a provider with context limits and budget information.

Parameters:

interface ListModelsOptions {
  provider: string;                  // Provider name
  apiKey: string;                    // Provider API key
  baseURL?: string;                  // Custom API endpoint
  includeBudgetUsage?: boolean;      // Include budget usage %
}

Example:

const { listModels } = require("tokenfirewall");

const models = await listModels({
  provider: "openai",
  apiKey: process.env.OPENAI_API_KEY,
  includeBudgetUsage: true
});

models.forEach(model => {
  console.log(`${model.model}: ${model.contextLimit} tokens`);
});

Intelligent Model Router

The Model Router provides automatic retry and model switching on failures.

createModelRouter(options)

Creates and configures an intelligent model router.

Parameters:

interface ModelRouterOptions {
  strategy: "fallback" | "context" | "cost";  // Routing strategy
  fallbackMap?: Record<string, string[]>;     // Fallback model map
  maxRetries?: number;                        // Max retry attempts (default: 1)
}

Example:

const { createModelRouter, patchGlobalFetch } = require("tokenfirewall");

// Fallback strategy - use predefined fallback models
createModelRouter({
  strategy: "fallback",
  fallbackMap: {
    "gpt-4o": ["gpt-4o-mini", "gpt-3.5-turbo"],
    "claude-3-5-sonnet-20241022": ["claude-3-5-haiku-20241022"]
  },
  maxRetries: 2
});

patchGlobalFetch();

// API calls will automatically retry with fallback models on failure

Routing Strategies

1. Fallback Strategy - Uses predefined fallback map

  • Tries models in order from fallbackMap
  • Best for: Known model preferences, production resilience

2. Context Strategy - Upgrades to larger context window

  • Only triggers on context overflow errors
  • Selects model with larger context from same provider
  • Best for: Handling variable input sizes

3. Cost Strategy - Switches to cheaper model

  • Selects cheaper model from same provider
  • Best for: Cost optimization, rate limit handling
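For example, the cost strategy boils down to picking the cheapest alternative from the same provider. The sketch below illustrates that selection (not the router's actual code; the model shape mirrors ModelConfig above):

```javascript
// Illustrative cost-strategy selection: among a provider's models, pick the
// cheapest (by input price) that isn't the model that just failed.
function pickCheaper(models, failingModel) {
  const candidates = models
    .filter((m) => m.name !== failingModel && m.pricing)
    .sort((a, b) => a.pricing.input - b.pricing.input);
  return candidates[0] || null;
}
```

To enable this behavior in TokenFirewall itself, create the router with strategy: "cost".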

Error Detection

The router automatically detects and classifies failures:

  • rate_limit - HTTP 429 or rate limit errors
  • context_overflow - Context length exceeded errors
  • model_unavailable - HTTP 404 or model not found
  • access_denied - HTTP 403 or unauthorized
  • unknown - Other errors
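A classifier along these lines maps status codes and error text onto those categories. This is an illustrative sketch; the router's actual heuristics may differ:

```javascript
// Illustrative failure classification by HTTP status and error message.
function classifyFailure(status, message = "") {
  if (status === 429 || /rate limit/i.test(message)) return "rate_limit";
  if (/context length|maximum context/i.test(message)) return "context_overflow";
  if (status === 404 || /model not found/i.test(message)) return "model_unavailable";
  if (status === 403 || /unauthorized/i.test(message)) return "access_denied";
  return "unknown";
}
```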

disableModelRouter()

Disables the model router.

const { disableModelRouter } = require("tokenfirewall");

disableModelRouter();

Dynamic Model Registration

Register models with pricing and context limits at runtime.

registerModels(provider, models)

Bulk register models for a provider.

Parameters:

interface ModelConfig {
  name: string;                    // Model identifier
  contextLimit?: number;           // Context window size in tokens
  pricing?: {                      // Pricing per 1M tokens (USD)
    input: number;
    output: number;
  };
}

Example:

const { registerModels, createModelRouter } = require("tokenfirewall");

// Register custom models
registerModels("my-provider", [
  {
    name: "my-large-model",
    contextLimit: 200000,
    pricing: { input: 5.0, output: 15.0 }
  },
  {
    name: "my-small-model",
    contextLimit: 50000
  }
]);