FACT
FACT – Fast Augmented Context Tools: FACT is a lean retrieval pattern that skips vector search. We cache every static token inside Claude Sonnet‑4 and fetch live facts only through authenticated tools hosted on Arcade.dev. The result is deterministic answers, fresh data, and sub‑100 ms latency.
Install / Use: `/learn @ruvnet/FACT`

Category: Development & Engineering
FACT: Fast Augmented Context Tools
A revolutionary approach to LLM data retrieval that replaces RAG with prompt caching and deterministic tool execution under the Model Context Protocol
TL;DR
FACT (Fast Augmented Context Tools) introduces a new paradigm for language model–powered data retrieval by replacing vector-based retrieval with a prompt-and-tool approach under the Model Context Protocol (MCP). The result? Sub-100ms responses, 60-90% cost reduction, and deterministic, auditable results with no vector stores required.
Why FACT? RAG Had Its Moment. It's Time for Something Smarter.
RAG (Retrieval-Augmented Generation) made sense when vector search was the best we had. But vectors are slow, fuzzy, and expensive to maintain. They're inherently imprecise, forcing you to tune similarity thresholds, re-embed documents, and accept that relevance is always a bit of a guess.
What we needed was something explicit. Deterministic. Cheap. Fast.
FACT isn't about fetching similar chunks of data. It's about giving models structured, exact answers via tool execution and pairing that with intelligent prompt caching. Prompt caches work like brains with memory. Tools act like hands that do. And when you combine the two—prompt caching + MCP-based tools—you can skip vector search entirely.
Instead of saying "Find me something like this," FACT says: "Run this exact SQL call. Return this live API result. Use this schema. Cache the output."
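The contrast can be sketched in a few lines. This is an illustrative example, not FACT's actual API: an exact, parameterized SQL call returns the same auditable rows every run, where a vector search would return "similar" chunks.

```python
import sqlite3

# Hypothetical example: FACT-style retrieval is an exact query, not a
# similarity search. The schema and function name are illustrative.
def get_low_stock(conn: sqlite3.Connection, threshold: int) -> list[tuple]:
    """Run an exact, auditable SQL call -- no embeddings, no fuzziness."""
    return conn.execute(
        "SELECT sku, stock FROM products WHERE stock < ?", (threshold,)
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (sku TEXT, stock INTEGER)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [("A-1", 3), ("B-2", 42), ("C-3", 7)],
)

low_stock = get_low_stock(conn, threshold=10)
```

The answer is deterministic and cacheable: the same query against the same data always yields the same rows.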
Introduction to FACT
FACT (Fast Augmented Context Tools) introduces a new paradigm for language model–powered data retrieval by replacing vector-based retrieval with a prompt-and-tool approach under the Model Context Protocol (MCP). Instead of relying on embeddings and similarity searches, FACT combines intelligent prompt caching with deterministic tool invocation to deliver fresh, precise, and auditable results.
Key Differences from RAG
FACT represents a fundamental shift from traditional RAG (Retrieval-Augmented Generation) approaches:
Retrieval Mechanism
- RAG: Embeddings → Vector search → LLM completion
- FACT: Prompt cache → MCP tool calls → LLM refinement
Data Freshness
- RAG: Periodic re-indexing required
- FACT: Live data via on-demand tool execution
Accuracy
- RAG: Probabilistic, fuzzy matches
- FACT: Exact outputs from SQL, API, or custom tools
Cost & Latency
- RAG: Embedding + lookup + token costs
- FACT: Cache hits eliminate tokens; cache misses trigger fast tool calls
Core Architectural Innovation
Traditional RAG Approach:
User Query → Embedding → Vector Search → Context Retrieval → LLM → Response (2-5 seconds)
FACT MCP Approach:
User Query → Prompt Cache → [If Miss] → MCP Tool Execution → Cache Update → Response (50-200ms)
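The cache-first flow above can be sketched as a simple dispatcher. The in-memory cache, the TTL, and `run_tool` are illustrative stand-ins, not the actual FACT driver API:

```python
import time

# Minimal sketch of the cache-first flow, assuming a single-process
# dict cache. A real deployment would use prompt caching and MCP tools.
cache: dict[str, tuple[float, str]] = {}  # query -> (expires_at, result)

def run_tool(query: str) -> str:
    return f"result for {query!r}"  # placeholder for an MCP tool call

def answer(query: str, ttl: float = 60.0) -> str:
    now = time.monotonic()
    hit = cache.get(query)
    if hit and hit[0] > now:            # cache hit: skip tool execution
        return hit[1]
    result = run_tool(query)            # cache miss: execute the tool
    cache[query] = (now + ttl, result)  # update cache for next time
    return result

first = answer("q1 revenue")   # miss -> tool call
second = answer("q1 revenue")  # hit -> served from cache
```

On a hit the model never pays the tool-execution cost, which is where the sub-100ms numbers come from.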
Agentic Engineering & Intelligent Caching
FACT enables agentic workflows where AI systems make intelligent decisions about data retrieval, caching, and tool execution in complex, multi-step processes. Unlike static vector databases that treat all data equally, FACT implements intelligent caching that understands the dynamic nature of different data types.
The Vector Problem with Dynamic Data
Vectors excel at static content that changes infrequently, but they're fundamentally ill-suited for:
- Real-time data that changes moment-by-moment
- Request-specific context that varies per user or session
- Dynamic calculations that depend on current parameters
- Time-sensitive information with specific TTL requirements
When data changes request by request and needs precise time-to-live control, vectors are the wrong tool entirely.
Intelligent Cache Decision-Making
FACT's caching system makes sophisticated decisions about what to cache and when:
Cache Strategy Engine:
├── Static Content → Long-term cache (hours/days)
│   ├── System prompts and schemas
│   ├── Configuration data
│   └── Reference documentation
├── Semi-Dynamic → Medium-term cache (minutes/hours)
│   ├── Database schemas
│   ├── User preferences
│   └── System metrics
└── Dynamic Content → Short-term cache (seconds/minutes)
    ├── Live API responses
    ├── Real-time calculations
    └── User-specific queries
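The strategy tree above boils down to picking a TTL per content class. The class names and durations here are illustrative assumptions, not FACT's actual values:

```python
# Sketch of the strategy engine: TTLs chosen per content class.
# Durations are illustrative, not FACT's real configuration.
TTL_SECONDS = {
    "static": 24 * 3600,      # schemas, prompts, reference docs
    "semi_dynamic": 15 * 60,  # preferences, metrics
    "dynamic": 30,            # live API responses, per-user queries
}

def pick_ttl(content_class: str) -> int:
    """Map a content class to its cache lifetime in seconds."""
    return TTL_SECONDS[content_class]
```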
Recursive Tool Execution & Feedback Loops
FACT supports complex agentic patterns:
- Tool Chaining: Output from one tool becomes input for the next
- Conditional Execution: Tools execute based on previous results
- Feedback Loops: Systems learn from execution patterns to optimize caching
- Self-Optimization: Cache strategies adapt based on usage patterns
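Tool chaining and conditional execution look like this in miniature. The tool names and logic are illustrative stubs, not real FACT tools:

```python
# Sketch of tool chaining: each step feeds the next, and a later step
# runs conditionally on an earlier result. All names are illustrative.
def fetch_sales(quarter: str) -> list[int]:
    return [100, 120, 90]  # stand-in for a database tool call

def compute_trend(sales: list[int]) -> float:
    return (sales[-1] - sales[0]) / len(sales)

def recommend(trend: float) -> str:
    # Conditional execution based on the previous tool's output.
    return "investigate decline" if trend < 0 else "maintain course"

sales = fetch_sales("Q1")     # tool 1
trend = compute_trend(sales)  # tool 2, chained on tool 1's output
advice = recommend(trend)     # tool 3, conditional on the trend
```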
What Makes FACT Different
1. Intelligent Cache-First Design Philosophy
FACT leverages Claude's native caching with intelligent decision-making to store and reuse responses automatically, eliminating the need for complex vector databases or RAG systems:
- Context-Aware Caching: System determines optimal cache duration based on data type
- Adaptive TTL Management: Cache expiration varies by content volatility
- Smart Invalidation: Proactive cache updates based on data change patterns
- Multi-Tier Strategy: Different caching approaches for static vs. dynamic content
2. Natural Language Interface
Powered by Claude Sonnet-4, FACT understands complex queries in natural language:
"Show me the latest inventory levels for products with low stock alerts"
Agentic Workflow Example
Complex Multi-Step Query: "Generate a sales report for Q1 with trend analysis and recommendations"
Step 1: Cache Check → System prompts (CACHE HIT - 0ms)
Step 2: Tool Execution → Fetch Q1 sales data (Database query - 45ms)
Step 3: Cache Decision → Store raw data (TTL: 1 hour - data changes daily)
Step 4: Tool Execution → Calculate trends (Analysis tool - 23ms)
Step 5: Cache Decision → Store trends (TTL: 30 min - calculations may vary)
Step 6: Tool Execution → Generate recommendations (AI reasoning - 67ms)
Step 7: Cache Decision → Short TTL (5 min - recommendations are context-specific)
Step 8: Response Assembly → Final formatted report (8ms)
Total Time: 143ms (vs. 3+ seconds with vector retrieval)
Cache Strategy: Multi-tier with intelligent TTL based on data volatility
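The walkthrough above pairs each step with its own cache decision. A minimal sketch, with step names and TTLs mirroring the example (they are illustrative, not FACT's actual configuration):

```python
# Each pipeline step pairs a tool call with a TTL chosen by how
# volatile that step's output is. All values here are illustrative.
steps = [
    ("fetch_q1_sales", 3600),           # raw data changes daily -> 1 h
    ("calculate_trends", 1800),         # derived values -> 30 min
    ("generate_recommendations", 300),  # context-specific -> 5 min
]

def run_report(steps):
    report = {}
    for name, ttl in steps:
        result = f"<{name} output>"  # stand-in for the real tool call
        report[name] = {"result": result, "cache_ttl_s": ttl}
    return report

report = run_report(steps)
```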
This demonstrates how FACT's agentic system makes nuanced decisions about what to cache and for how long, something a static vector approach cannot do. The original natural-language query is transformed into optimized tool execution and returns formatted results in milliseconds.
3. MCP Tool-Based Architecture
FACT employs the Model Context Protocol for secure, standardized tool execution:
- Read-Only Data Access: Prevents data modification
- Input Validation: Comprehensive query validation
- Audit Trail: Complete logging of all operations
- Security Patterns: Advanced injection protection
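A read-only guard of the kind a tool like SQL.QueryReadonly might apply can be sketched as follows. The keyword checks are illustrative; a production validator would use a real SQL parser rather than substring matching:

```python
# Illustrative read-only validation sketch. Substring checks like these
# can false-positive (e.g. a column named "deleted_at"); a real guard
# would parse the statement.
FORBIDDEN = ("insert", "update", "delete", "drop", "alter", ";")

def validate_readonly(sql: str) -> bool:
    """Accept only statements that look like plain SELECT queries."""
    lowered = sql.strip().lower()
    if not lowered.startswith("select"):
        return False
    return not any(word in lowered for word in FORBIDDEN)
```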
4. Hybrid Execution Model
Integration with cloud services enables intelligent routing between local and remote execution:
- Local Execution: Speed-optimized for simple queries
- Cloud Execution: Feature-rich for complex analytics
- Automatic Failover: Seamless degradation handling
- Performance Optimization: Real-time execution path selection
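The routing idea can be sketched with a complexity heuristic and a failover path. The executors and the threshold are illustrative assumptions, not FACT's actual routing logic:

```python
# Sketch of local-vs-cloud routing with automatic failover.
def run_local(query: str) -> str:
    return f"local:{query}"   # stand-in for fast local execution

def run_cloud(query: str) -> str:
    return f"cloud:{query}"   # stand-in for Arcade.dev execution

def route(query: str, complexity: int) -> str:
    executor = run_local if complexity < 5 else run_cloud
    try:
        return executor(query)
    except Exception:
        # Automatic failover: fall back to the other execution path.
        fallback = run_cloud if executor is run_local else run_local
        return fallback(query)

simple = route("SELECT 1", complexity=1)        # local path
heavy = route("trend analysis", complexity=9)   # cloud path
```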
Core Concepts
Three-Tier Architecture
Tier 1: User Interface Layer
├── Natural Language Query Processing
├── Interactive CLI Interface
├── REST API Endpoints
└── Real-time Response Formatting
Tier 2: FACT Driver & Intelligence Layer
├── Intelligent Caching System
├── Query Analysis and Optimization
├── Execution Path Routing
├── Security Validation
└── Performance Monitoring
Tier 3: Execution & Data Layer
├── Local Tool Execution
├── Arcade.dev Cloud Execution
├── Secure Database Access
└── Result Processing & Caching
Tool-Based Data Retrieval
FACT employs secure, containerized tools for data access:
Available Tools:
- SQL.QueryReadonly: Execute SELECT queries on financial databases
- SQL.GetSchema: Retrieve database schema information
- SQL.GetSampleQueries: Get example queries for exploration
- System.GetMetrics: Access performance and system metrics
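Dispatch over a tool registry like the one above might look like this. The handlers are illustrative stubs; the real tools run under MCP with validation and auditing:

```python
# Illustrative registry keyed by the tool names listed above.
TOOLS = {
    "SQL.QueryReadonly": lambda args: f"rows for {args['query']}",
    "SQL.GetSchema": lambda args: "schema: products(sku, stock)",
    "System.GetMetrics": lambda args: {"cache_hit_rate": 0.87},
}

def call_tool(name: str, args: dict):
    """Look up and invoke a registered tool; reject unknown names."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](args)

schema = call_tool("SQL.GetSchema", {})
```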
Cache Hierarchy and Optimization
FACT implements a sophisticated multi-level caching system:
- Memory Cache: Immediate access to frequently used queries
- Persistent Cache: Long-term storage for common patterns
- Distributed Cache: Shared cache across multiple instances
- Strategy-Based Selection: Intelligent cache tier selection
Benefits of FACT
Revolutionary Performance Improvements
Speed Transformation
FACT delivers order-of-magnitude improvements over traditional financial data systems:
- Cache Hits: Sub-50ms response times (vs. 2-5 seconds traditional)
- Cache Misses: Under 140ms average response time
- Complex Analytics: 85% faster than traditional RAG systems
- Concurrent Processing: 1000+ queries per minute throughput
Cost Optimization Breakthrough
The intelligent caching architecture delivers unprecedented cost efficiency:
- 90% Cost Reduction: Through automated query result caching
- Token Efficiency: Automatic optimization of API token usage
- Resource Minimization: No vector databases or complex indexing required
- Scalability Economics: Linear cost scaling with exponential performance gains
Operational Excellence
FACT transforms operational characteristics of financial analytics:
- 99%+ Uptime: Robust error handling and graceful degradation
- Zero SQL Knowledge Required: Complete natural language interface
- Enterprise Security: Comprehensive audit and compliance features
FACT's Enterprise-Ready Results
With FACT, your system becomes intelligent enough to decide what to cache, when to execute tools, and how to route requests in real time—without guessing. RAG brought retrieval to language models. But FACT makes retrieval intentional, structured, and **efficient**.
