FACT
FACT – Fast Augmented Context Tools: FACT is a lean retrieval pattern that skips vector search. We cache every static token inside Claude Sonnet‑4 and fetch live facts only through authenticated tools hosted on Arcade.dev. The result is deterministic answers, fresh data, and sub‑100 ms latency.
Install / Use: `/learn @ruvnet/FACT`

Category: Development & Engineering
FACT: Fast Augmented Context Tools
A revolutionary approach to LLM data retrieval that replaces RAG with prompt caching and deterministic tool execution under the Model Context Protocol
TL;DR
FACT (Fast Augmented Context Tools) introduces a new paradigm for language model–powered data retrieval by replacing vector-based retrieval with a prompt-and-tool approach under the Model Context Protocol (MCP). The result? Sub-100ms responses, 60-90% cost reduction, and deterministic, auditable results with no vector stores required.
Why FACT? RAG Had Its Moment. It's Time for Something Smarter.
RAG (Retrieval-Augmented Generation) made sense when vector search was the best we had. But vectors are slow, fuzzy, and expensive to maintain. They're inherently imprecise, forcing you to tune similarity thresholds, re-embed documents, and accept that relevance is always a bit of a guess.
What we needed was something explicit. Deterministic. Cheap. Fast.
FACT isn't about fetching similar chunks of data. It's about giving models structured, exact answers via tool execution and pairing that with intelligent prompt caching. Prompt caches work like brains with memory. Tools act like hands that do. And when you combine the two—prompt caching + MCP-based tools—you can skip vector search entirely.
Instead of saying "Find me something like this," FACT says: "Run this exact SQL call. Return this live API result. Use this schema. Cache the output."
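The contrast can be sketched in a few lines. This is an illustrative example, not FACT's actual API: an exact, parameterized SQL call returns the same auditable rows every run, where a vector search would return "similar" chunks.

```python
import sqlite3

# Hypothetical example: FACT-style retrieval is an exact query, not a
# similarity search. The schema and function name are illustrative.
def get_low_stock(conn: sqlite3.Connection, threshold: int) -> list[tuple]:
    """Run an exact, auditable SQL call -- no embeddings, no fuzziness."""
    return conn.execute(
        "SELECT sku, stock FROM products WHERE stock < ?", (threshold,)
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (sku TEXT, stock INTEGER)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [("A-1", 3), ("B-2", 42), ("C-3", 7)],
)

low_stock = get_low_stock(conn, threshold=10)
```

The answer is deterministic and cacheable: the same query against the same data always yields the same rows.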
Introduction to FACT
FACT (Fast Augmented Context Tools) introduces a new paradigm for language model–powered data retrieval by replacing vector-based retrieval with a prompt-and-tool approach under the Model Context Protocol (MCP). Instead of relying on embeddings and similarity searches, FACT combines intelligent prompt caching with deterministic tool invocation to deliver fresh, precise, and auditable results.
Key Differences from RAG
FACT represents a fundamental shift from traditional RAG (Retrieval-Augmented Generation) approaches:
Retrieval Mechanism
- RAG: Embeddings → Vector search → LLM completion
- FACT: Prompt cache → MCP tool calls → LLM refinement
Data Freshness
- RAG: Periodic re-indexing required
- FACT: Live data via on-demand tool execution
Accuracy
- RAG: Probabilistic, fuzzy matches
- FACT: Exact outputs from SQL, API, or custom tools
Cost & Latency
- RAG: Embedding + lookup + token costs
- FACT: Cache hits eliminate tokens; cache misses trigger fast tool calls
Core Architectural Innovation
Traditional RAG Approach:
User Query → Embedding → Vector Search → Context Retrieval → LLM → Response (2-5 seconds)
FACT MCP Approach:
User Query → Prompt Cache → [If Miss] → MCP Tool Execution → Cache Update → Response (50-200ms)
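The cache-first flow above can be sketched as a simple dispatcher. The in-memory cache, the TTL, and `run_tool` are illustrative stand-ins, not the actual FACT driver API:

```python
import time

# Minimal sketch of the cache-first flow, assuming a single-process
# dict cache. A real deployment would use prompt caching and MCP tools.
cache: dict[str, tuple[float, str]] = {}  # query -> (expires_at, result)

def run_tool(query: str) -> str:
    return f"result for {query!r}"  # placeholder for an MCP tool call

def answer(query: str, ttl: float = 60.0) -> str:
    now = time.monotonic()
    hit = cache.get(query)
    if hit and hit[0] > now:            # cache hit: skip tool execution
        return hit[1]
    result = run_tool(query)            # cache miss: execute the tool
    cache[query] = (now + ttl, result)  # update cache for next time
    return result

first = answer("q1 revenue")   # miss -> tool call
second = answer("q1 revenue")  # hit -> served from cache
```

On a hit the model never pays the tool-execution cost, which is where the sub-100ms numbers come from.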
Agentic Engineering & Intelligent Caching
FACT enables agentic workflows where AI systems make intelligent decisions about data retrieval, caching, and tool execution in complex, multi-step processes. Unlike static vector databases that treat all data equally, FACT implements intelligent caching that understands the dynamic nature of different data types.
The Vector Problem with Dynamic Data
Vectors excel at static content that changes infrequently, but they're fundamentally ill-suited for:
- Real-time data that changes moment-by-moment
- Request-specific context that varies per user or session
- Dynamic calculations that depend on current parameters
- Time-sensitive information with specific TTL requirements
When data changes request by request and needs precise time-to-live control, vectors are the wrong tool entirely.
Intelligent Cache Decision-Making
FACT's caching system makes sophisticated decisions about what to cache and when:
Cache Strategy Engine:
├── Static Content → Long-term cache (hours/days)
│   ├── System prompts and schemas
│   ├── Configuration data
│   └── Reference documentation
├── Semi-Dynamic → Medium-term cache (minutes/hours)
│   ├── Database schemas
│   ├── User preferences
│   └── System metrics
└── Dynamic Content → Short-term cache (seconds/minutes)
    ├── Live API responses
    ├── Real-time calculations
    └── User-specific queries
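The strategy tree above boils down to picking a TTL per content class. The class names and durations here are illustrative assumptions, not FACT's actual values:

```python
# Sketch of the strategy engine: TTLs chosen per content class.
# Durations are illustrative, not FACT's real configuration.
TTL_SECONDS = {
    "static": 24 * 3600,      # schemas, prompts, reference docs
    "semi_dynamic": 15 * 60,  # preferences, metrics
    "dynamic": 30,            # live API responses, per-user queries
}

def pick_ttl(content_class: str) -> int:
    """Map a content class to its cache lifetime in seconds."""
    return TTL_SECONDS[content_class]
```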
Recursive Tool Execution & Feedback Loops
FACT supports complex agentic patterns:
- Tool Chaining: Output from one tool becomes input for the next
- Conditional Execution: Tools execute based on previous results
- Feedback Loops: Systems learn from execution patterns to optimize caching
- Self-Optimization: Cache strategies adapt based on usage patterns
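Tool chaining and conditional execution look like this in miniature. The tool names and logic are illustrative stubs, not real FACT tools:

```python
# Sketch of tool chaining: each step feeds the next, and a later step
# runs conditionally on an earlier result. All names are illustrative.
def fetch_sales(quarter: str) -> list[int]:
    return [100, 120, 90]  # stand-in for a database tool call

def compute_trend(sales: list[int]) -> float:
    return (sales[-1] - sales[0]) / len(sales)

def recommend(trend: float) -> str:
    # Conditional execution based on the previous tool's output.
    return "investigate decline" if trend < 0 else "maintain course"

sales = fetch_sales("Q1")     # tool 1
trend = compute_trend(sales)  # tool 2, chained on tool 1's output
advice = recommend(trend)     # tool 3, conditional on the trend
```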
What Makes FACT Different
1. Intelligent Cache-First Design Philosophy
FACT leverages Claude's native caching with intelligent decision-making to store and reuse responses automatically, eliminating the need for complex vector databases or RAG systems:
- Context-Aware Caching: System determines optimal cache duration based on data type
- Adaptive TTL Management: Cache expiration varies by content volatility
- Smart Invalidation: Proactive cache updates based on data change patterns
- Multi-Tier Strategy: Different caching approaches for static vs. dynamic content
2. Natural Language Interface
Powered by Claude Sonnet-4, FACT understands complex queries in natural language:
"Show me the latest inventory levels for products with low stock alerts"
Agentic Workflow Example
Complex Multi-Step Query: "Generate a sales report for Q1 with trend analysis and recommendations"
Step 1: Cache Check → System prompts (CACHE HIT - 0ms)
Step 2: Tool Execution → Fetch Q1 sales data (Database query - 45ms)
Step 3: Cache Decision → Store raw data (TTL: 1 hour - data changes daily)
Step 4: Tool Execution → Calculate trends (Analysis tool - 23ms)
Step 5: Cache Decision → Store trends (TTL: 30 min - calculations may vary)
Step 6: Tool Execution → Generate recommendations (AI reasoning - 67ms)
Step 7: Cache Decision → Short TTL (5 min - recommendations are context-specific)
Step 8: Response Assembly → Final formatted report (8ms)
Total Time: 143ms (vs. 3+ seconds with vector retrieval)
Cache Strategy: Multi-tier with intelligent TTL based on data volatility
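The walkthrough above pairs each step with its own cache decision. A minimal sketch, with step names and TTLs mirroring the example (they are illustrative, not FACT's actual configuration):

```python
# Each pipeline step pairs a tool call with a TTL chosen by how
# volatile that step's output is. All values here are illustrative.
steps = [
    ("fetch_q1_sales", 3600),           # raw data changes daily -> 1 h
    ("calculate_trends", 1800),         # derived values -> 30 min
    ("generate_recommendations", 300),  # context-specific -> 5 min
]

def run_report(steps):
    report = {}
    for name, ttl in steps:
        result = f"<{name} output>"  # stand-in for the real tool call
        report[name] = {"result": result, "cache_ttl_s": ttl}
    return report

report = run_report(steps)
```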
This demonstrates how FACT's agentic system makes nuanced decisions about what to cache and for how long, something a static vector approach cannot do. The original natural-language query is transformed into optimized tool execution and returns formatted results in milliseconds.
3. MCP Tool-Based Architecture
FACT employs the Model Context Protocol for secure, standardized tool execution:
- Read-Only Data Access: Prevents data modification
- Input Validation: Comprehensive query validation
- Audit Trail: Complete logging of all operations
- Security Patterns: Advanced injection protection
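A read-only guard of the kind a tool like SQL.QueryReadonly might apply can be sketched as follows. The keyword checks are illustrative; a production validator would use a real SQL parser rather than substring matching:

```python
# Illustrative read-only validation sketch. Substring checks like these
# can false-positive (e.g. a column named "deleted_at"); a real guard
# would parse the statement.
FORBIDDEN = ("insert", "update", "delete", "drop", "alter", ";")

def validate_readonly(sql: str) -> bool:
    """Accept only statements that look like plain SELECT queries."""
    lowered = sql.strip().lower()
    if not lowered.startswith("select"):
        return False
    return not any(word in lowered for word in FORBIDDEN)
```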
4. Hybrid Execution Model
Integration with cloud services enables intelligent routing between local and remote execution:
- Local Execution: Speed-optimized for simple queries
- Cloud Execution: Feature-rich for complex analytics
- Automatic Failover: Seamless degradation handling
- Performance Optimization: Real-time execution path selection
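The routing idea can be sketched with a complexity heuristic and a failover path. The executors and the threshold are illustrative assumptions, not FACT's actual routing logic:

```python
# Sketch of local-vs-cloud routing with automatic failover.
def run_local(query: str) -> str:
    return f"local:{query}"   # stand-in for fast local execution

def run_cloud(query: str) -> str:
    return f"cloud:{query}"   # stand-in for Arcade.dev execution

def route(query: str, complexity: int) -> str:
    executor = run_local if complexity < 5 else run_cloud
    try:
        return executor(query)
    except Exception:
        # Automatic failover: fall back to the other execution path.
        fallback = run_cloud if executor is run_local else run_local
        return fallback(query)

simple = route("SELECT 1", complexity=1)        # local path
heavy = route("trend analysis", complexity=9)   # cloud path
```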
Core Concepts
Three-Tier Architecture
Tier 1: User Interface Layer
├── Natural Language Query Processing
├── Interactive CLI Interface
├── REST API Endpoints
└── Real-time Response Formatting
Tier 2: FACT Driver & Intelligence Layer
├── Intelligent Caching System
├── Query Analysis and Optimization
├── Execution Path Routing
├── Security Validation
└── Performance Monitoring
Tier 3: Execution & Data Layer
├── Local Tool Execution
├── Arcade.dev Cloud Execution
├── Secure Database Access
└── Result Processing & Caching
Tool-Based Data Retrieval
FACT employs secure, containerized tools for data access:
Available Tools:
- SQL.QueryReadonly: Execute SELECT queries on financial databases
- SQL.GetSchema: Retrieve database schema information
- SQL.GetSampleQueries: Get example queries for exploration
- System.GetMetrics: Access performance and system metrics
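Dispatch over a tool registry like the one above might look like this. The handlers are illustrative stubs; the real tools run under MCP with validation and auditing:

```python
# Illustrative registry keyed by the tool names listed above.
TOOLS = {
    "SQL.QueryReadonly": lambda args: f"rows for {args['query']}",
    "SQL.GetSchema": lambda args: "schema: products(sku, stock)",
    "System.GetMetrics": lambda args: {"cache_hit_rate": 0.87},
}

def call_tool(name: str, args: dict):
    """Look up and invoke a registered tool; reject unknown names."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](args)

schema = call_tool("SQL.GetSchema", {})
```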
Cache Hierarchy and Optimization
FACT implements a sophisticated multi-level caching system:
- Memory Cache: Immediate access to frequently used queries
- Persistent Cache: Long-term storage for common patterns
- Distributed Cache: Shared cache across multiple instances
- Strategy-Based Selection: Intelligent cache tier selection
Benefits of FACT
Revolutionary Performance Improvements
Speed Transformation
FACT delivers order-of-magnitude improvements over traditional financial data systems:
- Cache Hits: Sub-50ms response times (vs. 2-5 seconds traditional)
- Cache Misses: Under 140ms average response time
- Complex Analytics: 85% faster than traditional RAG systems
- Concurrent Processing: 1000+ queries per minute throughput
Cost Optimization Breakthrough
The intelligent caching architecture delivers unprecedented cost efficiency:
- 90% Cost Reduction: Through automated query result caching
- Token Efficiency: Automatic optimization of API token usage
- Resource Minimization: No vector databases or complex indexing required
- Scalability Economics: Linear cost scaling with exponential performance gains
Operational Excellence
FACT transforms operational characteristics of financial analytics:
- 99%+ Uptime: Robust error handling and graceful degradation
- Zero SQL Knowledge Required: Complete natural language interface
- Enterprise Security: Comprehensive audit and compliance features
FACT's Enterprise-Ready Results
With FACT, your system becomes intelligent enough to decide what to cache, when to execute tools, and how to route requests in real time—without guessing. RAG brought retrieval to language models. But FACT makes retrieval intentional, structured, and **efficient**.
