AcolyteRAG
Pure-Python, zero-dependency RAG memory engine for conversational AI. Retrieves semantically relevant messages from conversation history using two-phase retrieval, TF-IDF + concept-overlap scoring, narrative element extraction, and bidirectional typo correction. No embeddings or vector DB required.
AcolyteRAG is the open-source RAG (Retrieval-Augmented Generation) engine powering AcolyteAI. It retrieves semantically relevant messages from a conversation history to provide context for language model generation.
Features
- Two-phase retrieval — fast Jaccard pre-filter → detailed TF-IDF + concept-overlap scoring with 10 blended signals
- Narrative element extraction — automatically identifies emotions, actions, locations, and named entities in text
- Bidirectional typo correction — OSA distance-based fuzzy matching corrects misspellings in both queries and candidates
- Diversity clustering — concept-based clustering ensures retrieved memories cover different topics
- Extensible concept registry — 36 built-in semantic groups; add any domain in one line
- Token-budget mode — fill a context window to a target token count instead of a fixed count
- Concept Manager GUI — local web app for managing concepts and tuning scoring weights with live preview
- Sync + async — retrieve_related_messages and retrieve_related_messages_async
- Zero external dependencies — pure Python stdlib
Installation
# Clone and install in editable mode
git clone https://github.com/pastor0711/AcolyteRAG.git
cd AcolyteRAG
pip install -e .
Quick Start
from acolyterag import retrieve_related_messages

history = [
    {"role": "user", "content": "Tell me about Python."},
    {"role": "assistant", "content": "Python is a high-level language known for readability."},
    # ... more messages
]

results = retrieve_related_messages(history, query_text="Python web frameworks", max_retrieved=3)
for msg in results:
    print(msg["content"])  # prefixed with [RELATED_MEMORY]
Adding Your Own Concepts
The concept registry drives semantic matching. Extend it at any time:
from acolyterag import register_concepts
register_concepts("cooking", ["bake", "roast", "saute", "simmer", "fry", "broil"])
register_concepts("legal", ["contract", "lawsuit", "plaintiff", "verdict", "appeal"])
register_concepts("medical", ["diagnosis", "symptom", "treatment", "prognosis", "dosage"])
Changes take effect immediately for all subsequent retrieval calls. Remove a group with unregister_concepts("cooking").
Multi-word phrases are also supported — registering "best friend" will match the phrase as a single token during tokenization.
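As a rough illustration of how a registered phrase could end up as a single token, the sketch below merges known phrases greedily while tokenizing. The helper name `tokenize_with_phrases` and the underscore-joined token form are assumptions made for this example, not AcolyteRAG's actual tokenizer, and no stemming or stopword filtering is applied.

```python
import re

def tokenize_with_phrases(text, phrases):
    """Split text into lowercase tokens, keeping known phrases whole.

    Hypothetical helper: each phrase in `phrases` is emitted as one
    underscore-joined token instead of separate words.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    # Represent phrases as word tuples; try longer phrases first.
    phrase_words = sorted(
        {tuple(p.lower().split()) for p in phrases if p.split()},
        key=len, reverse=True,
    )
    tokens, i = [], 0
    while i < len(words):
        for pw in phrase_words:
            if tuple(words[i:i + len(pw)]) == pw:
                tokens.append("_".join(pw))
                i += len(pw)
                break
        else:
            # No phrase starts at this position; keep the single word.
            tokens.append(words[i])
            i += 1
    return tokens

print(tokenize_with_phrases("She is my best friend.", ["best friend"]))
# ['she', 'is', 'my', 'best_friend']
```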
Concept Manager GUI
A local web app for managing concept groups and tuning scoring weights without touching code.
python concept_manager.py
# Opens http://localhost:7842
Concepts Tab
- Sidebar — lists all concept groups with word counts; searchable
- Main panel — shows words as removable tags; add new words via the input field
- New group — click "+ New group", name it, add optional seed words
- Delete group — removes the group and its words; automatically cleans up any scoring dimension references
Scoring Tab
- Narrative Scoring Dimensions — manage which concept groups map to each scoring dimension (emotions, actions, locations) with adjustable weights per dimension
- Blend Weights — sliders for all 10 scoring signals (TF-IDF, concept overlap, narrative, etc.)
- Live Preview — enter a query and candidate message to compute a real-time similarity score using your current (including unsaved) weight configuration
- Save All — persists updated weights to scoring.py and dimension groups to narrative_elements.py
All changes are written directly to the source files (concepts.py, scoring.py, narrative_elements.py) with automatic rollback on failure.
API Reference
Retrieval
retrieve_related_messages(
    history,                             # List[Dict[str, str]] — full conversation
    query_text,                          # str — current query
    max_retrieved=4,                     # max memories to return
    exclude_last_n=6,                    # skip the N most recent messages
    enable_clustering=True,              # diversity clustering
    importance_weight=0.3,               # blend of relevance vs message importance (0–1)
    enable_token_based_retrieval=False,  # fill a token budget instead of fixed count
    target_token_count=14000,            # token budget target
    current_token_count=0,               # tokens already used
    max_retrieved_for_token_target=50,
)
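One plausible reading of the token-budget parameters above is that ranked candidates are added until the running total would pass `target_token_count` or `max_retrieved_for_token_target` items have been selected. The sketch below illustrates that reading only. The names `fill_token_budget` and `estimate_tokens`, the four-characters-per-token estimate, and the skip-if-too-large rule are all assumptions, not the library's implementation.

```python
def estimate_tokens(text):
    """Crude token estimate (assumption: roughly 4 characters per token)."""
    return max(1, len(text) // 4)

def fill_token_budget(ranked_messages, target_token_count=14000,
                      current_token_count=0, max_items=50):
    """Pick messages in rank order without exceeding the token target."""
    selected, used = [], current_token_count
    for msg in ranked_messages:
        if len(selected) >= max_items:
            break
        cost = estimate_tokens(msg["content"])
        if used + cost > target_token_count:
            continue  # too large for the remaining budget; try later ones
        selected.append(msg)
        used += cost
    return selected, used

ranked = [
    {"role": "user", "content": "a" * 400},       # about 100 tokens
    {"role": "assistant", "content": "b" * 800},  # about 200 tokens
    {"role": "user", "content": "c" * 200},       # about 50 tokens
]
picked, total = fill_token_budget(ranked, target_token_count=250,
                                  current_token_count=100)
print(len(picked), total)  # 2 250
```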
retrieve_related_messages_async() accepts the same arguments and runs in a thread-pool executor.
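Running a blocking function in a thread-pool executor generally follows the standard `asyncio` pattern sketched below. `slow_retrieve` is a hypothetical stand-in for the synchronous call and `retrieve_async` is an illustrative wrapper; neither is AcolyteRAG's own code.

```python
import asyncio
import functools

def slow_retrieve(history, query_text, max_retrieved=4):
    """Hypothetical stand-in for a blocking retrieval call."""
    hits = [m for m in history if query_text.lower() in m["content"].lower()]
    return hits[:max_retrieved]

async def retrieve_async(history, query_text, **kwargs):
    """Run the blocking function in the default thread-pool executor."""
    loop = asyncio.get_running_loop()
    call = functools.partial(slow_retrieve, history, query_text, **kwargs)
    return await loop.run_in_executor(None, call)

history = [
    {"role": "user", "content": "Tell me about Python."},
    {"role": "assistant", "content": "Python is a high-level language."},
    {"role": "user", "content": "What about Rust?"},
]
results = asyncio.run(retrieve_async(history, "python", max_retrieved=1))
print(results[0]["content"])  # Tell me about Python.
```

Because the work runs in a worker thread, the event loop stays free to serve other coroutines while retrieval is in progress.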
Concept Registry
| Function | Description |
|---|---|
| register_concepts(group, words) | Add words to a concept group (creates if new) |
| unregister_concepts(group) | Remove a concept group |
| list_concept_groups() | List all registered groups |
Analysis Utilities
| Function | Description |
|---|---|
| summarize_conversation_window(history, start, end) | Short text summary of a history slice |
| extract_conversation_topics(history, min_frequency) | Token frequency map of recurring topics |
| get_memory_statistics(history) | Message counts, entities, avg importance, topics |
| create_memory_index(history, chunk_size) | Chunked index with summaries and metadata |
How It Works
Query
│
▼
Tokenize (normalize → stem → filter stopwords → detect multi-word phrases)
│
▼
Expand concepts (map tokens to semantic groups)
│
▼
Extract narrative elements (emotions, actions, locations, entities)
│
▼
Fast pass (Jaccard overlap + importance + recency) → top 30 candidates
│ (60 for complex queries)
▼
Bidirectional typo correction (OSA distance, both query ↔ candidate)
│
▼
Build IDF over fast-pass candidates
│
▼
Detailed pass (10-signal blend: TF-IDF + concepts + narrative + bigrams + bonuses)
│
▼
Semantic signal filter (require at least one token/concept/entity overlap)
│
▼
Diversity clustering → round-robin pick from each cluster
│
▼
Return tagged memories [RELATED_MEMORY] in chronological order
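The typo-correction step in the pipeline above relies on OSA (optimal string alignment) distance, which counts insertions, deletions, substitutions, and swaps of adjacent characters. The sketch below implements the textbook algorithm and a small helper that maps tokens from one side onto their nearest counterpart on the other. The helper `align_tokens` and the `max_distance=1` threshold are assumptions for illustration, not the library's actual correction logic.

```python
def osa_distance(a, b):
    """Optimal string alignment distance between two strings."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
            if (i > 1 and j > 1 and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # adjacent swap
    return d[-1][-1]

def align_tokens(tokens, other_tokens, max_distance=1):
    """Map each token to its closest token on the other side, if near enough."""
    vocabulary = sorted(set(other_tokens))
    aligned = []
    for tok in tokens:
        best = min(vocabulary, key=lambda w: osa_distance(tok, w), default=tok)
        aligned.append(best if osa_distance(tok, best) <= max_distance else tok)
    return aligned

query = ["pyhton", "framework"]                 # typo in the query
candidate = ["python", "framwork", "django"]    # typo in the candidate
print(osa_distance("pyhton", "python"))         # 1
print(align_tokens(query, candidate))           # ['python', 'framwork']
```

Calling `align_tokens(candidate, query)` handles the opposite direction, so a misspelling on either side can still produce a token match.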
Scoring Mechanics (Blend Weights)
AcolyteRAG uses 10 configurable blend weights during the detailed scoring pass to determine semantic relevance. These can be adjusted dynamically via the Concept Manager UI:
- tfidf (20%): TF-IDF Cosine Similarity. Measures keyword frequency, giving higher importance to rare/unique words over common ones.
- idf_overlap (15%): IDF Overlap Ratio. Calculates what percentage of the "important" (rare) meaning in the query is captured by the candidate text.
- concept (20%): Concept Matching. Evaluates the Jaccard overlap of semantic concept groups between the query and candidate, finding matching ideas even with different terminology.
- bigram (5%): Bigram Overlap. Checks for exact 2-word phrase matches, rewarding candidates that preserve word order.
- token_coef (5%): Token Overlap Coefficient. Shared tokens divided by the shorter text's token count, which avoids penalizing longer texts.
- narrative (25%): Narrative Similarity. Extracts narrative elements and checks for shared emotions (30%), actions (30%), entities/characters (25%), and locations (20%). Ensures thematic and emotional alignment.
- substring (10%): Substring Bonus. A flat +0.2 bonus if the entire query (>10 chars) is found perfectly intact within the candidate.
- entity_bonus (5%): Entity Matching Bonus. Awards up to +0.3 bonus if both texts reference the same named entities (capitalized words, with sentence-start false-positive filtering).
- temporal_bonus (10%): Temporal Context Bonus. Adds +0.1 flat bonus if both texts contain time-anchoring words (e.g., "tomorrow", "yesterday", "later").
- action_bonus (10%): Action/Event Bonus. Adds +0.1 flat bonus if both texts contain event-driven action verbs (e.g., "arrives", "happened", "took place").
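Three of the lexical signals described above (TF-IDF cosine, token overlap coefficient, bigram overlap) can be sketched with the standard library alone. The function names, the smoothed IDF formula, the set-based overlap, and the choice to normalize bigram overlap by the query's bigram count are assumptions for illustration; the formulas in scoring.py may differ.

```python
import math
from collections import Counter

def build_idf(documents):
    """Smoothed IDF over tokenized documents: log((N + 1) / (df + 1)) + 1."""
    n = len(documents)
    df = Counter(tok for doc in documents for tok in set(doc))
    return {tok: math.log((n + 1) / (c + 1)) + 1.0 for tok, c in df.items()}

def tfidf_cosine(a, b, idf):
    """Cosine similarity between the TF-IDF vectors of two token lists."""
    va = {t: c * idf.get(t, 1.0) for t, c in Counter(a).items()}
    vb = {t: c * idf.get(t, 1.0) for t, c in Counter(b).items()}
    dot = sum(w * vb.get(t, 0.0) for t, w in va.items())
    norm = (math.sqrt(sum(w * w for w in va.values()))
            * math.sqrt(sum(w * w for w in vb.values())))
    return dot / norm if norm else 0.0

def overlap_coefficient(a, b):
    """Shared tokens divided by the size of the smaller token set."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / min(len(sa), len(sb)) if sa and sb else 0.0

def bigram_overlap(query, candidate):
    """Fraction of the query's adjacent word pairs found in the candidate."""
    bq = set(zip(query, query[1:]))
    bc = set(zip(candidate, candidate[1:]))
    return len(bq & bc) / len(bq) if bq else 0.0

query = ["python", "web", "framework"]
candidates = [
    ["python", "web", "framework", "django"],
    ["python", "is", "readable"],
]
idf = build_idf(candidates)  # IDF built over the candidate pool only
for cand in candidates:
    print(round(tfidf_cosine(query, cand, idf), 3),
          overlap_coefficient(query, cand),
          bigram_overlap(query, cand))
```

The first candidate shares every query token and both query bigrams, so it scores higher on all three signals than the second, which shares only "python".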
The final retrieval score blends relevance with message importance: (1 - importance_weight) × detailed_score + importance_weight × importance_score.
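A worked instance of this blend, as a minimal sketch. The function name `final_score` and the range check are illustrative additions; the formula itself is the one stated above.

```python
def final_score(detailed_score, importance_score, importance_weight=0.3):
    """Blend relevance with message importance using the stated formula."""
    if not 0.0 <= importance_weight <= 1.0:
        raise ValueError("importance_weight must be between 0 and 1")
    return ((1 - importance_weight) * detailed_score
            + importance_weight * importance_score)

# With the default weight: 0.7 * 0.8 + 0.3 * 0.5 = 0.71
print(round(final_score(0.8, 0.5), 2))  # 0.71
```

Setting `importance_weight` to 0 ranks purely by the detailed relevance score, while 1 ranks purely by message importance.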
Module Structure
| Module | Purpose |
|---|---|
| constants.py | Stopwords, irregular verb lemma map, regex patterns |
| text_processing.py | Normalization, tokenization, custom stemmer/lemmatizer |
| concepts.py | Concept registry (36 groups, ~525 words), dynamic registration API |
| concept_expansion.py | Maps tokens to their semantic concept groups |
| narrative_elements.py | Extracts emotions, actions, locations, and named entities from text |
| importance.py | Heuristic message importance scoring (length, emotions, actions, entities) |
| scoring.py | TF-IDF, Jaccard, overlap coefficient, 10-signal blended scoring engine |
| retrieval.py | Two-phase retrieval pipeline, typo correction, clustering, token-budget mode |
| analysis.py | Conversation summarization, topic extraction, memory statistics, indexing |
| concept_manager.py | Local HTTP server + REST API for the management GUI |
| manager.html/css/js | Web frontend for the Concept Manager |
Built-in Concept Groups
36 semantic groups organized into 6 categories:
| Category | Groups |
|---|---|
| Emotions | love, anger, fear, sadness, happiness, trust, betrayal, surprise, guilt |
| Actions | meeting, conflict, help, travel, work, create, destroy, investigate |
| Locations | home, workplace, school, outdoors, social_venue |
| Relationships | family, friendship, romance, authority, adversary |
| Themes | mental_state, psychology, health, past, future, communication, knowledge, morality, power, mystery |
| Genres | fantasy, sci_fi, horror |
See concepts.py for the full word lists.
Testing
# Run the full test suite
pytest
# Run a quick smoke test
python smoke_test.py
The test suite includes ~97 tests covering retrieval accuracy, scoring mechanics, typo correction, concept manager HTTP API, file persistence with rollback, cache invalidation, and input validation.
