AgenticRAGTracer

Install / Use

/learn @YqjMartin/AgenticRAGTracer

AgenticRAGTracer: A Hop-Aware Benchmark for Diagnosing Multi-Step Retrieval Reasoning in Agentic RAG

📁 Repository Structure

AgenticRAGTracer
├── multihop_pipeline.py      # Core multi-hop generation pipeline
├── multihop_run.py           # Parallel runner for multi-hop QA generation
├── multihop_prompt.yaml      # The prompts used in the multi-hop generation pipeline
├── evaluation.py             # Agentic RAG evaluation script
├── retriever_serving.py      # Retriever service
├── retriever_config.yaml     # Retriever configuration
└── README.md

🔎 Retriever Service

Start the retriever service first. Set the following fields in retriever_config.yaml, then launch the server with retriever_serving.py.

# retriever_config.yaml
gpu_id: ""
retrieval_method: "e5"
retrieval_model_path: "e5-base-v2"
index_path: "e5_flat_inner.index" 
faiss_gpu: False
corpus_path: "wiki18_100w.jsonl"

# retriever_serving.py
python retriever_serving.py \
  --config retriever_config.yaml \
  --port 8000
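
The file name e5_flat_inner.index suggests a flat (exhaustive) inner-product index over E5 embeddings. As a rough illustration of what such a lookup computes, here is a minimal pure-Python sketch. It is not the code in retriever_serving.py: the function names and toy vectors are hypothetical, and a real deployment would use an index library over model-produced embeddings.

```python
from typing import List, Sequence, Tuple


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def flat_ip_search(
    query_vec: Sequence[float],
    passage_vecs: List[Sequence[float]],
    top_k: int = 2,
) -> List[Tuple[int, float]]:
    """Score every passage against the query and return the top_k
    (passage_index, score) pairs, highest inner product first."""
    scored = [(inner_product(query_vec, vec), idx)
              for idx, vec in enumerate(passage_vecs)]
    # Sort by descending score; break ties by ascending passage index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(idx, score) for score, idx in scored[:top_k]]


# Toy 2-dimensional "embeddings" (hypothetical values, for illustration only).
passages = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
query = [0.8, 0.6]
results = flat_ip_search(query, passages, top_k=2)
print(results)
```

A flat index compares the query against every passage, so results are exact but cost grows linearly with corpus size.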

🔄 Multi-hop Data Generation

Use multihop_run.py and multihop_pipeline.py to build the multi-hop QA data for AgenticRAGTracer. Set the variables below in multihop_pipeline.py, then run multihop_run.py.

# multihop_pipeline.py
API_URL = ""                        # The URL address of the LLM API you are using
API_KEY = ""                        # API key
DEFAULT_MODEL = ""                  # The model you want to use for generation

# multihop_run.py
python multihop_run.py
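
The repository structure describes multihop_run.py as a parallel runner. The general pattern of fanning generation jobs out over a thread pool can be sketched as below. This is an assumption-laden illustration, not the script's actual code: generate_one, run_parallel, and the seed IDs are hypothetical stand-ins, and the real pipeline would call the configured LLM API instead of returning a placeholder.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def generate_one(seed_id):
    """Hypothetical stand-in for one multi-hop QA generation call."""
    return {"seed_id": seed_id,
            "question": f"placeholder question for seed {seed_id}"}


def run_parallel(seed_ids, max_workers=4):
    """Run generate_one over all seeds in a thread pool.

    Returns (results sorted by seed_id, list of (seed_id, error) failures).
    """
    results, failures = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(generate_one, s): s for s in seed_ids}
        for fut in as_completed(futures):
            seed = futures[fut]
            try:
                results.append(fut.result())
            except Exception as exc:
                # Record the failure and keep going with the other jobs.
                failures.append((seed, repr(exc)))
    results.sort(key=lambda r: r["seed_id"])
    return results, failures


generated, failed = run_parallel(range(5), max_workers=2)
print(len(generated), "generated;", len(failed), "failed")
```

Collecting failures instead of raising lets a long generation run finish even if some API calls fail.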

📊 Evaluation

Use evaluation.py to evaluate an LLM on the benchmark. Set the following variables in the script before running it.

MAX_WORKERS =                    # Maximum number of worker threads
MODEL_ID = ""                    # Name of the LLM to evaluate
API_BASE = ""                    # API base URL
API_URL = ""                     # Full URL of the API
API_KEY = ""                     # API key
DEFAULT_MODEL = ""               # Name of the LLM used as the judge
INPUT_JSONL_LIST = []            # Paths to the benchmark JSONL files
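
Since the benchmark is hop-aware, one natural way to read the results is accuracy grouped by hop count. The sketch below shows how records from a list of JSONL files could be loaded and aggregated. The field names "hop" and "correct" are assumptions made for illustration, not the benchmark's documented schema, and the helper functions are hypothetical rather than part of evaluation.py.

```python
import json
from collections import defaultdict


def load_jsonl(paths):
    """Read every non-empty line of each JSONL file into a list of dicts."""
    records = []
    for path in paths:
        with open(path, encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if line:
                    records.append(json.loads(line))
    return records


def accuracy_by_hop(records):
    """Return {hop_count: fraction judged correct}, ordered by hop count.

    Assumes each record has a "hop" field (int) and a "correct" field (bool).
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for rec in records:
        hop = rec["hop"]
        totals[hop] += 1
        correct[hop] += int(bool(rec["correct"]))
    return {hop: correct[hop] / totals[hop] for hop in sorted(totals)}


# In-memory demo records (hypothetical values, not benchmark results).
demo_records = [
    {"hop": 2, "correct": True},
    {"hop": 2, "correct": False},
    {"hop": 3, "correct": True},
]
print(accuracy_by_hop(demo_records))
```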
