
agent-evaluation

Use this skill when you need to evaluate, improve, or optimize an existing LLM agent's output quality, including improving tool-selection accuracy or answer quality, reducing costs, or fixing cases where the agent gives wrong or incomplete responses. It evaluates agents systematically using MLflow evaluation with datasets, scorers, and tracing. IMPORTANT: Always load the instrumenting-with-mlflow-tracing skill as well before starting any work. Covers the end-to-end evaluation workflow or its individual components (tracing setup, dataset creation, scorer definition, evaluation execution).

Install / Use

# Copy GEMINI.md from https://github.com/Paldom/databricks-apps-fastapi-starter/blob/main/.gemini/skills/agent-evaluation/SKILL.md
About this skill

Gemini Rules

Gemini CLI config

Quality Score

33/100

Category

Automation

Supported Platforms

Gemini CLI

GitHub Stars: 0
Updated: 2h ago
Forks: 0

Security Score

80/100

Audited on Mar 26, 2026

1 medium, 1 low