Ecp

ECP is a standardized interface for orchestrating, auditing, and enforcing authority limits in AI agent evaluations. It moves evaluation from "brittle Python scripts" to a deterministic infrastructure protocol.

Evaluation Context Protocol (ECP)

Work in progress: this repository is actively evolving, and some concepts may change.

ECP is a lightweight protocol and reference runtime for evaluating agents on their public output, private reasoning, and tool usage. This repo contains:

  • sdk/ - Python SDK for implementing an ECP agent.
  • runtime/ - Python runtime (CLI) that runs manifests and grades results.
  • examples/ - Minimal examples (LangChain demo).
  • spec/ - Protocol specification.

Documentation

  • Docs site: https://evaluation-context-protocol.github.io/ecp/
  • Quickstart: https://evaluation-context-protocol.github.io/ecp/quickstart/
  • Specification: https://evaluation-context-protocol.github.io/ecp/spec/

Quick Start

Create a virtual environment and install from PyPI (the commands below use Windows PowerShell syntax):

py -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install ecp-runtime "ecp-sdk[langchain]" langchain-openai

Run the example manifest:

python -m ecp_runtime.cli run --manifest .\examples\langchain_demo\manifest.yaml

Generate an HTML report:

python -m ecp_runtime.cli run --manifest .\examples\langchain_demo\manifest.yaml --report .\report.html

Print a JSON report (useful for CI tooling):

python -m ecp_runtime.cli run --manifest .\examples\langchain_demo\manifest.yaml --json
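For CI use, the `--json` output can be consumed from a small wrapper script. The sketch below is only an illustration: the report schema is not described in this README, so the top-level `passed` key and the helper names `build_command`, `report_passed`, and `run_in_ci` are assumptions, not part of the ECP runtime. Check the specification for the actual field names before relying on it.

```python
import json
import subprocess
import sys


def build_command(manifest_path: str) -> list:
    """Build the CLI invocation shown above, with --json enabled."""
    return [sys.executable, "-m", "ecp_runtime.cli", "run",
            "--manifest", manifest_path, "--json"]


def report_passed(stdout_text: str) -> bool:
    """Parse the JSON report and read a top-level 'passed' flag.

    ASSUMPTION: 'passed' is a hypothetical key; the real report may differ.
    """
    report = json.loads(stdout_text)
    return bool(report.get("passed", False))


def run_in_ci(manifest_path: str) -> int:
    """Run the manifest and return an exit code suitable for a CI job."""
    proc = subprocess.run(build_command(manifest_path),
                          capture_output=True, text=True)
    if proc.returncode != 0:
        return proc.returncode
    return 0 if report_passed(proc.stdout) else 1
```

A CI step could then call `sys.exit(run_in_ci("examples/langchain_demo/manifest.yaml"))` so that a failing evaluation fails the build.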

If your manifest uses the llm_judge grader, set your OpenAI API key first:

$env:OPENAI_API_KEY="your_key_here"

Example (LangChain Agent + Manifest)

Agent (LangChain create_agent + tool usage):

from langchain.agents import create_agent
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool
from ecp import serve
from ecp.adaptors.langchain import ECPLangChainAdapter

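@tool
def calculator(expression: str) -> str:
    """Evaluate a basic arithmetic expression and return the integer result."""
    # Allow only digits, arithmetic operators, parentheses, and spaces.
    allowed = set("0123456789+-*/() ")
    if not expression or any(ch not in allowed for ch in expression):
        return "Invalid expression."
    try:
        # Builtins are disabled; int() truncates non-integer results.
        return str(int(eval(expression, {"__builtins__": {}})))
    except Exception:
        return "Invalid expression."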

agent = create_agent(
    model=ChatOpenAI(model="gpt-3.5-turbo", temperature=0),
    tools=[calculator],
    system_prompt="Use the calculator tool for arithmetic."
)

def to_messages(text: str):
    return {"messages": [{"role": "user", "content": text}]}

serve(ECPLangChainAdapter(agent, name="MathBot", input_mapper=to_messages))

Manifest (runtime checks output + tool usage):

manifest_version: "v1"
name: "LangChain Math Check"
target: "python agent.py"

scenarios:
  - name: "Ratio Word Problem"
    steps:
      - input: "Katy makes coffee using teaspoons of sugar and cups of water in the ratio of 7:13..."
        graders:
          - type: text_match
            field: public_output
            condition: contains
            value: "42"
          - type: tool_usage
            tool_name: "calculator"
            arguments: {}
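To make the two grader types in this manifest concrete, here is a minimal sketch of how such checks could be evaluated against an agent result. It is not the runtime's implementation. The function names, the shape of each tool call (a dict with `name` and `arguments` keys), and the reading of `arguments: {}` as "any arguments are accepted" are all assumptions made for illustration, and the sample result is made up.

```python
def text_match(result: dict, field: str, condition: str, value: str) -> bool:
    """Check a text field of the result; only 'contains' is sketched here."""
    text = str(result.get(field, ""))
    if condition == "contains":
        return value in text
    raise ValueError(f"unsupported condition: {condition}")


def tool_usage(result: dict, tool_name: str, arguments: dict) -> bool:
    """Pass if some tool call has the given name and the expected arguments.

    ASSUMPTION: each tool call is a dict with 'name' and 'arguments' keys,
    and an empty `arguments` mapping matches any call to that tool.
    """
    for call in result.get("tool_calls", []):
        if call.get("name") != tool_name:
            continue
        actual = call.get("arguments", {})
        if all(actual.get(key) == expected for key, expected in arguments.items()):
            return True
    return False


# Made-up sample result, only to exercise the two checks.
sample = {
    "public_output": "The answer is 42.",
    "tool_calls": [{"name": "calculator", "arguments": {"expression": "6*7"}}],
}
print(text_match(sample, "public_output", "contains", "42"))
print(tool_usage(sample, "calculator", {}))
```

Under these assumptions a step passes only when every grader attached to it returns true, which is why the manifest can require both a correct answer and evidence that the tool was actually called.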

ECP in 60 Seconds

ECP is JSON-RPC 2.0 over stdio. The runtime launches your agent process and calls:

  • agent/initialize
  • agent/step
  • agent/reset
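As an illustration of the calling side, the sketch below builds JSON-RPC 2.0 request objects for these three methods. The `jsonrpc`, `id`, `method`, and `params` members come from the JSON-RPC 2.0 specification. The contents of `params` (here a hypothetical `input` key) and the helper name `make_request` are assumptions; spec/protocol.md defines the actual shapes.

```python
import itertools
import json
from typing import Optional

# Monotonic request ids so replies can be matched to requests.
_ids = itertools.count(1)


def make_request(method: str, params: Optional[dict] = None) -> str:
    """Serialize one JSON-RPC 2.0 request as a single line of JSON."""
    request = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        request["params"] = params
    return json.dumps(request)


# ASSUMPTION: 'input' is a hypothetical parameter name for agent/step.
for line in (
    make_request("agent/initialize"),
    make_request("agent/step", {"input": "What is 6*7?"}),
    make_request("agent/reset"),
):
    print(line)
```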

Your agent replies with a structured result containing:

  • public_output (what the user sees)
  • private_thought (for evaluators)
  • tool_calls (actions taken)
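On the agent side, a reply carries these three fields inside a JSON-RPC result. The following is a hand-rolled sketch for illustration, not the SDK's server loop (the SDK's `serve` handles this for you). It assumes one JSON message per line and a hypothetical `input` key in `params`; neither is confirmed by this README.

```python
import io
import json


def handle(request: dict) -> dict:
    """Map one JSON-RPC request to a response object."""
    method = request.get("method")
    if method == "agent/step":
        # ASSUMPTION: the step text arrives under a hypothetical 'input' key.
        text = request.get("params", {}).get("input", "")
        result = {
            "public_output": f"Echo: {text}",
            "private_thought": "No reasoning; this sketch only echoes.",
            "tool_calls": [],
        }
    elif method in ("agent/initialize", "agent/reset"):
        result = {}  # ASSUMPTION: an empty result acknowledges the call.
    else:
        # -32601 is the JSON-RPC 2.0 "Method not found" error code.
        return {"jsonrpc": "2.0", "id": request.get("id"),
                "error": {"code": -32601, "message": "Method not found"}}
    return {"jsonrpc": "2.0", "id": request.get("id"), "result": result}


def serve_stdio(stdin, stdout) -> None:
    """Read newline-delimited requests and write one response per line."""
    for line in stdin:
        if line.strip():
            stdout.write(json.dumps(handle(json.loads(line))) + "\n")
            stdout.flush()


# Demo with in-memory streams instead of the real stdin/stdout.
demo_in = io.StringIO(json.dumps(
    {"jsonrpc": "2.0", "id": 1, "method": "agent/step",
     "params": {"input": "hi"}}) + "\n")
demo_out = io.StringIO()
serve_stdio(demo_in, demo_out)
print(demo_out.getvalue(), end="")
```

In a real agent process the same loop would be started with `serve_stdio(sys.stdin, sys.stdout)`. Keeping `private_thought` separate from `public_output` is what lets evaluators inspect reasoning without it appearing in the user-visible answer.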

See spec/protocol.md for the full protocol.

Repo Layout

  • sdk/python/src/ecp - SDK decorators and server loop
  • runtime/python/src/ecp_runtime - CLI, runner, graders
  • examples/langchain_demo - LangChain-based demo agent and manifest

Status

This project is evolving quickly. Expect changes between minor versions.
