Ecp

ECP is a standardized interface for orchestrating, auditing, and enforcing authority limits in AI agent evaluations. It moves evaluation from "brittle Python scripts" to a deterministic infrastructure protocol.

Install / Use

/learn @evaluation-context-protocol/Ecp

Evaluation Context Protocol (ECP)

Work in progress: this repository is actively evolving, and some concepts may change.

A lightweight protocol and reference runtime for evaluating agents on their public output, private reasoning, and tool usage. This repo contains:

sdk/ - Python SDK for implementing an ECP agent.
runtime/ - Python runtime (CLI) that runs manifests and grades results.
examples/ - Minimal examples (LangChain demo).
spec/ - Protocol specification.

Documentation

Docs site: https://evaluation-context-protocol.github.io/ecp/
Quickstart: https://evaluation-context-protocol.github.io/ecp/quickstart/
Specification: https://evaluation-context-protocol.github.io/ecp/spec/

Quick Start

Create a virtual environment and install from PyPI (the commands below are for Windows PowerShell):

py -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install ecp-runtime "ecp-sdk[langchain]" langchain-openai

Run the example manifest:

python -m ecp_runtime.cli run --manifest .\examples\langchain_demo\manifest.yaml

Generate an HTML report:

python -m ecp_runtime.cli run --manifest .\examples\langchain_demo\manifest.yaml --report .\report.html

Print a JSON report (useful for CI tooling):

python -m ecp_runtime.cli run --manifest .\examples\langchain_demo\manifest.yaml --json
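For CI scripts it can be convenient to assemble these invocations programmatically. The following is a minimal sketch that only uses the flags shown above (`run`, `--manifest`, `--report`, `--json`). The helper names `build_ecp_command` and `run_manifest` are hypothetical and not part of the ECP packages, and whether `--report` and `--json` may be combined in a single run is an assumption.

```python
import subprocess
import sys
from typing import List, Optional


def build_ecp_command(
    manifest: str,
    report: Optional[str] = None,
    as_json: bool = False,
) -> List[str]:
    """Build the argv list for `python -m ecp_runtime.cli run` (hypothetical helper)."""
    # Use the current interpreter so the active virtual environment is respected.
    cmd = [sys.executable, "-m", "ecp_runtime.cli", "run", "--manifest", manifest]
    if report is not None:
        # Path where the HTML report should be written.
        cmd += ["--report", report]
    if as_json:
        # Ask the runtime to print a JSON report instead.
        cmd.append("--json")
    return cmd


def run_manifest(manifest: str, **options) -> subprocess.CompletedProcess:
    """Run the CLI and capture its output; the output format is not assumed here."""
    return subprocess.run(
        build_ecp_command(manifest, **options),
        capture_output=True,
        text=True,
    )
```

A CI step could call `run_manifest("examples/langchain_demo/manifest.yaml", as_json=True)` and inspect `returncode` and `stdout` on the returned object. The meaning of the exit code and the JSON layout should be taken from the specification rather than from this sketch.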

If your manifest uses llm_judge, set your OpenAI API key:

$env:OPENAI_API_KEY="your_key_here"

Example (LangChain Agent + Manifest)

Agent (LangChain create_agent + tool usage):

from langchain.agents import create_agent
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool
from ecp import serve
from ecp.adaptors.langchain import ECPLangChainAdapter

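@tool
def calculator(expression: str) -> str:
    """Evaluate a basic arithmetic expression and return the integer result as text."""
    # Accept only digits, arithmetic operators, parentheses, and spaces.
    allowed = set("0123456789+-*/() ")
    if not expression or any(ch not in allowed for ch in expression):
        return "Invalid expression."
    try:
        # Builtins are disabled; int() truncates fractional results toward zero.
        return str(int(eval(expression, {"__builtins__": {}})))
    except Exception:
        # Covers syntax errors, division by zero, and similar failures.
        return "Invalid expression."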

agent = create_agent(
    model=ChatOpenAI(model="gpt-3.5-turbo", temperature=0),
    tools=[calculator],
    system_prompt="Use the calculator tool for arithmetic."
)

def to_messages(text: str):
    return {"messages": [{"role": "user", "content": text}]}

serve(ECPLangChainAdapter(agent, name="MathBot", input_mapper=to_messages))
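To see what the calculator tool above accepts and rejects, its logic can be exercised without LangChain or an API key. This sketch copies the same character whitelist and `eval` call into a plain function; the name `calculate` is hypothetical and used only for illustration.

```python
# Same whitelist as the calculator tool: digits, operators, parentheses, spaces.
ALLOWED_CHARS = set("0123456789+-*/() ")


def calculate(expression: str) -> str:
    """Plain-function copy of the tool body, for local experimentation."""
    if not expression or any(ch not in ALLOWED_CHARS for ch in expression):
        return "Invalid expression."
    try:
        # Builtins are disabled; int() truncates fractional results.
        return str(int(eval(expression, {"__builtins__": {}})))
    except Exception:
        return "Invalid expression."


print(calculate("(7 + 13) * 2"))  # 40
print(calculate("7/2"))           # 3 (3.5 truncated by int())
print(calculate("1/0"))           # Invalid expression.
print(calculate("abs(-1)"))       # Invalid expression. (letters are rejected)
```

Note that fractional results are truncated rather than rounded, so a prompt whose answer depends on a non-integer intermediate value may not behave as expected with this tool.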

Manifest (runtime checks output + tool usage):

manifest_version: "v1"
name: "LangChain Math Check"
target: "python agent.py"

scenarios:
  - name: "Ratio Word Problem"
    steps:
      - input: "Katy makes coffee using teaspoons of sugar and cups of water in the ratio of 7:13..."
        graders:
          - type: text_match
            field: public_output
            condition: contains
            value: "42"
          - type: tool_usage
            tool_name: "calculator"
            arguments: {}
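
The two graders in this manifest can be read as simple predicates over a step result. The sketch below is an illustration only, not the runtime's implementation: the result shape (a dict with `public_output` and a `tool_calls` list of `name`/`arguments` entries) is assumed, and treating an empty `arguments: {}` as "any arguments" is likewise an assumption. The specification defines the actual semantics.

```python
from typing import Any, Dict


def text_match(result: Dict[str, Any], field: str, condition: str, value: str) -> bool:
    """Hypothetical text_match grader: only the `contains` condition is sketched."""
    text = str(result.get(field, ""))
    if condition == "contains":
        return value in text
    raise ValueError(f"Condition not covered by this sketch: {condition}")


def tool_usage(result: Dict[str, Any], tool_name: str, arguments: Dict[str, Any]) -> bool:
    """Hypothetical tool_usage grader: pass if any call matches name and arguments."""
    for call in result.get("tool_calls", []):
        if call.get("name") != tool_name:
            continue
        call_args = call.get("arguments", {})
        # Assumption: every expected key must match; an empty dict matches any call.
        if all(call_args.get(key) == expected for key, expected in arguments.items()):
            return True
    return False


# Assumed result shape, for illustration only.
step_result = {
    "public_output": "Katy used 42 teaspoons of sugar.",
    "tool_calls": [{"name": "calculator", "arguments": {"expression": "120 * 7 / 20"}}],
}

print(text_match(step_result, "public_output", "contains", "42"))  # True
print(tool_usage(step_result, "calculator", {}))                   # True
```

Pairing the two graders means a step passes only when the visible answer is right and the agent actually called the tool, which is the combination the manifest above checks for.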