Opik
Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.
<a id="-what-is-opik"></a>
🚀 What is Opik?
Opik (built by Comet) is an open-source platform designed to streamline the entire lifecycle of LLM applications. It empowers developers to evaluate, test, monitor, and optimize their models and agentic systems. Key offerings include:
- Comprehensive Observability: Deep tracing of LLM calls, conversation logging, and agent activity.
- Advanced Evaluation: Robust prompt evaluation, LLM-as-a-judge, and experiment management.
- Production-Ready: Scalable monitoring dashboards and online evaluation rules for production.
- Opik Agent Optimizer: Dedicated SDK and set of optimizers to enhance prompts and agents.
- Opik Guardrails: Features to help you implement safe and responsible AI practices.
Key capabilities include:
- Development & Tracing:
  - Track all LLM calls and traces with detailed context during development and in production (Quickstart).
  - Integrate with a growing list of third-party frameworks for easy observability, with native support for many of the largest and most popular ones, including recent additions such as Google ADK, Autogen, and Flowise AI (Integrations).
  - Annotate traces and spans with feedback scores via the Python SDK or the UI.
  - Experiment with prompts and models in the Prompt Playground.
- Evaluation & Testing:
  - Automate your LLM application evaluation with Datasets and Experiments.
  - Use LLM-as-a-judge metrics for complex tasks such as hallucination detection, moderation, and RAG assessment (Answer Relevance, Context Precision).
  - Integrate evaluations into your CI/CD pipeline with the PyTest integration.
- Production Monitoring & Optimization:
  - Log high volumes of production traces: Opik is designed for scale (40M+ traces/day).
  - Monitor feedback scores, trace counts, and token usage over time in the Opik Dashboard.
  - Use Online Evaluation Rules with LLM-as-a-judge metrics to identify production issues.
  - Use Opik Agent Optimizer and Opik Guardrails to continuously improve and secure your LLM applications in production.
Tip: If you are looking for features that Opik doesn't have today, please raise a new Feature request 🚀
<a id="%EF%B8%8F-opik-server-installation"></a>
🛠️ Opik Server Installation
Get your Opik server running in minutes. Choose the option that best suits your needs:
Option 1: Comet.com Cloud (Easiest & Recommended)
Access Opik instantly without any setup. Ideal for quick starts and hassle-free maintenance.
👉 Create your free Comet account
Option 2: Self-Host Opik for Full Control
Deploy Opik in your own environment. Choose between Docker for local setups or Kubernetes for scalability.
Self-Hosting with Docker Compose (for Local Development & Testing)
This is the simplest way to get a local Opik instance running. Note the new ./opik.sh installation script:
On Linux or macOS:
# Clone the Opik repository
git clone https://github.com/comet-ml/opik.git
# Navigate to the repository
cd opik
# Start the Opik platform
./opik.sh
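Since this route depends on Git to clone the repository and Docker to run the services, a quick pre-flight check can catch a missing tool before the start script runs. The following is a minimal sketch, not part of the Opik repository: the `missing_tools` and `preflight` helpers are hypothetical names, and the choice of `git` and `docker` as the tools to check is an assumption drawn from the steps above.

```shell
# Hypothetical helper (not shipped with Opik): print each named command
# that cannot be found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

# Assumption: the Docker Compose route needs git (to clone) and docker (to run).
# Returns non-zero and lists the missing tools on stderr if any are absent.
preflight() {
  missing="$(missing_tools git docker)"
  if [ -n "$missing" ]; then
    printf 'Missing required tools:\n%s\n' "$missing" >&2
    return 1
  fi
  printf 'All required tools found.\n'
}
```

With these functions defined, `preflight && ./opik.sh` run from the repository root would start Opik only when both tools are available.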
On Windows:
# Clone the Opik repository
git clone https://github.com/comet-ml/opik.git
# Navigate to the repository
cd opik
# Start the Opik platform
powershell -ExecutionPolicy ByPass -c ".\\opik.ps1"
Service Profiles for Development
The Opik installation scripts now support service profiles for different development scenarios:
# Start full Opik suite (default behavior)
./opik.sh
# Start only infrastructure services (databases, caches etc.)
./opik.sh --infra
# Start infrastructure + backend services
./opik.sh --backend
# Enable guardrails with any profile
./opik.sh --guardrails # Guardrails with full Opik suite
./opik.sh --backend --guardrails # Guardrails with infrastructure + backend
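The profile flags above compose in a regular way: at most one profile flag, optionally followed by `--guardrails`. As an illustration only, the sketch below wraps that rule in a small function. `opik_profile_flags` and the profile names `full`, `infra`, and `backend` are hypothetical conveniences, not part of Opik; only the `--infra`, `--backend`, and `--guardrails` flags come from the commands listed above.

```shell
# Hypothetical wrapper (not part of Opik): translate a profile name into
# the opik.sh flags listed above. "full" maps to no flag (the default).
# Pass "guardrails" as the second argument to append --guardrails.
opik_profile_flags() {
  profile="$1"
  guardrails="${2:-}"
  case "$profile" in
    full)    flags="" ;;
    infra)   flags="--infra" ;;
    backend) flags="--backend" ;;
    *)
      printf 'Unknown profile: %s\n' "$profile" >&2
      return 1
      ;;
  esac
  if [ "$guardrails" = "guardrails" ]; then
    # Add a separating space only when a profile flag is already present.
    flags="${flags:+$flags }--guardrails"
  fi
  printf '%s\n' "$flags"
}
```

One possible use is `./opik.sh $(opik_profile_flags backend guardrails)`, where the command substitution is left unquoted on purpose so that the flags are passed as separate arguments.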
