<div align="center"> <p> <a align="center" href="https://inference.roboflow.com/" target="_blank"> <img width="100%" src="https://github.com/roboflow/inference/blob/main/banner.png?raw=true" > </a> </p> <br>

notebooks | supervision | autodistill | maestro

<br>


<!-- [![huggingface](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/Roboflow/workflows) --> </div>

Make Any Camera an AI Camera

Inference turns any computer or edge device into a command center for your computer vision projects.

  • 🛠️ Self-host your own fine-tuned models
  • 🧠 Access the latest and greatest foundation models (like Florence-2, CLIP, and SAM2)
  • 🤝 Use Workflows to track, count, time, measure, and visualize
  • 👁️ Combine ML with traditional CV methods (like OCR, Barcode Reading, QR, and template matching)
  • 📈 Monitor, record, and analyze predictions
  • 🎥 Manage cameras and video streams
  • 📬 Send notifications when events happen
  • 🛜 Connect with external systems and APIs
  • 🔗 Extend with your own code and models
  • 🚀 Deploy production systems at scale

See Example Workflows for common use-cases like detecting small objects with SAHI, multi-model consensus, active learning, reading license plates, blurring faces, background removal, and more.

Time In Zone Workflow Example

🔥 quickstart

Install Docker (and the NVIDIA Container Toolkit if you have a CUDA-enabled GPU and want GPU acceleration). Then run:

pip install inference-cli && inference server start --dev

This will pull the proper image for your machine and start it in development mode.

In development mode, a Jupyter notebook server with a quickstart guide runs on http://localhost:9001/notebook/start. Dive in there for a whirlwind tour of your new Inference Server's functionality!

Now you're ready to connect your camera streams and start building & deploying Workflows in the UI or interacting with your new server via its API.
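Before pointing clients at the server, it can help to confirm it is actually reachable. A minimal polling sketch, assuming the default port 9001 and that the server answers plain HTTP GETs on its root (nothing below is specific to Inference):

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url: str, timeout_s: float = 30.0) -> bool:
    """Poll `url` until it responds or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=2):
                return True
        except (urllib.error.URLError, OSError):
            time.sleep(1)
    return False


if __name__ == "__main__":
    # 9001 is the Inference server's default port.
    if wait_for_server("http://localhost:9001"):
        print("Inference server is up")
    else:
        print("Server not reachable yet")
```

If the root path ever stops answering with a success status, swap in any route listed on your server's /docs page.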

🛠️ build with Workflows

A key component of Inference is Workflows: composable blocks of common functionality that give models a standard interface, making chaining and experimentation easy.

License Plate OCR Workflow Visualization

With Workflows, you can:

  • Detect, classify, and segment objects in images using state-of-the-art models.
  • Use Large Multimodal Models (LMMs) to make determinations at any stage in a workflow.
  • Seamlessly swap out models for a given task.
  • Chain models together.
  • Track, count, time, measure, and visualize objects.
  • Add business logic and extend functionality to work with your external systems.

Workflows allow you to extend simple model predictions to build computer vision micro-services that fit into a larger application or fully self-contained visual agents that run on a video stream.

Learn more, read the Workflows docs, or start building.
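Under the hood, a Workflow is a declarative specification: inputs, a graph of steps, and outputs wired together with `$`-style selectors. A minimal single-model sketch, where the step `type` and model alias are illustrative examples (check the Workflows docs for the current block catalog):

```python
# A minimal Workflow definition: one image input, one detection step,
# and one declared output. The step "type" and "model_id" below are
# illustrative; consult the Workflows block catalog for real values.
workflow_definition = {
    "version": "1.0",
    "inputs": [
        {"type": "WorkflowImage", "name": "image"},
    ],
    "steps": [
        {
            "type": "roboflow_core/roboflow_object_detection_model@v1",
            "name": "detector",
            "images": "$inputs.image",      # wire the input image into the step
            "model_id": "yolov8n-640",
        },
    ],
    "outputs": [
        {
            "type": "JsonField",
            "name": "predictions",
            "selector": "$steps.detector.predictions",  # expose the step's output
        },
    ],
}
```

An inline definition like this can typically be passed to `run_workflow` through its `specification` parameter instead of a workspace/workflow id pair; check the inference_sdk docs for the exact signature.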

<table border="0" cellspacing="0" cellpadding="0" role="presentation"> <tr> <!-- Left cell (thumbnail) --> <td width="300" valign="top"> <a href="https://youtu.be/aPxlImNxj5A"> <img src="https://img.youtube.com/vi/aPxlImNxj5A/0.jpg" alt="Self Checkout with Workflows" width="300" /> </a> </td> <!-- Right cell (title, date, description) --> <td valign="middle"> <strong> <a href="https://youtu.be/aPxlImNxj5A">Tutorial: Build an AI-Powered Self-Serve Checkout</a> </strong><br /> <strong>Created: 2 Feb 2025</strong><br /><br /> Make a computer vision app that identifies different pieces of hardware, calculates the total cost, and records the results to a database. </td> </tr> <tr> <td width="300" valign="top"> <a href="https://youtu.be/r3Ke7ZEh2Qo"> <img src="https://img.youtube.com/vi/r3Ke7ZEh2Qo/0.jpg" alt="Workflows Tutorial" width="300" /> </a> </td> <td valign="middle"> <strong> <a href="https://youtu.be/r3Ke7ZEh2Qo"> Tutorial: Intro to Workflows </a> </strong><br /> <strong>Created: 6 Jan 2025</strong><br /><br /> Learn how to build and deploy Workflows for common use-cases like detecting vehicles, filtering detections, visualizing results, and calculating dwell time on a live video stream. </td> </tr> <tr> <!-- Left cell (thumbnail) --> <td width="300" valign="top"> <a href="https://youtu.be/tZa-QgFn7jg"> <img src="https://img.youtube.com/vi/tZa-QgFn7jg/0.jpg" alt="Smart Parking with AI" width="300" /> </a> </td> <!-- Right cell (title, date, description) --> <td valign="middle"> <strong> <a href="https://youtu.be/tZa-QgFn7jg">Tutorial: Build a Smart Parking System</a> </strong><br /> <strong>Created: 27 Nov 2024</strong><br /><br /> Build a smart parking lot management system using Roboflow Workflows! This tutorial covers license plate detection with YOLOv8, object tracking with ByteTrack, and real-time notifications with a Telegram bot. </td> </tr> </table>

📟 connecting via api

Once you've installed Inference, your machine is a fully-featured CV center. You can use its API to run models and workflows on images and video streams. By default, the server is running locally on localhost:9001.

To interface with your server via Python, use our SDK:

pip install inference-sdk

Then run an example model comparison Workflow like this:

from inference_sdk import InferenceHTTPClient

client = InferenceHTTPClient(
    api_url="http://localhost:9001", # use local inference server
    # api_key="<YOUR API KEY>" # optional to access your private data and models
)

result = client.run_workflow(
    workspace_name="roboflow-docs",
    workflow_id="model-comparison",
    images={
        "image": "https://media.roboflow.com/workflows/examples/bleachers.jpg"
    },
    parameters={
        "model1": "yolov8n-640",
        "model2": "yolov11n-640"
    }
)

print(result)

In other languages, use the server's REST API; the API docs for your server are available at /docs (OpenAPI format) or /redoc (ReDoc format).
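For example, the SDK call above can be reproduced with plain HTTP. The route and body shape below are assumptions modeled on the SDK's pattern; verify them against your own server's /docs page before relying on them:

```python
import json
import urllib.request

# Hypothetical sketch of invoking a Workflow over raw HTTP. Confirm the
# route against your server's /docs (OpenAPI) page; it may differ.
url = "http://localhost:9001/infer/workflows/roboflow-docs/model-comparison"
payload = {
    # "api_key": "<YOUR API KEY>",  # optional, for private data and models
    "inputs": {
        "image": {
            "type": "url",
            "value": "https://media.roboflow.com/workflows/examples/bleachers.jpg",
        },
    },
}

request = urllib.request.Request(
    url,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Uncomment once your server is running:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```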

Check out the inference_sdk docs to see what else you can do with your new server.
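`run_workflow` returns a list of output dictionaries, one per input image, keyed by the Workflow's declared outputs. A small helper for pulling out detection labels; the sample result here is an illustrative shape, since your actual output structure depends on the Workflow you run:

```python
def extract_labels(workflow_result, output_name="predictions"):
    """Collect (class_name, confidence) pairs from a run_workflow result.

    `workflow_result` is the list returned by client.run_workflow: one
    dict per input image, keyed by the Workflow's declared outputs.
    """
    labels = []
    for image_output in workflow_result:
        predictions = image_output.get(output_name, {}).get("predictions", [])
        for prediction in predictions:
            labels.append((prediction["class"], prediction["confidence"]))
    return labels


# Illustrative sample shaped like a typical detection-workflow output:
sample_result = [
    {
        "predictions": {
            "predictions": [
                {"class": "person", "confidence": 0.91, "x": 100, "y": 50},
                {"class": "dog", "confidence": 0.78, "x": 220, "y": 140},
            ]
        }
    }
]

print(extract_labels(sample_result))
# -> [('person', 0.91), ('dog', 0.78)]
```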

🎥 connect to video streams

The inference server is a video processing beast. You can set it up to run Workflows on RTSP streams, webcam devices, and more. It will handle hardware acceleration, multiprocessing, video decoding and GPU batching to get the most out of your hardware.

This example workflow watches a stream for frames that CLIP thinks match a given text prompt.

from inference_sdk import InferenceHTTPClient
import atexit
import time

max_fps = 4

client = InferenceHTTPClient(
    api_url="http://localhost:9001", # use local inference server
    # api_key="<YOUR API KEY>" # optional to access your private data and models
)

# Start a workflow pipeline on an RTSP stream
result = client.start_inference_pipeline_with_workflow(
    video_reference=["rtsp://user:password@192.168.0.100:554/"],
    workspace_name="roboflow-docs",
    workflow_id="clip-frames",
    max_fps=max_fps,
    workflows_parameters={
        "prompt": "blurry", # change to look for something else
        "threshold": 0.16
    }
)

pipeline_id = result["context"]["pipeline_id"]

# Terminate the pipeline when the script exits
atexit.register(lambda: client.terminate_inference_pipeline(pipeline_id))

while True:
    # Poll the pipeline for its latest workflow outputs
    result = client.consume_inference_pipeline_result(pipeline_id=pipeline_id)
    if result["outputs"] and result["outputs"][0]:
        print(result["outputs"][0])
    time.sleep(1 / max_fps)
