DetectX
Run custom YOLO5 models on Axis cameras
Run custom-trained detection models on camera. This package ships with a MobileNet SSD COCO model; the idea is to replace that model with your own. Please read Train-Build.md to understand how to train a model and build the package.
Quick Start
Building the Application
- Clone the repository
- Replace the model and labels (if using your own):
  - Place your TFLite model at `app/model/model.tflite`
  - Place your labels file at `app/model/labels.txt`
- Run the build script: `./build.sh`
- Install the generated `.eap` file on your Axis camera
Note: As of version 4.0.0, the prepare.py script is no longer required. Model parameters are now automatically extracted during the build process.
DetectX User & Integration Guide
DetectX Model Overview
DetectX is a versatile ACAP (Axis Camera Application Platform) for on-camera, real-time object detection, supporting various detection tasks depending on the bundled model.
Below are model-specific details relevant to the generic "COCO" demo:
| Variant | Dataset | Labels | ARTPEC-8 | ARTPEC-9 |
|---------|---------|--------|----------|----------|
| DetectX COCO | COCO | person, bicycle, car, motorcycle, airplane, bus, train, truck, boat, traffic light, fire hydrant, stop sign, parking meter, bench, bird, cat, dog, horse, sheep, cow, elephant, bear, zebra, giraffe, backpack, umbrella, handbag, tie, suitcase, frisbee, skis, snowboard, sports ball, kite, baseball bat, baseball glove, skateboard, surfboard, tennis racket, bottle, wine glass, cup, fork, knife, spoon, bowl, banana, apple, sandwich, orange, broccoli, carrot, hot dog, pizza, donut, cake, chair, couch, potted plant, bed, dining table, toilet, TV, laptop, mouse, remote, keyboard, cell phone, microwave, oven, toaster, sink, refrigerator, book, clock, vase, scissors, teddy bear, hair drier, toothbrush | Model input: 640<br>Model size: Small | Model input: 640<br>Model size: Small |
Note: ARTPEC-8 and ARTPEC-9 are Axis camera chipset platforms, with ARTPEC-9 offering enhanced performance and the ability to process larger images for improved detection quality.
Application Overview
DetectX provides real-time detection and state data from network cameras directly to your systems. Intended for system integrators, all outputs are designed for machine-to-machine (M2M) workflows, with flexible configuration from a built-in web UI and standards-based output via MQTT, ONVIF, or HTTP.
Typical use cases include:
- Vehicle and person detection in perimeter security
- Counting and presence analytics
- Intelligence enrichment for video management systems (VMS) or IoT platforms
Menu & Feature Walkthrough
Each menu item below describes both user options and integration outputs, matched to the associated interface screenshot for easy visual orientation.
1. Detections
Shows object detections overlaid on the video and lets you adjust detection parameters.
<img src="pictures/Detections.jpeg" alt="Detections Page" width="500"/>

- Adjust Confidence Threshold: Set the minimum confidence (0–100) for labeling a detection as valid.
- Set Area of Interest (AOI): Drag and resize a region to receive detections from only a selected area of the scene.
- Configure Minimum Object Size: Exclude detections smaller than the specified pixel area.
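The combined effect of the AOI and minimum-size settings can be sketched in a few lines. This is illustrative only; whether DetectX tests the box center or full overlap against the AOI is not specified in this guide, so the sketch uses the box center. Coordinates are in model-input pixels, matching the detection payloads.

```python
def passes_filters(det: dict, aoi: tuple, min_area: int) -> bool:
    """Sketch of AOI + minimum-size filtering.

    det: bounding box with x, y, w, h in model-input pixels.
    aoi: (x, y, w, h) region of interest in the same units.
    min_area: smallest accepted box area in square pixels.
    """
    # Test the box center against the AOI (an assumption of this sketch).
    cx = det["x"] + det["w"] / 2
    cy = det["y"] + det["h"] / 2
    in_aoi = aoi[0] <= cx <= aoi[0] + aoi[2] and aoi[1] <= cy <= aoi[1] + aoi[3]
    big_enough = det["w"] * det["h"] >= min_area
    return in_aoi and big_enough
```

For example, a 50x40 px box (area 2000) centered at (125, 120) passes a full-frame AOI with a 1000 px² minimum, but is rejected once the minimum is raised above its area or the AOI is moved away from its center.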
Visualization Notes:
- The overlay updates approximately two times per second (“best effort”). The bounding boxes may lag or not exactly match all detections due to UI and network constraints.
- Use this page for quick confirmation that detection is working and properly tuned.
2. MQTT
Here you configure the gateway between the camera and your backend system.
<img src="pictures/MQTT.jpg" alt="MQTT Page" width="500"/>

- Broker Address and Port: Specify the IP or hostname of your MQTT broker and the port (default: 1883).
- Authentication: Optional username and password if security is enforced.
- Pre-topic: The prefix added to all MQTT topics (e.g., detectx/detection/...). Change it when routing multiple cameras.
- Additional Metadata: Name and Location properties help you distinguish events in multi-camera setups.
Connection Status is displayed, along with currently active parameters for fast troubleshooting.
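The broker settings above map directly onto any standard MQTT client. A minimal subscriber sketch, assuming the default `detectx` pre-topic and the third-party paho-mqtt package (the broker address and credentials shown are placeholders, not values from this guide):

```python
import json

# Topic prefix as configured under "Pre-topic" (default shown here).
PRE_TOPIC = "detectx"

def parse_detection_message(topic: str, payload: bytes) -> dict:
    """Split a DetectX topic (e.g. detectx/detection/<serial>) into its
    parts and decode the JSON payload."""
    parts = topic.split("/")
    return {"kind": parts[1], "serial": parts[-1], "data": json.loads(payload)}

def run_subscriber(broker: str = "broker.example.local", port: int = 1883) -> None:
    """Connect and print every detection message (never returns)."""
    import paho.mqtt.client as mqtt  # deferred: only needed to subscribe

    client = mqtt.Client()
    # client.username_pw_set("user", "password")  # if the broker enforces auth
    client.on_message = lambda c, u, m: print(parse_detection_message(m.topic, m.payload))
    client.connect(broker, port)
    client.subscribe(f"{PRE_TOPIC}/detection/#")
    client.loop_forever()
```

Subscribing to `detectx/detection/#` receives detections from every camera that uses the default pre-topic; the serial in the topic (and in the payload) tells them apart.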
3. Events/Labels
This section allows you to tailor detection and event signaling to your application:
<img src="pictures/Evenst_Labels.jpg" alt="Events/Labels Page" width="500"/>

- Selectable Labels: Check or uncheck which object types (labels) are actively processed, reducing false positives or narrowing the scope (e.g., only cars and persons).
- Event State Settings:
  - Prioritize: Opt for accuracy (suppresses false triggers) or responsiveness.
  - Minimum Event State Duration: Avoid chattering by forcing a minimum active/inactive state period for each label.
Note:
Each label produces an independent event state. Tuning event parameters is crucial for noisy or high-traffic scenes.
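The effect of a minimum state duration can be illustrated with a small debouncer. This is a sketch of the concept only, not DetectX's internal implementation: the published state for a label only flips after the raw detection state has held steady for the configured period.

```python
class StateDebouncer:
    """Suppress chatter: publish a state change only after the raw state
    has held steady for min_duration_ms (conceptual sketch)."""

    def __init__(self, min_duration_ms: int):
        self.min_duration_ms = min_duration_ms
        self.published = False      # last published (stable) state
        self.pending = False        # candidate state waiting to stabilize
        self.pending_since = 0      # timestamp when the candidate appeared

    def update(self, raw_state: bool, now_ms: int) -> bool:
        if raw_state != self.pending:
            # Raw state flipped: restart the stability timer.
            self.pending = raw_state
            self.pending_since = now_ms
        elif raw_state != self.published and now_ms - self.pending_since >= self.min_duration_ms:
            # Candidate state has been stable long enough: publish it.
            self.published = raw_state
        return self.published
```

With a 1000 ms minimum, a detection that appears and vanishes within a few hundred milliseconds never produces an event, which is exactly the chattering this setting is meant to avoid in noisy or high-traffic scenes.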
4. Detection Export
When downstream systems require not only detection data but cropped images for each detection:
<img src="pictures/Detection-Export.jpg" alt="Detection Export Page" width="500"/>

- Enable/Disable Detection Cropping
- Set Border Adjustment: Expand or shrink the crop region around detected objects (e.g., add a 25 px margin).
- Output Methods:
  - MQTT: Sends cropped images as base64 payloads.
  - HTTP POST: Posts the payload to a configurable endpoint.
- Throttle Output: Limit image frequency to reduce load or network traffic.
View the Latest Crops
<img src="pictures/crops.jpg" alt="Crops Gallery" width="600"/>

- Opens a gallery of up to the 10 most recent image crops, labeled by type and confidence.
- Essential for quality assurance: check that crops are readable, in the correct locations, and correspond to real detections.

5. About
<img src="pictures/About.jpg" alt="About Page" width="480"/>

A dashboard combining:
- Model Status: Input size, inference time, DLPU backend, and status.
- Device Details: Camera type, firmware, serial, CPU & network usage.
- MQTT Status: Broker and topic configuration, connection health.
- Application Info: Name, version, vendor, support/documentation link.
Use this page as your first check when troubleshooting or confirming installation.
Integration & Payload Examples
DetectX delivers three primary payload types, each enriched with the configured device name, location, and serial for easy association in your backend systems.
1. Detection (Bounding Box) on MQTT
Topic:
detectx/detection/<serial>
Example Payload:
{
"detections": [
{
"label": "car",
"c": 77,
"x": 274,
"y": 224,
"w": 180,
"h": 104,
"timestamp": 1756453942980,
"refId": 260
}
],
"name": "Front",
"location": "",
"serial": "B8A44F3024BB"
}
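The `x`, `y`, `w`, `h` values above are pixels in the model's input frame (640x640 for the COCO variant, per the table earlier). To overlay boxes on a stream of any resolution, they can be rescaled to 0.0-1.0 fractions first; this sketch assumes a direct 1:1 mapping such as the Center-Crop scale mode:

```python
MODEL_INPUT = 640  # model input size for the COCO variant

def to_relative(det: dict, input_size: int = MODEL_INPUT) -> dict:
    """Convert a pixel-space bounding box to fractions of the model
    input frame. Assumes the model frame maps 1:1 onto the display
    (e.g., Center-Crop mode); other scale modes need extra handling."""
    return {
        "label": det["label"],
        "confidence": det["c"],  # "c" is the confidence field in the payload
        "x": det["x"] / input_size,
        "y": det["y"] / input_size,
        "w": det["w"] / input_size,
        "h": det["h"] / input_size,
    }

# Using the example payload from above:
payload = {
    "detections": [
        {"label": "car", "c": 77, "x": 274, "y": 224, "w": 180, "h": 104,
         "timestamp": 1756453942980, "refId": 260}
    ],
    "serial": "B8A44F3024BB",
}
boxes = [to_relative(d) for d in payload["detections"]]
```

Multiplying the fractions by your display resolution then yields draw-ready coordinates.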
2. Event State on MQTT or ONVIF
Topic:
detectx/event/<serial>/<label>/<state>
Example Payload:
{
"label": "car",
"state": false,
"timestamp": 1756453946184,
"name": "Front",
"location": "",
"serial": "B8A44F3024BB"
}
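On the consuming side, these event payloads fold naturally into a live state table, e.g. to know which labels are currently active on which camera. A sketch, with field names taken from the payload above:

```python
import json

# (serial, label) -> current state, updated from each event payload
active: dict = {}

def handle_event(payload: bytes) -> None:
    """Record the latest state for this camera/label pair."""
    evt = json.loads(payload)
    active[(evt["serial"], evt["label"])] = evt["state"]

# Feeding two example events (same shape as the payload above):
handle_event(b'{"label": "car", "state": true, "timestamp": 1756453946184, '
             b'"name": "Front", "location": "", "serial": "B8A44F3024BB"}')
handle_event(b'{"label": "car", "state": false, "timestamp": 1756453999000, '
             b'"name": "Front", "location": "", "serial": "B8A44F3024BB"}')
```

Because each label carries an independent state, the table keys on the (serial, label) pair rather than on the camera alone.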
3. Detection Crop Image
MQTT/HTTP Topic or POST:
detectx/crop/<serial>
Example Payload:
{
"label": "truck",
"timestamp": 1756454378759,
"confidence": 47,
"x": 25,
"y": 25,
"w": 218,
"h": 106,
"image": "/9j/4AAQSk...", // JPEG in Base64
"name": "Front",
"location": "",
"serial": "B8A44F3024BB"
}
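Since the `image` field is a Base64-encoded JPEG, saving a received crop to disk needs only the standard library. A sketch, where the payload dict is assumed to be an already-parsed crop message and the file-naming scheme is this example's own choice:

```python
import base64

def save_crop(payload: dict, directory: str = ".") -> str:
    """Decode the Base64 JPEG from a crop payload and write it to disk.
    Returns the path of the written file."""
    jpeg_bytes = base64.b64decode(payload["image"])
    # Filename pattern is illustrative: <serial>_<timestamp>_<label>.jpg
    path = f'{directory}/{payload["serial"]}_{payload["timestamp"]}_{payload["label"]}.jpg'
    with open(path, "wb") as f:
        f.write(jpeg_bytes)
    return path
```

A quick sanity check after decoding is that the bytes start with the JPEG magic number (0xFF 0xD8); anything else suggests the payload was truncated or double-encoded in transit.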
System Integrator Tips
- Start with the About page to confirm firmware, model, and MQTT status before field adjustments.
- Use Detection and Crops pages for rapid troubleshooting—verify detections visually before integrating triggers or actions.
- Use unique device names/locations in MQTT setup for scalable multi-camera deployments.
- Adjust event suppression and AOI settings based on site/scene context for best accuracy.
Troubleshooting & Support
- If bounding boxes do not appear but the model status is OK, check confidence, AOI, and MQTT broker configuration.
- If crop images are misaligned or cut-off, adjust crop borders and AOI, validating via the “View the latest crops” gallery.
- Monitor CPU and network on the About page to avoid overload (especially on ARTPEC-8 devices).
Version History
4.0.0 February 2, 2026
Major Features
- Pixel-Based Coordinate System: Complete redesign from normalized [0-1000] coordinates to native pixel coordinates matching model input dimensions
- 1:1 Display Aspect Ratio: All scale modes now display in model-sized 1:1 view (typically 640x640) for "what you see is what you get" visualization
- Enhanced Scale Mode Support: True visual representation for each mode:
- Center-Crop: 640x640 1:1 video with no black bars
- Balanced: 856x640 (4:3 aspect) squeezed into 1:1 display
- Letterbox: 1136x640 (16:9 aspect) displayed in 1:1 with visible padding
- Modern UI Redesign: Complete redesign with top navigation bar replacing sidebar, modern card-based layouts, and improved visual hierarchy
- Simplified Build Process: Removed the dependency on prepare.py; model parameters are now automatically extracted during the Docker build
Coordinate System Changes
- Pixel Coordinates Throughout: All coordinates (detections, AOI, size filters) now use pixel values matching the model input dimensions
