WebAR.rocks.object
Lightweight WebGL and JavaScript library for real-time object detection, tracking and 6DoF pose estimation in the browser. Included in WebAR.rocks.train for custom neural network training.
JavaScript/WebGL lightweight object detection and tracking library for WebAR
<p align="center"> <a href='https://youtu.be/a09NSXp_ENU'><img src='https://img.youtube.com/vi/a09NSXp_ENU/0.jpg'></a> <br/> <i><a href='https://webar.rocks/demos/object/demos/threejs/ARCoffee/' target='_blank'>Standalone AR Coffee</a> - Enjoy a free coffee offered by <a href='https:/webar.rocks'>WebAR.rocks</a>!<br/> The coffee cup is detected and a 3D animation is played in augmented reality.<br/> This demo only relies on WebAR.rocks.object and THREE.JS.</i> </p>π NEW: Since March 2025, you can train your own neural network using your 3D model(s) with WebAR.rocks.train.
Table of contents
- Features
- Object specifications
- Architecture
- Demonstrations
- Specifications
- Neural network models
- About the tech
- License
- References
Features
Here are the main features of the library:
- object detection
- camera video feed capture using a helper
- on the fly neural network change
- demonstrations with WebXR integration
Object specifications
Target object
The target object must have an aspect ratio between 1/2.5 and 2.5. An object with an aspect ratio of 1 fits into a square (equal width and height). For example, the standard Red Bull can has an aspect ratio of 2.5 (height/diameter).
Elongated objects, such as a fork, a pen, or a knife, do not meet this requirement. In such cases, it may be easier to target only a specific part of the object (e.g., the end of the fork). We only detect objects that fully fit within the camera's field of view (i.e., objects that are not partially visible).
We can train a neural network to detect up to three different objects simultaneously. The first detected object is then tracked (we currently do not support simultaneous multi-object tracking). The recognized objects should have approximately the same aspect ratio.
Highly reflective objects, such as shiny metallic items, are harder to detect. Similarly, refractive materials are more challenging due to their high variability.
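To make the aspect ratio constraint concrete, here is a small hedged helper (not part of the library API, just an illustration) that checks whether an object's bounding-box dimensions fall within the supported range. The example dimensions are rough illustrative values, not measured ones:

```javascript
// Hypothetical helper (not part of WebAR.rocks.object): check whether an
// object with the given bounding-box dimensions can be targeted.
// The aspect ratio (height / width) must lie between 1/2.5 and 2.5.
function isAspectRatioSupported(width, height){
  const aspectRatio = height / width;
  return aspectRatio >= 1 / 2.5 && aspectRatio <= 2.5;
}

console.log(isAspectRatioSupported(8, 10));  // e.g. a coffee mug -> true
console.log(isAspectRatioSupported(1, 14));  // e.g. a pen -> false
```

For an elongated object like the pen above, you would instead target only a part of it so that the targeted region satisfies the constraint.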
3D model
We don't need any pictures of the object, only a 3D model. The 3D model should be in one of the following file formats: .OBJ, .GLTF, or .GLB. The textures should have power-of-two dimensions, and their largest dimension (width or height) must be 2048 pixels or less.
If necessary, the 3D model should embed the PBR textures (typically the metallic-roughness texture).
We provide 3D modelling support.
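As a quick pre-flight check before exporting a model, the texture constraint above can be sketched like this (a hypothetical helper, not part of the library):

```javascript
// Hypothetical pre-export check (not part of WebAR.rocks.object):
// a valid texture has power-of-two dimensions and its largest
// dimension does not exceed 2048 pixels.
function isPowerOfTwo(n){
  return n > 0 && (n & (n - 1)) === 0;
}

function isTextureSizeValid(width, height){
  return isPowerOfTwo(width) && isPowerOfTwo(height)
    && Math.max(width, height) <= 2048;
}

console.log(isTextureSizeValid(1024, 2048)); // true
console.log(isTextureSizeValid(4096, 4096)); // false: too large
console.log(isTextureSizeValid(1000, 512));  // false: 1000 is not a power of two
```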
Architecture
- `/demos/`: source code of the demonstrations,
- `/dist/`: heart of the library:
  - `WebARRocksObject.js`: main minified script,
- `/helpers/`: scripts which can help you to use this library in some specific use cases (like WebXR),
- `/libs/`: 3rd party libraries and 3D engines used in the demos,
- `/neuralNets/`: neural network models,
- `/reactViteThreeFiberDemos/`: demos with Vite/NPM/React/Three Fiber.
Demonstrations
Standalone static JS demos
These demonstrations work in a standard web browser. They only require camera access and are written in static JavaScript.
- Simple object recognition using the camera (for debugging): live demo source code
- Cat recognition: live demo source code Youtube video
- THREE.js Sprite 33cl (12oz) can detection demo: source code live demo
- Standalone AR Coffee demo: source code live demo Youtube video
- Keyboard detection and tracking demo: source code live demo. Coffee on keyboard demo
Standalone ES6 demos
These demonstrations have been written in a modern front-end environment using:
- NPM/Vite/ES6 as environment
- React
- Three.js through Three Fiber
You can browse and try them in the /reactViteThreeFiberDemos directory.
WebXR viewer demos
To run these demonstrations, you need a web browser implementing WebXR. We hope it will be implemented soon in all web browsers!
- If you have an iOS device (iPad, iPhone), you can install WebXR Viewer from the App Store. It is developed by the Mozilla Foundation. It is a modified Firefox with WebXR implemented using ARKit. You can then open the demonstrations from the URL bar of the application.
- For Android devices, it should work with WebARonARCore, but we have not tested it yet. Your device still needs to be compatible with ARCore.
Then you can run these demos:
- WebXR object labelling: live demo source code
- WebXR coffee: live demo source code Youtube video
Specifications
Get started
The most basic integration example of this library is the first demo, the debug detection demo.
In index.html, we include in the <head> section the main library script, /dist/WebARRocksObject.js, the MediaStream API (formerly getUserMedia API) helper, /helpers/WebARRocksMediaStreamAPIHelper.js, and the demo script, demo.js:
```html
<script src="../../dist/WebARRocksObject.js"></script>
<script src="../../helpers/WebARRocksMediaStreamAPIHelper.js"></script>
<script src="demo.js"></script>
```
In the <body> section of index.html, we put a <canvas> element which will be used to initialize the WebGL context used by the library for deep learning computation, and to possibly display a debug rendering:
```html
<canvas id='debugWebARRocksObjectCanvas'></canvas>
```
Then, in demo.js, we get the camera video feed after the loading of the page using the MediaStream API helper:
```javascript
WebARRocksMediaStreamAPIHelper.get(DOMVIDEO, init, function(){
  alert('Cannot get the camera video feed :(');
}, {
  video: true, // mediaConstraints
  audio: false
});
```
You can replace this part with a static video, and you can also provide media constraints to specify the video resolution.
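The last argument follows the standard `MediaStreamConstraints` dictionary of the getUserMedia API. As an illustration (the resolution and facing mode values below are examples, not requirements of the library), a sketch requesting 1280×720 video from the rear camera could look like:

```javascript
// Illustrative MediaStreamConstraints (standard getUserMedia format).
// The resolution and facingMode values are examples only:
const mediaConstraints = {
  video: {
    width:  {ideal: 1280},
    height: {ideal: 720},
    facingMode: {ideal: 'environment'} // prefer the rear camera on mobile
  },
  audio: false
};

// It would then be passed as the last argument of the helper:
// WebARRocksMediaStreamAPIHelper.get(DOMVIDEO, init, onError, mediaConstraints);
```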
When the video feed is captured, the callback function init is launched. It initializes this library:
```javascript
function init(){
  WEBARROCKSOBJECT.init({
    canvasId: 'debugWebARRocksObjectCanvas',
    video: DOMVIDEO,
    callbackReady: function(errLabel){
      if (errLabel){
        alert('An error happened: ' + errLabel);
      } else {
        load_neuralNet();
      }
    }
  });
}
```
The function load_neuralNet loads the neural network model:
```javascript
function load_neuralNet(){
  WEBARROCKSOBJECT.set_NN('../../neuralNets/NN_OBJ4_0.json', function(errLabel){
    if (errLabel){
      console.log('ERROR: cannot load the neural net', errLabel);
    } else {
      iterate();
    }
  }, options);
}
```
Instead of giving the URL of the neural network, you can also give the parsed JSON object.
The function iterate starts the iteration loop:
```javascript
function iterate(){
  const detectState = WEBARROCKSOBJECT.detect(3);
  if (detectState.label){
    console.log(detectState.label, 'IS DETECTED YEAH !!!');
  }
  window.requestAnimationFrame(iterate);
}
```
Initialization arguments
`WEBARROCKSOBJECT.init` takes a dictionary as its argument with these properties:
- `<video> video`: HTML5 video element (can come from the MediaStream API helper). If `false`, the source texture is updated from a `videoFrameBuffer` object provided when calling `WEBARROCKSOBJECT.detect(...)` (like in the WebXR demos),
- `<dict> videoCrop`: see the Video cropping section for more details,
- `<function> callbackReady`: callback function launched when ready or if there was an error. Called with the error label or `false`,
- `<string> canvasId`: id of the canvas from which the WebGL context used for deep learning processing will be created,
- `<canvas> canvas`: if `canvasId` is not provided, you can also provide the `<canvas>` element directly,
- `<dict> scanSettings`: see the Scan settings section for more details,
- `<boolean> isDebugRender`: if `true`, a debug rendering will be displayed on the `<canvas>` element. Useful for debugging, but it should be set to `false` for production because it wastes GPU computing resources,
- `<int> canvasSize`: size of the detection canvas in pixels (should be square). The special value `-1` keeps the canvas size. Default: `512`.
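Putting these parameters together, a typical initialization could look like the following sketch (the values are illustrative; the `video` element, commented out here, would be the `<video>` element captured earlier in a browser):

```javascript
// Illustrative init options dictionary (values are examples):
const initOptions = {
  canvasId: 'debugWebARRocksObjectCanvas',
  // video: DOMVIDEO,   // the captured <video> element (browser only)
  isDebugRender: false, // disable the debug rendering in production
  canvasSize: 512,      // detection canvas resolution in pixels
  callbackReady: function(errLabel){
    if (errLabel){
      console.log('ERROR:', errLabel);
      return;
    }
    console.log('WebAR.rocks.object is ready');
  }
};

// In the browser:
// WEBARROCKSOBJECT.init(initOptions);
```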
