
OSMO

The developer-first platform for scaling complex Physical AI workloads across heterogeneous compute, unifying training GPUs, simulation clusters, and edge devices in a simple YAML file.


<!-- SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. SPDX-License-Identifier: Apache-2.0 --> <img src="./docs/front_cover.png" width="100%"/>

Welcome to OSMO

Workflow Orchestration Purpose-built for Physical AI

<a href="LICENSE"><img src="https://img.shields.io/badge/License-Apache%202.0-blue.svg" alt="License"></a> <a href="https://nvidia.github.io/OSMO/main/user_guide"><img src="https://img.shields.io/badge/docs-latest-brightgreen.svg" alt="Documentation"></a> <a href="https://kubernetes.io/"><img src="https://img.shields.io/badge/Kubernetes-Native-326ce5.svg" alt="Kubernetes"></a> <a href="https://brev.nvidia.com/launchable/deploy?launchableID=env-36a6a7qnkOMOP2vgiBRaw2e3jpW"><img src="https://brev-assets.s3.us-west-1.amazonaws.com/nv-lb-dark.svg" alt="Brev deployment"></a>

<a href="#ready-to-begin">Get Started</a> | <a href="#documentation">Documentation</a> | <a href="#community--support">Community</a> | <a href="#roadmap">Roadmap</a>

Use OSMO to manage your workflows, version your datasets, and even develop remotely on a backend node. With OSMO's backend configuration, your workflows run seamlessly on any cloud environment. Build a data factory to manage your synthetic and real robot data, train neural networks with experiment tracking, train robot policies with reinforcement learning, evaluate your models and publish the results, test the robot in simulation with software-in-the-loop or hardware-in-the-loop (HIL), and automate your workflows on any CI/CD system.

<div align="center"> <img src="./docs/user_guide/overview.svg" width="85%"/> </div>

For Robotics & AI Developers

Write once, run anywhere. Focus on building robots, not managing infrastructure.

```yaml
# Your entire physical AI pipeline in a YAML file
workflow:
  tasks:
  - name: simulation
    image: nvcr.io/nvidia/isaac-sim
    platform: rtx-pro-6000          # Runs on NVIDIA RTX PRO 6000 GPUs

  - name: train-policy
    image: nvcr.io/nvidia/pytorch
    platform: gb200                 # Runs on NVIDIA GB200 GPUs
    resources:
      gpu: 8
    inputs:                         # Feed the output of simulation task into training
    - task: simulation

  - name: evaluate-thor
    image: my-ros-app
    platform: jetson-agx-thor       # Runs on NVIDIA Jetson AGX Thor
    inputs:
    - task: train-policy            # Feed the output of the training task into eval
    outputs:
    - dataset:
        name: thor-benchmark        # Save the output benchmark into a dataset
```
  • Zero-Code Workflows – Write and iterate on workflows in YAML, not Python scripts
  • Truly Portable – The same workflow runs on a laptop (Docker/KIND) or in the cloud (EKS/AKS/GKE)
  • Interactive Development – Launch VSCode, Jupyter, or SSH and develop remotely in the cloud
  • Smart Storage – Content-addressable datasets with deduplication save 10-100x on storage
  • Infrastructure-Agnostic – Workflows never reference specific infrastructure, so they scale transparently
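
To make the Smart Storage point above concrete, here is a minimal sketch of content-addressable storage with deduplication. It is an illustration only and does not show how OSMO implements datasets: the `ContentAddressedStore` class, its methods, and the file names are all hypothetical. Each file is keyed by the SHA-256 digest of its bytes, so a new dataset version that repeats existing files adds only the new content.

```python
import hashlib


class ContentAddressedStore:
    """Toy in-memory store (hypothetical) that keys each blob by its SHA-256 digest."""

    def __init__(self):
        self.blobs = {}     # digest -> bytes, stored once per unique content
        self.versions = {}  # (dataset, version) -> {path: digest}

    def put_version(self, dataset, version, files):
        """Record a dataset version; only previously unseen content is stored."""
        manifest = {}
        for path, data in files.items():
            digest = hashlib.sha256(data).hexdigest()
            self.blobs.setdefault(digest, data)  # no-op if the content already exists
            manifest[path] = digest
        self.versions[(dataset, version)] = manifest
        return manifest

    def stored_bytes(self):
        """Total bytes physically held, after deduplication."""
        return sum(len(blob) for blob in self.blobs.values())


store = ContentAddressedStore()
v1 = {"episode_0.bin": b"a" * 1000, "episode_1.bin": b"b" * 1000}
v2 = {**v1, "episode_2.bin": b"c" * 1000}  # version 2 adds one file to version 1

store.put_version("thor-benchmark", 1, v1)
store.put_version("thor-benchmark", 2, v2)

# Bytes a naive copy-per-version scheme would hold, versus the deduplicated total.
logical_bytes = sum(len(d) for d in v1.values()) + sum(len(d) for d in v2.values())
print(logical_bytes, store.stored_bytes())
```

In this toy case, two versions that share most of their files occupy the space of the unique files only. The actual savings on real data depend on how much content the versions share.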

For Platform & Infrastructure Engineers

Scale infrastructure independently. Add compute backends without disrupting developers.

  • Centralized Control Plane – Single pane of glass for heterogeneous compute across clouds and regions
  • Plug-and-Play Backends – Register new Kubernetes clusters dynamically via CLI
  • Geographic Distribution – Deploy compute wherever it's available—cloud, on-prem, edge
  • Zero-Downtime Changes – Scale GPU compute clusters without affecting users or their workflows
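
The separation described above, where workflows name a platform and the control plane decides which backend serves it, can be sketched as a small registry. This is a conceptual illustration under stated assumptions, not OSMO's API: `BackendRegistry`, its methods, and the backend names are hypothetical. Only the platform labels come from the YAML example in this README.

```python
class BackendRegistry:
    """Toy control-plane registry (hypothetical): platform label -> backends serving it."""

    def __init__(self):
        self._by_platform = {}

    def register(self, backend, platforms):
        """Register a backend for one or more platform labels."""
        for platform in platforms:
            self._by_platform.setdefault(platform, []).append(backend)

    def resolve(self, platform):
        """Return the first registered backend that serves the given label."""
        backends = self._by_platform.get(platform)
        if not backends:
            raise LookupError(f"no backend serves platform {platform!r}")
        return backends[0]


registry = BackendRegistry()
registry.register("cloud-train-cluster", ["gb200"])       # hypothetical backend names
registry.register("cloud-sim-cluster", ["rtx-pro-6000"])

# The task names only a platform label, never a cluster.
task = {"name": "train-policy", "platform": "gb200"}
before = registry.resolve(task["platform"])

# Registering an edge backend later requires no change to the task definition.
registry.register("lab-edge-cluster", ["jetson-agx-thor"])
after = registry.resolve(task["platform"])
print(before, after)
```

Because the task dictionary never mentions a cluster, adding or removing backends changes only the registry. That is the property the bullets above describe.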

Solving Physical AI

Physical AI development uniquely requires orchestrating three types of compute working together:

| 🧠 Training | 🌐 Simulation | 🤖 Edge |
|:---:|:---:|:---:|
| GB200, H100 | L40, RTX Pro | Jetson AGX Thor |
| Deep learning & RL | Physics & Sensor Rendering | Hardware-in-the-Loop |
| Cloud | Cloud | On Premise |

Traditionally, orchestrating workflows across these heterogeneous systems requires custom scripts, infrastructure expertise, and separate tooling for each environment.

OSMO solves this Three Computer Problem for robotics by orchestrating your entire Physical AI pipeline, from training to simulation to hardware testing, in a simple YAML file. No custom scripts or infrastructure expertise are required. OSMO orchestrates tasks across heterogeneous Kubernetes clusters, managing dependencies and resource allocation. By solving this fundamental problem, OSMO brings us one step closer to making Physical AI a reality.
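
As a rough illustration of what managing dependencies means for the three-task workflow shown earlier in this README, the sketch below derives an execution order from each task's `inputs` entries. It is not OSMO's scheduler. The workflow is written as a plain Python dictionary rather than parsed from YAML, `execution_order` is a hypothetical helper, and the ordering uses the standard library's `graphlib.TopologicalSorter` (Python 3.9+).

```python
from graphlib import TopologicalSorter

# The README's example workflow, reduced to the fields that matter for ordering.
workflow = {
    "tasks": [
        {"name": "simulation", "platform": "rtx-pro-6000"},
        {"name": "train-policy", "platform": "gb200",
         "inputs": [{"task": "simulation"}]},
        {"name": "evaluate-thor", "platform": "jetson-agx-thor",
         "inputs": [{"task": "train-policy"}]},
    ]
}


def execution_order(workflow):
    """Return task names ordered so that every task follows the tasks it consumes."""
    graph = {
        task["name"]: {entry["task"] for entry in task.get("inputs", []) if "task" in entry}
        for task in workflow["tasks"]
    }
    # TopologicalSorter takes a mapping of node -> predecessors and raises
    # graphlib.CycleError if the dependencies form a cycle.
    return list(TopologicalSorter(graph).static_order())


order = execution_order(workflow)
print(order)  # ['simulation', 'train-policy', 'evaluate-thor']
```

Each task in the resulting order can then be dispatched to whichever cluster serves its `platform` label once its upstream tasks have finished.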

<div align="center"> <img src="./docs/user_guide/tutorials/hardware_in_the_loop/robot_simulation.svg" width="70%"/> </div>

Key Benefits

| What You Can Do | Example |
|---------------------|----------------------|
| Interactively develop on remote GPU nodes with VSCode, SSH, or Jupyter notebooks | Interactive Workflows |
| Generate synthetic data at scale using Isaac Sim or custom simulation environments | Isaac Sim SDG |
| Train models with diverse datasets across distributed GPU clusters | Model Training |
| Train policies for robots using data-parallel reinforcement learning | Reinforcement Learning |
| Validate models in simulation with hardware-in-the-loop testing | Hardware In The Loop |
| Transform and post-process data for iterative improvement | Working with Data |
| Benchmark system software on actual robot hardware (NVIDIA Jetson, custom platforms) | Hardware Testing |

Battle-Tested in Production

OSMO is production-grade and proven at scale. Originally developed to power Physical AI workloads at NVIDIA—including Project GR00T, Isaac Lab, Isaac Dexterity, Isaac Sim, and Isaac ROS—it orchestrates thousands of GPU-hours daily across heterogeneous compute spanning cloud training clusters to edge devices.

Now open-source and ready for your robotics workflows. Whether you're building humanoid robots, autonomous vehicles, or warehouse automation systems, OSMO provides the same enterprise-grade orchestration used in production at scale.

Ready to Begin?

To get started, select one of the deployment options below based on your needs and environment.

<div align="center"> <a href="https://nvidia.github.io/OSMO/main/deployment_guide/introduction/whats_next.html"> <img src="./docs/deployment_options.svg" width="85%"/> </a> </div>

Deploying on Microsoft Azure? Get started with the Azure NVIDIA Reference Architecture. This jointly published architecture delivers a production-ready Physical AI pipeline on Microsoft Azure, integrating OSMO, Isaac Lab, and Isaac Sim with GPU-accelerated RL training, auto-scaling compute, and enterprise-grade Kubernetes security.

Documentation

| Resource | Description |
|:---------|:------------|
| 🚀 Local Deployment | Run it locally on your workstation in 10 minutes |
| ⚡ Brev Deployment | Run it on a Brev instance with a GPU in 10 minutes |
| 🛠️ Cloud Deployment | Deploy production grade on cloud providers |
| 📘 User Guide | Tutorials, workflows, and how-to guides for developers |
| 💡 Cookbook | Robotics workflow examples |
| 💻 Getting Started | Install command-line interface to get started |

Community & Support

Join the community. We welcome contributions, feedback, and collaboration from developers and AI teams worldwide.

🐛 Report Issues – Bugs, feature requests or technical help

🤝 Contributing Guide – How to develop OSMO and make contributions

Roadmap

Short term (Q1 2026)

| Capability | How It Works |
|:---------------|:-----------------|
| Simplified Authentication & Authorization | Use your existing identity provider without additional infrastructure. Connect directly to Azure AD, Okta, Google Workspace, or any OAuth 2.0 provider. Manage teams and permissions through simple CLI commands (osmo group ...). Share credentials a
