

Install / Use

/learn @hongsukchoi/VideoMimic

README

VideoMimic

CoRL 2025, Best Student Paper Award.

[project page] [arxiv] [proceedings]

Visual Imitation Enables Contextual Humanoid Control.

<div style="background-color: #333; padding: 16px 20px; border-radius: 8px; color: #eee; font-family: sans-serif; line-height: 1.6;"> <div style="font-size: 14px; margin-bottom: 12px;"> Arthur Allshire<sup>*</sup>, Hongsuk Choi<sup>*</sup>, Junyi Zhang<sup>*</sup>, David McAllister<sup>*</sup>, Anthony Zhang, Chung Min Kim, Trevor Darrell, Pieter Abbeel, Jitendra Malik, Angjoo Kanazawa (*Equal contribution) </div> <div style="font-size: 14px;"> <i>University of California, Berkeley</i> </div> </div>

Updates

  • Sep 30, 2025: VideoMimic won the Best Student Paper Award.
  • Sep 15, 2025: Simulation code and preliminary sim2real code released.
  • Jul 6, 2025: Initial real-to-sim pipeline release.

VideoMimic Real-to-Sim

VideoMimic’s real-to-sim pipeline reconstructs 3D environments and human motion from single-camera videos and retargets the motion to humanoid robots for imitation learning. It extracts human poses in world coordinates, maps them to robot configurations, and reconstructs environments as point clouds that are later converted to meshes.
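As a toy illustration of the reconstruction output (not the repo's actual code — the function, parameter names, and grid representation here are invented), a reconstructed point cloud can be binned into a height map of the kind a terrain-aware policy might consume:

```python
def pointcloud_to_heightmap(points, cell=0.5):
    """Bin (x, y, z) points into a sparse 2D grid keyed by cell index,
    keeping the highest z per cell as the walkable surface height."""
    grid = {}
    for x, y, z in points:
        key = (int(x // cell), int(y // cell))
        if key not in grid or z > grid[key]:
            grid[key] = z
    return grid

# Three points: two land in cell (0, 0), one in cell (1, 0).
cloud = [(0.1, 0.1, 0.0), (0.2, 0.3, 0.05), (0.6, 0.1, 0.30)]
hm = pointcloud_to_heightmap(cloud)
```

In practice the pipeline produces meshes rather than grids, but the same idea applies: the reconstructed geometry becomes a terrain observation for the policy.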

VideoMimic Simulation

Provides the simulation training pipeline; see its README for details. Training proceeds in four stages: motion-capture pretraining, scene-conditioned tracking, distillation, and RL fine-tuning.
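The four-stage progression can be sketched as a checkpoint-chained loop. The stage names come from this README; everything else, including the checkpoint naming, is an invented placeholder, not the actual training code:

```python
# Each stage initializes from the checkpoint the previous stage produced.
STAGES = [
    "mocap_pretraining",           # 1. pretrain tracking on MoCap clips
    "scene_conditioned_tracking",  # 2. track video motion with terrain input
    "distillation",                # 3. distill trackers into a single policy
    "rl_finetuning",               # 4. fine-tune the distilled policy with RL
]

def run_pipeline():
    """Return the chain of checkpoints a full training run would produce."""
    checkpoints = []
    for stage in STAGES:
        # A real stage would train here, resuming from checkpoints[-1].
        checkpoints.append(f"{stage}.pt")
    return checkpoints
```

The point of the chaining is that later stages never start from scratch: RL fine-tuning refines the distilled policy rather than learning terrain-aware control from nothing.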

VideoMimic Sim-to-Real

Provides the real-world deployment pipeline; see its README for details. We provide a C++ program that you can compile into a binary to run on your real robot using TorchScript-exported checkpoints.

Video Dataset

Uploaded here. Note that individual videos are provided as sequences of JPEGs rather than encoded MP4s.
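Because each clip is stored as a JPEG sequence, frames need to be collected and ordered before decoding. A minimal sketch, assuming zero-padded frame filenames like `000001.jpg` (the dataset's actual naming scheme is not specified here):

```python
from pathlib import Path

def list_frames(clip_dir):
    """Return a clip's frame paths in playback order.

    Relies on zero-padded indices so lexicographic sort equals
    numeric sort; glob() alone guarantees no ordering.
    """
    return sorted(Path(clip_dir).glob("*.jpg"))
```

From there, any image library (or ffmpeg over the sorted list) can decode or re-encode the sequence.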

[BibTeX]

@inproceedings{allshire2025visual,
  title={Visual Imitation Enables Contextual Humanoid Control},
  author={Allshire, Arthur and Choi, Hongsuk and Zhang, Junyi and McAllister, David and Zhang, Anthony and Kim, Chung Min and Darrell, Trevor and Abbeel, Pieter and Malik, Jitendra and Kanazawa, Angjoo},
  booktitle={Proceedings of The Conference on Robot Learning},
  series={Proceedings of Machine Learning Research},
  year={2025}
}

GitHub Stars: 756
Category: Content
Updated: 2h ago
Forks: 57

Languages

Python

Security Score

100/100

Audited on Mar 27, 2026

No findings