# MoCapAct
A Multi-Task Dataset for Simulated Humanoid Control
<p align="center"> <img src="https://raw.githubusercontent.com/mhauskn/mocapact.github.io/master/assets/MoCapAct.gif" alt="montage" width="70%"> </p> <img src="https://cdla.dev/wp-content/uploads/sites/52/2017/10/cdla_logo.png" alt="Dataset License" width="150"/>
<b>Paper: MoCapAct: A Multi-Task Dataset for Simulated Humanoid Control</b>
This is the codebase for the MoCapAct project, which uses motion capture (MoCap) clips to learn low-level motor skills for the "CMU Humanoid" from the <tt>dm_control</tt> package. This repo contains all code to:
- train the clip snippet experts,
- collect expert rollouts into a dataset,
- download our experts and rollouts from the command line,
- perform policy distillation,
- perform reinforcement learning on downstream tasks, and
- perform motion completion.
For more information on the project and to download the entire dataset, please visit the project website.
For users interested in development, we recommend reading the following documentation on <tt>dm_control</tt>:
- The <tt>dm_control</tt> whitepaper
- The <tt>dm_control</tt> README
- The README for <tt>dm_control</tt>'s locomotion task library
## Setup
MoCapAct requires Python 3.7+. We recommend that you use a virtual environment. For example, using conda:
```bash
conda create -n MoCapAct pip python==3.8
conda activate MoCapAct
```
To install the package, we recommend cloning the repo and installing the local copy:
```bash
git clone https://github.com/microsoft/MoCapAct.git
cd MoCapAct
pip install -e .
```
## Dataset
The MoCapAct dataset consists of clip experts trained on the MoCap snippets and the rollouts from those experts.
We provide the dataset and models via the MoCapAct collection on Hugging Face. This collection consists of two pages:
- A model zoo which contains the clip snippet experts, multiclip policies, RL-trained policies for the transfer tasks, and the GPT policy.
- A dataset page which contains the small rollout dataset and large rollout dataset.
### Description
<details> <summary>Clip snippet experts</summary>

We signify a clip snippet expert by the snippet it is tracking. Taking <tt>CMU_009_12-165-363</tt> as an example expert, the file hierarchy for the snippet expert is:

```
CMU_009_12-165-363
├── clip_info.json         # Contains clip ID, start step, and end step
└── eval_rsi/model
    ├── best_model.zip     # Contains policy parameters and hyperparameters
    └── vecnormalize.pkl   # Used to get normalizer for observation and reward
```
The expert policy can be loaded using our repository:
```python
from mocapact import observables
from mocapact.sb3 import utils
expert_path = "data/experts/CMU_009_12-165-363/eval_rsi/model"
expert = utils.load_policy(expert_path, observables.TIME_INDEX_OBSERVABLES)

from mocapact.envs import tracking
from dm_control.locomotion.tasks.reference_pose import types
dataset = types.ClipCollection(ids=['CMU_009_12'], start_steps=[165], end_steps=[363])
env = tracking.MocapTrackingGymEnv(dataset)

obs, done = env.reset(), False
while not done:
    action, _ = expert.predict(obs, deterministic=True)
    obs, rew, done, _ = env.step(action)
    print(rew)
```
</details>
<details>
<summary>Expert rollouts</summary>
The expert rollouts consist of a collection of HDF5 files, one per clip.
An HDF5 file contains expert rollouts for each constituent snippet as well as miscellaneous information and statistics.
To facilitate efficient loading of the observations, we concatenate all the proprioceptive observations (joint angles, joint velocities, actuator activations, etc.) from an episode into a single numerical array and provide indices for the constituent observations in the <tt>observable_indices</tt> group.
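As a sketch of how these indices can be used, the snippet below slices named observables back out of a flat proprioceptive array. The observable names mirror the example that follows, but the index values and array contents are made up for illustration; real ones come from the <tt>observable_indices</tt> group of an HDF5 file.

```python
import numpy as np

# Hypothetical stand-in for the observable_indices/walker group.
observable_indices = {
    "walker/body_height": np.array([71]),
    "walker/world_zaxis": np.array([2865, 2866, 2867]),
}

# Stand-in for one time step of the concatenated proprioceptive observation.
proprio = np.arange(3000.0)

# Recover an individual observable by indexing the flat array.
world_zaxis = proprio[observable_indices["walker/world_zaxis"]]
```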
Taking <tt>CMU_009_12.hdf5</tt> (which contains three snippets) as an example, we have the following HDF5 hierarchy:
```
CMU_009_12.hdf5
├── n_rsi_rollouts         # R, number of rollouts from random time steps in snippet
├── n_start_rollouts       # S, number of rollouts from start of snippet
├── ref_steps              # Indices of MoCap reference relative to current time step. Here, (1, 2, 3, 4, 5).
├── observable_indices
│   └── walker
│       ├── actuator_activation  # (0, 1, ..., 54, 55)
│       ├── appendages_pos       # (56, 57, ..., 69, 70)
│       ├── body_height          # (71)
│       ├── ...
│       └── world_zaxis          # (2865, 2866, 2867)
│
├── stats                  # Statistics computed over the entire dataset
│   ├── act_mean           # Mean of the experts' sampled actions
│   ├── act_var            # Variance of the experts' sampled actions
│   ├── mean_act_mean      # Mean of the experts' mean actions
│   ├── mean_act_var       # Variance of the experts' mean actions
│   ├── proprio_mean       # Mean of the proprioceptive observations
│   ├── proprio_var        # Variance of the proprioceptive observations
│   └── count              # Number of observations in dataset
│
├── CMU_009_12-0-198       # Rollouts for the snippet CMU_009_12-0-198
├── CMU_009_12-165-363     # Rollouts for the snippet CMU_009_12-165-363
└── CMU_009_12-330-529     # Rollouts for the snippet CMU_009_12-330-529
```
Each snippet group contains $R+S$ rollouts. The first $S$ episodes are initialized from the start of the snippet, and the last $R$ episodes are initialized at random points in the snippet. We now expand the <tt>CMU_009_12-165-363</tt> group within the HDF5 file to reveal the rollout structure:
```
CMU_009_12-165-363
├── early_termination          # (R+S)-boolean array indicating which episodes terminated early
├── rsi_metrics                # Metrics for episodes that initialize at random points in snippet
│   ├── episode_returns        # R-array of episode returns
│   ├── episode_lengths        # R-array of episode lengths
│   ├── norm_episode_returns   # R-array of normalized episode rewards
│   └── norm_episode_lengths   # R-array of normalized episode lengths
├── start_metrics              # Metrics for episodes that initialize at start of snippet
│
├── 0                          # First episode, of length T
│   ├── observations
│   │   ├── proprioceptive           # (T+1)-array of proprioceptive observations
│   │   ├── walker/body_camera       # (T+1)-array of images from body camera **(not included)**
│   │   └── walker/egocentric_camera # (T+1)-array of images from egocentric camera **(not included)**
│   ├── actions                # T-array of sampled actions executed in environment
│   ├── mean_actions           # T-array of corresponding mean actions
│   ├── rewards                # T-array of rewards from environment
│   ├── values                 # T-array computed using the policy's value network
│   └── advantages             # T-array computed using generalized advantage estimation
│
├── 1                          # Second episode
├── ...
└── R+S-1                      # (R+S)th episode
```
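The episode ordering within a snippet group (the first $S$ episodes start-initialized, the last $R$ initialized at random steps) can be sketched as follows; the values of $R$ and $S$ here are just for illustration:

```python
# Episode groups are named "0" through "R+S-1" inside each snippet group.
R, S = 10, 10  # e.g., the values used in the small dataset
episode_keys = [str(i) for i in range(R + S)]

start_episodes = episode_keys[:S]  # initialized at the start of the snippet
rsi_episodes = episode_keys[S:]    # initialized at random points in the snippet
```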
To keep the dataset size manageable, we do not include image observations in the dataset.
The camera images can be logged by passing the flags <tt>--log_all_proprios --log_cameras</tt> to the <tt>mocapact/distillation/rollout_experts.py</tt> script.
The HDF5 rollouts can be read and utilized in Python:
```python
import h5py
dset = h5py.File("data/small_dataset/CMU_009_12.hdf5", "r")
print("Expert actions from first rollout episode of second snippet:")
print(dset["CMU_009_12-165-363/0/actions"][...])
```
We provide a "large" dataset where $R = S = 100$ (with size 600 GB) and a "small" dataset where $R = S = 10$ (with size 50 GB).
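One common use of the <tt>stats</tt> group is normalizing observations before training a policy on the rollouts. Below is a minimal sketch with made-up statistics; in practice the arrays would be read from <tt>stats/proprio_mean</tt> and <tt>stats/proprio_var</tt> in the HDF5 file:

```python
import numpy as np

# Hypothetical stand-ins for stats/proprio_mean and stats/proprio_var.
proprio_mean = np.array([0.0, 1.0, -1.0])
proprio_var = np.array([1.0, 4.0, 0.25])

# One fake proprioceptive observation to normalize.
obs = np.array([1.0, 3.0, -2.0])

# Standard mean/variance normalization, with a small epsilon for safety.
norm_obs = (obs - proprio_mean) / np.sqrt(proprio_var + 1e-8)
```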
</details>

## Examples
Below are example commands we used for our paper.
<details> <summary>Clip snippet experts</summary>

Training a clip snippet expert:

```bash
python -m mocapact.clip_expert.train \
  --clip_id [CLIP_ID]       `# e.g., CMU_016_22` \
  --start_step [START_STEP] `# e.g., 0` \
  --max_steps [MAX_STEPS]   `# e.g., 210 (can be larger than clip length)` \
  --n_workers [N_CPU]       `# e.g., 8` \
  --log_root experts \
  $(cat cfg/clip_expert/train.txt)
```
Evaluating a clip snippet expert (numerical evaluation and visual evaluation):

```bash
python -m mocapact.clip_expert.evaluate \
  --policy_root [POLICY_ROOT] `# e.g., experts/CMU_016_22-0-82/0/eval_rsi/model` \
  --n_workers [N_CPU]         `# e.g., 8` \
  --n_eval_episodes 1000      `# set to 0 to just run the visualizer` \
  $(cat cfg/clip_expert/evaluate.txt)
```
We can also load the experts in Python:
```python
from mocapact import observables
from mocapact.sb3 import utils
expert_path = "experts/CMU_016_22-0-82/0/eval_rsi/model"  # example path
expert = utils.load_policy(expert_path, observables.TIME_INDEX_OBSERVABLES)

from mocapact.envs import tracking
from dm_control.locomotion.tasks.reference_pose import types
dataset = types.ClipCollection(ids=['CMU_016_22'])
env = tracking.MocapTrackingGymEnv(dataset)

obs, done = env.reset(), False
while not done:
    action, _ = expert.predict(obs, deterministic=True)
    obs, rew, done, _ = env.step(action)
    print(rew)
```
</details>
<details>
<summary>Creating rollout dataset</summary>
Rolling out a collection of experts and collecting into a dataset:
</details>
