SimToolReal: An Object-Centric Policy for Zero-Shot Dexterous Tool Manipulation
https://github.com/user-attachments/assets/e2d0db98-2e31-46aa-9480-c4c6f4a48f7d
Overview
This repository contains the official implementation of the SimToolReal framework, which was introduced in SimToolReal: An Object-Centric Policy for Zero-Shot Dexterous Tool Manipulation. It consists of:
- Simulation Environments: Isaac Gym environments for training and evaluating dexterous tool manipulation policies.
- DexToolBench: A benchmark for dexterous tool manipulation.
- Reinforcement Learning (RL) Training: RL algorithms for training dexterous tool manipulation policies.
- Deployment: Policy deployment in simulation and the real world.
Project Structure
simtoolreal
├── assets
│ └── // Assets such as robot URDF files, object models, etc.
├── baselines
│ └── // Implementation of kinematic retargeting and fixed grasp
├── deployment
│ └── // Sim-to-real and sim-to-sim deployment of the policy
├── dextoolbench
│ ├── data
│ │ └── // DexToolBench data (needs to be downloaded)
│ ├── // Scripts for evaluating policies on DexToolBench
│ └── // Scripts for visualizing DexToolBench objects and trajectories
├── docs
│ └── // Documentation
├── isaacgymenvs
│ └── // Simulation environment for training and evaluating policies
├── pretrained_policy
│ └── // Checkpoint of the pretrained policy (needs to be downloaded)
├── recorded_data
│ └── // Interface and tools for saving, loading, and visualizing recorded data
└── rl_games
└── // RL algorithms, including PPO and SAPG
External repo: FoundationPose, the perception system (SAM segmentation + FoundationPose pose tracking)
Installation
Please see the Installation documentation for more details.
Quick Start
Please run all commands from the root directory of this repository. For most commands, you can add --help to see the available options.
Interactive Evaluation of a Pretrained Policy on DexToolBench
Download Pretrained Policy
First, download the pretrained policy to pretrained_policy/.
python download_pretrained_policy.py
This will result in the following directory structure:
pretrained_policy/
├── config.yaml // Configuration file for the policy
└── model.pth // Checkpoint of the policy
Run Interactive Evaluation
Then, run the interactive evaluation script with the pretrained policy:
python dextoolbench/eval_interactive.py \
--config-path pretrained_policy/config.yaml \
--checkpoint-path pretrained_policy/model.pth
This launches a web-based interactive demo (default at http://localhost:8080) where you can select the tool category, object instance, and task from dropdown menus, then load the environment and run episodes. You can optionally specify a custom port with --port for the viser server.
https://github.com/user-attachments/assets/58eb188b-662c-4190-8148-29710c9eb20f
The following is the full DexToolBench data structure:
# ── Full DexToolBench data structure ──────────────────────────────────────────
# {object_category: {object_name: [task_name, ...]}}
DEXTOOLBENCH_DATA_STRUCTURE: Dict[str, Dict[str, List[str]]] = {
"hammer": {
"claw_hammer": ["swing_down", "swing_side"],
"mallet_hammer": ["swing_down", "swing_side"],
},
"marker": {
"sharpie_marker": ["draw_smile", "write_c"],
"staples_marker": ["draw_smile", "write_c"],
},
"eraser": {
"flat_eraser": ["wipe_smile", "wipe_c"],
"handle_eraser": ["wipe_smile", "wipe_c"],
},
"brush": {
"blue_brush": ["sweep_forward", "sweep_right"],
"red_brush": ["sweep_forward", "sweep_right"],
},
"spatula": {
"flat_spatula": ["serve_plate", "flip_over"],
"spoon_spatula": ["serve_plate", "flip_over"],
},
"screwdriver": {
"long_screwdriver": ["spin_vertical", "spin_horizontal"],
"short_screwdriver": ["spin_vertical", "spin_horizontal"],
},
}
See dextoolbench/objects.py and assets/urdf/dextoolbench/<object_category>/<object_name>/<object_name>.urdf for more details about the objects.
See dextoolbench/trajectories for the list of task names; files follow the directory structure dextoolbench/trajectories/<object_category>/<object_name>/<task_name>.json and are produced by dextoolbench/process_poses.py. These .json files contain poses specified in the world frame.
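Downstream scripts can iterate this nested mapping to enumerate every (category, object, task) combination. A minimal sketch (the helper name `all_triples` is an illustrative assumption, not part of the repo):

```python
from typing import Dict, List, Tuple


def all_triples(structure: Dict[str, Dict[str, List[str]]]) -> List[Tuple[str, str, str]]:
    """Flatten {category: {object: [task, ...]}} into (category, object, task) triples."""
    return [
        (category, obj, task)
        for category, objects in structure.items()
        for obj, tasks in objects.items()
        for task in tasks
    ]


# Example with a subset of DEXTOOLBENCH_DATA_STRUCTURE:
subset = {"hammer": {"claw_hammer": ["swing_down", "swing_side"]}}
print(all_triples(subset))
# [('hammer', 'claw_hammer', 'swing_down'), ('hammer', 'claw_hammer', 'swing_side')]
```

With the full structure (6 categories x 2 objects x 2 tasks), this yields 24 combinations.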
Policy Learning in Simulation
WandB Setup
Training logs are tracked with Weights & Biases. Before training, log in and update the wandb_entity in isaacgymenvs/launch_training.py to your own WandB entity:
wandb login
Training a New Policy
To train a policy from scratch, run the following command:
python isaacgymenvs/launch_training.py \
--custom_experiment_name my_experiment
Finetuning a Trained Policy
To finetune a trained policy, run the following command:
python isaacgymenvs/launch_training.py \
--custom_experiment_name my_finetuning_experiment \
--checkpoint <checkpoint_path>
For example:
python isaacgymenvs/launch_training.py \
--custom_experiment_name my_finetuning_experiment \
--checkpoint pretrained_policy/model.pth
If you run out of GPU memory, reduce the number of environments with --num_envs. Note that num_envs must be divisible by num_blocks (default: 6). For example:
python isaacgymenvs/launch_training.py \
--custom_experiment_name my_finetuning_experiment_12288 \
--checkpoint pretrained_policy/model.pth \
--num_envs 12288
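The divisibility constraint can be checked before launching a run. A minimal sketch (the function name and the round-down behavior are assumptions, not the repo's API):

```python
def round_num_envs(num_envs: int, num_blocks: int = 6) -> int:
    """Round num_envs down to the nearest multiple of num_blocks (default 6)."""
    if num_blocks <= 0:
        raise ValueError("num_blocks must be positive")
    rounded = (num_envs // num_blocks) * num_blocks
    if rounded == 0:
        raise ValueError(f"num_envs={num_envs} is too small for num_blocks={num_blocks}")
    return rounded


print(round_num_envs(12288))  # 12288 (already divisible by 6)
print(round_num_envs(12290))  # 12288 (rounded down)
```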
DexToolBench
Downloading the DexToolBench Dataset
To list all available options, run:
python download_dextoolbench_data.py --list
To download the data for a specific task, run:
python download_dextoolbench_data.py \
--object_category hammer \
--object_name claw_hammer \
--task_name swing_down
To download the data for a specific object, run:
python download_dextoolbench_data.py \
--object_category hammer \
--object_name claw_hammer
To download the data for a specific category, run:
python download_dextoolbench_data.py \
--object_category hammer
To download all data, run:
python download_dextoolbench_data.py
For each task, the script downloads the data into the dextoolbench/data/<object_category>/<object_name>/<task_name>/ directory with the following structure:
dextoolbench/data/<object_category>/<object_name>/<task_name>/
├── cam_K.txt // Camera intrinsics
├── depth // Depth images
├── masks // Object masks
├── poses.json // Object poses in robot frame
└── rgb // RGB images
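A downloaded task directory can be inspected with standard tools. A hedged sketch, assuming cam_K.txt holds a whitespace-separated 3x3 intrinsics matrix and poses.json is plain JSON (verify against the actual downloaded files; the loader name is illustrative):

```python
import json
from pathlib import Path

import numpy as np


def load_task_dir(task_dir: str):
    """Load camera intrinsics and object poses from a DexToolBench task directory.

    Assumption: cam_K.txt is a whitespace-separated 3x3 matrix and poses.json
    deserializes with json.load; the real formats may differ.
    """
    task_path = Path(task_dir)
    cam_K = np.loadtxt(task_path / "cam_K.txt").reshape(3, 3)
    with open(task_path / "poses.json") as f:
        poses = json.load(f)
    return cam_K, poses
```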
Visualizing a Single Demo
To visualize a single demo:
python dextoolbench/visualize_demo.py \
--object_category hammer \
--object_name claw_hammer \
--task_name swing_down
https://github.com/user-attachments/assets/b7532984-6642-497b-a20c-4aa6ed486cf2
Object Models
See dextoolbench/objects.py for the list of object models.
Visualizing the Objects
To visualize a DexToolBench object:
python dextoolbench/visualize_object.py \
--urdf_path assets/urdf/dextoolbench/hammer/claw_hammer/claw_hammer.urdf
To visualize all DexToolBench objects:
python dextoolbench/visualize_all_objects.py
<img width="1082" height="899" alt="image" src="https://github.com/user-attachments/assets/1d112fee-1f29-450d-87de-657895a8cab1" />
To visualize training objects:
python dextoolbench/generate_training_objects.py
python dextoolbench/visualize_training_objects.py
<img width="705" height="696" alt="image" src="https://github.com/user-attachments/assets/34f8df95-f2c5-478e-ace9-1e786ee97d7e" />
Visualizing the Task Trajectories
To visualize a DexToolBench task trajectory:
python dextoolbench/visualize_task.py \
--object_category hammer \
--object_name claw_hammer \
--task_name swing_down
To visualize all DexToolBench task trajectories:
python dextoolbench/visualize_all_tasks.py
https://github.com/user-attachments/assets/a5e631af-9afd-4410-9273-c4eab3c48e60
Manually Creating a Task Trajectory
To manually create a task trajectory:
python dextoolbench/interactive_create_task_trajectory.py \
--object_category hammer \
--object_name claw_hammer \
--task_name my_new_task
Evaluating a Trained Policy
To numerically evaluate a trained policy on DexToolBench:
python dextoolbench/run_all_evals.py
Manually Adjusting the Object Models
Use this to manually adjust the position and orientation of the object's origin frame, as well as the object's scale.
python dextoolbench/interactive_adjust_object.py \
--mesh_path assets/urdf/dextoolbench/hammer/claw_hammer/claw_hammer.obj \
--output_dir assets/urdf/dextoolbench/hammer/new_claw_hammer
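Conceptually, the adjustment applies a uniform scale and a rigid transform of the origin frame to the mesh vertices. A minimal numpy sketch (function and parameter names are illustrative, and the actual script may compose the transform differently, e.g. scale before rotation):

```python
from typing import Optional

import numpy as np


def adjust_vertices(vertices: np.ndarray,
                    scale: float = 1.0,
                    R: Optional[np.ndarray] = None,
                    t: Optional[np.ndarray] = None) -> np.ndarray:
    """Apply v' = scale * (R @ v + t) to each (x, y, z) vertex row.

    Illustrative only; not the interactive script's actual interface.
    """
    R = np.eye(3) if R is None else R
    t = np.zeros(3) if t is None else t
    return scale * (vertices @ R.T + t)


verts = np.array([[1.0, 0.0, 0.0]])
# Identity rotation/translation with scale 2 doubles all coordinates
print(adjust_vertices(verts, scale=2.0))  # [[2. 0. 0.]]
```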
Data Collection and Processing
To collect new task demonstrations from the real world, you need a ZED camera and the FoundationPose fork (installed in a separate environment). The pipeline is: record RGB-D video → extract object mesh with SAM 3D (TODO) → extract 6D poses with FoundationPose → process into DexToolBench task trajectories.
See data_collection_and_processing.md for the full step-by-step guide.
Deployment
Sim2Real
For Sim2Real policy deployment, we need to run the following nodes:
- RL Policy Node: Takes in observations, runs policy to get raw actions, converts to joint position targets, and publishes these targets.
- Goal Pose Node: Stores a sequence of goal poses, takes in object pose, updates current goal pose if dist(goal, object) < threshold, and publishes the current goal pose.
- Perception Node: Takes in RGB-D images, uses SAM and FoundationPose to estimate the object pose, and publishes it.
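The goal-advancement rule in the Goal Pose Node can be sketched as follows (a simplified position-only version; names and the 5 cm threshold are assumptions, and the real node likely also handles orientation and inter-node publishing):

```python
import numpy as np


def update_goal_index(goal_positions: np.ndarray,
                      goal_idx: int,
                      object_position: np.ndarray,
                      threshold: float = 0.05) -> int:
    """Advance to the next goal once the object is within `threshold` of the
    current goal position; hold the last goal otherwise."""
    dist = np.linalg.norm(object_position - goal_positions[goal_idx])
    if dist < threshold and goal_idx + 1 < len(goal_positions):
        goal_idx += 1
    return goal_idx


goals = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0]])
print(update_goal_index(goals, 0, np.array([0.01, 0.0, 0.0])))  # 1 (goal 0 reached)
print(update_goal_index(goals, 0, np.array([0.5, 0.0, 0.0])))   # 0 (not yet reached)
```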
