TokenHSI
[CVPR 2025 Oral] TokenHSI: Unified Synthesis of Physical Human-Scene Interactions through Task Tokenization
🏠 About
<div style="text-align: center;"> <img src="https://github.com/liangpan99/TokenHSI/blob/page/static/images/teaser.png" width=100% > </div>

Introducing TokenHSI, a unified model that enables physics-based characters to perform diverse human-scene interaction tasks. It excels at seamlessly unifying multiple <b>foundational HSI skills</b> within a single transformer network and flexibly adapting learned skills to <b>challenging new tasks</b>, including skill composition, object/terrain shape variation, and long-horizon task completion.

📹 Demo
<p align="center"> <img src="assets/longterm_demo_isaacgym.gif" align="center" width=60% > <br> Long-horizon Task Completion in a Complex Dynamic Environment </p>

<!-- ## 🕹 Pipeline <div style="text-align: center;"> <img src="https://github.com/liangpan99/TokenHSI/blob/page/static/images/pipeline.jpg" width=100% > </div> -->

🔥 News
- [2025-04-07] <b>Released full code. Please download the latest datasets and models from Hugging Face.</b>
- [2025-04-06] Released three skill composition tasks with pre-trained models.
- [2025-04-05] TokenHSI has been selected as an oral paper at CVPR 2025! 🎉
- [2025-04-03] Released long-horizon task completion with a pre-trained model.
- [2025-04-01] We just updated the Getting Started section. You can play TokenHSI now!
- [2025-03-31] We've released the codebase and checkpoint for the foundational skill learning part.
📝 TODO List
- [x] Release foundational skill learning
- [x] Release policy adaptation - skill composition
- [x] Release policy adaptation - object shape variation
- [x] Release policy adaptation - terrain shape variation
- [x] Release policy adaptation - long-horizon task completion
📖 Getting Started
Dependencies
Follow the instructions below:
- Create a new conda environment and install PyTorch:

      conda create -n tokenhsi python=3.8
      conda activate tokenhsi
      conda install pytorch==2.0.0 torchvision==0.15.0 torchaudio==2.0.0 pytorch-cuda=11.8 -c pytorch -c nvidia
      pip install -r requirements.txt

- Install IsaacGym Preview 4:

      cd IsaacGym_Preview_4_Package/isaacgym/python
      pip install -e .
      # add your conda env path to ~/.bashrc
      export LD_LIBRARY_PATH="your_conda_env_path/lib:$LD_LIBRARY_PATH"

- Install pytorch3d (optional; needed only to run the long-horizon task completion demo). We use pytorch3d to rapidly render height maps of dynamic objects for thousands of simulation environments.

      conda install -c fvcore -c iopath -c conda-forge fvcore iopath
      pip install git+https://github.com/facebookresearch/pytorch3d.git@v0.7.7

- Download SMPL body models and organize them as follows:

      |-- assets
      |-- body_models
          |-- smpl
              |-- SMPL_FEMALE.pkl
              |-- SMPL_MALE.pkl
              |-- SMPL_NEUTRAL.pkl
              |-- ...
      |-- lpanlib
      |-- tokenhsi
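
The SMPL layout above can be sanity-checked before launching anything. The following is a minimal sketch, not part of the TokenHSI codebase: the helper name `missing_smpl_files` is hypothetical, and the `body_models/smpl` location used in the usage note assumes that `body_models` sits at the repository root.

```python
from pathlib import Path

# File names taken from the layout above.
EXPECTED_SMPL_FILES = ("SMPL_FEMALE.pkl", "SMPL_MALE.pkl", "SMPL_NEUTRAL.pkl")


def missing_smpl_files(smpl_dir):
    """Return the expected SMPL model files that are absent from smpl_dir.

    Hypothetical helper for illustration; it only checks that the files
    exist and does not validate their contents.
    """
    smpl_dir = Path(smpl_dir)
    return [name for name in EXPECTED_SMPL_FILES
            if not (smpl_dir / name).is_file()]
```

Calling `missing_smpl_files("body_models/smpl")` from the repository root (an assumed location) returns an empty list when all three model files are in place.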
Motion & Object Data
We provide two ways to obtain the motion and object data.
- Download pre-processed data from Hugging Face. Follow the instructions on the dataset page.
- Generate data from source:
  - Download AMASS (SMPL-X Neutral), SAMP, and OMOMO.
  - Modify the dataset paths in the tokenhsi/data/dataset_cfg.yaml file:

        # Motion datasets, please use your own paths
        amass_dir: "/YOUR_PATH/datasets/AMASS"
        samp_pkl_dir: "/YOUR_PATH/datasets/samp"
        omomo_dir: "/YOUR_PATH/datasets/OMOMO/data"

  - Download the pre-processed data from Hugging Face as well; in this case only the object data is required.
  - Run the following script:

        bash tokenhsi/scripts/gen_data.sh
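
Before running the generation script, it can help to confirm that the configured dataset directories exist. The sketch below is an illustration only: `missing_dataset_dirs` is a hypothetical helper, and it assumes the three keys shown above appear at the top level of the loaded configuration, which the source does not confirm.

```python
from pathlib import Path

# Keys taken from the dataset_cfg.yaml excerpt above.
DATASET_KEYS = ("amass_dir", "samp_pkl_dir", "omomo_dir")


def missing_dataset_dirs(cfg):
    """Return {key: path} for every dataset key whose directory is unusable.

    cfg is a plain dict (assumed: the three keys live at its top level).
    A key that is absent from cfg is reported with the value None.
    """
    problems = {}
    for key in DATASET_KEYS:
        path = cfg.get(key)
        if path is None or not Path(path).is_dir():
            problems[key] = path
    return problems
```

One way to obtain `cfg` is `yaml.safe_load` from PyYAML applied to the opened config file; an empty result from `missing_dataset_dirs(cfg)` means all three directories were found.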
Checkpoints
Download checkpoints from Hugging Face. Follow the instructions on the model page.
🕹️ Play TokenHSI!
- Single task policy trained with AMP
  - Path-following

        # test
        sh tokenhsi/scripts/single_task/traj_test.sh
        # train
        sh tokenhsi/scripts/single_task/traj_train.sh

  - Sitting

        # test
        sh tokenhsi/scripts/single_task/sit_test.sh
        # train
        sh tokenhsi/scripts/single_task/sit_train.sh

  - Climbing

        # test
        sh tokenhsi/scripts/single_task/climb_test.sh
        # train
        sh tokenhsi/scripts/single_task/climb_train.sh

  - Carrying

        # test
        sh tokenhsi/scripts/single_task/carry_test.sh
        # train
        sh tokenhsi/scripts/single_task/carry_train.sh
- TokenHSI's unified transformer policy
  - Foundational Skill Learning

        # test
        sh tokenhsi/scripts/tokenhsi/stage1_test.sh
        # eval
        sh tokenhsi/scripts/tokenhsi/stage1_eval.sh carry # we need to specify a task to eval, e.g., traj, sit, climb, or carry.
        # train
        sh tokenhsi/scripts/tokenhsi/stage1_train.sh

    If you successfully run the test command, you will see:

    <p align="center"> <img src="assets/stage1_demo.gif" align="center" width=60% > </p>
  - Policy Adaptation - Skill Composition
    - Traj + Carry

          # test
          sh tokenhsi/scripts/tokenhsi/stage2_comp_traj_carry_test.sh
          # eval
          sh tokenhsi/scripts/tokenhsi/stage2_comp_traj_carry_eval.sh
          # train
          sh tokenhsi/scripts/tokenhsi/stage2_comp_traj_carry_train.sh

    - Sit + Carry

          # test
          sh tokenhsi/scripts/tokenhsi/stage2_comp_sit_carry_test.sh
          # eval
          sh tokenhsi/scripts/tokenhsi/stage2_comp_sit_carry_eval.sh
          # train
          sh tokenhsi/scripts/tokenhsi/stage2_comp_sit_carry_train.sh

    - Climb + Carry

          # test
          sh tokenhsi/scripts/tokenhsi/stage2_comp_climb_carry_test.sh
          # eval
          sh tokenhsi/scripts/tokenhsi/stage2_comp_climb_carry_eval.sh
          # train
          sh tokenhsi/scripts/tokenhsi/stage2_comp_climb_carry_train.sh
  - Policy Adaptation - Object Shape Variation
    - Carrying: Box-2-Chair

          # test
          sh tokenhsi/scripts/tokenhsi/stage2_object_chair_test.sh
          # eval
          sh tokenhsi/scripts/tokenhsi/stage2_object_chair_eval.sh
          # train
          sh tokenhsi/scripts/tokenhsi/stage2_object_chair_train.sh

    - Carrying: Box-2-Table

          # test
          sh tokenhsi/scripts/tokenhsi/stage2_object_table_test.sh
          # eval
          sh tokenhsi/scripts/tokenhsi/stage2_object_table_eval.sh
          # train
          sh tokenhsi/scripts/tokenhsi/stage2_object_table_train.sh
  - Policy Adaptation - Terrain Shape Variation
    - Path-following

          # test
          sh tokenhsi/scripts/tokenhsi/stage2_terrain_traj_test.sh
          # eval
          sh tokenhsi/scripts/tokenhsi/stage2_terrain_traj_eval.sh
          # train
          sh
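
The single-task and stage-1 script paths listed above follow a regular naming pattern, so they can be assembled programmatically when scripting several runs. This is a rough sketch under that assumption; `single_task_command` and `stage1_command` are hypothetical helpers, not functions from the TokenHSI repository, and they only cover the scripts spelled out in full above.

```python
# Task names taken from the single-task scripts and the stage-1 eval comment above.
TASKS = ("traj", "sit", "climb", "carry")


def single_task_command(task, mode):
    """Build the argument list for a single-task script (mode: test or train)."""
    if task not in TASKS:
        raise ValueError(f"unknown task: {task!r}")
    if mode not in ("test", "train"):
        raise ValueError(f"unknown mode: {mode!r}")
    return ["sh", f"tokenhsi/scripts/single_task/{task}_{mode}.sh"]


def stage1_command(mode, task=None):
    """Build the argument list for a stage-1 script (mode: test, eval, or train).

    The eval script takes the task to evaluate as its argument, so a task
    is required in that mode and ignored otherwise.
    """
    if mode not in ("test", "eval", "train"):
        raise ValueError(f"unknown mode: {mode!r}")
    cmd = ["sh", f"tokenhsi/scripts/tokenhsi/stage1_{mode}.sh"]
    if mode == "eval":
        if task not in TASKS:
            raise ValueError("eval requires one of: " + ", ".join(TASKS))
        cmd.append(task)
    return cmd
```

The returned list can be passed to `subprocess.run(..., check=True)` from the repository root, for example `subprocess.run(stage1_command("eval", "carry"), check=True)`.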
