InterPose

(3DV 2026) PyTorch implementation of “InterPose: Learning to Generate Human-Object Interactions from Large-Scale Web Videos”

(3DV 2026) InterPose: Learning to Generate Human-Object Interactions from Large-Scale Web Videos

This is the official implementation of the paper "InterPose: Learning to Generate Human-Object Interactions from Large-Scale Web Videos".

[Project Page] [InterPose data] [Paper]

InterPose Teaser

TODO

  • [x] Release InterPose dataset
  • [x] Data collection framework
  • [ ] Spatial control experiments (Training and evaluation)
    • [x] Physics-based model: MaskedMimic
    • [ ] Kinematics-based model: OmniControl
  • [x] Zero-shot human-object interaction experiments (Evaluation)
  • [x] Application: HOI-Agent (integrate LLM to enable zero-shot HOI generation in 3D scenes)

Data collection framework

The automatic human motion data collection and annotation pipeline is now released in the InterPose-data-collection repo.

Environment Setup

Note: This code was developed with Python 3.8, CUDA 11.7 and PyTorch 2.0.0.

Clone the repo.

git clone git@github.com:Mael-zys/InterPose.git --recursive
cd InterPose/

Install the environment.

bash scripts/install_InterPose.sh
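
After installation, it can help to confirm that the active environment matches the versions mentioned in the note above (Python 3.8, CUDA 11.7, PyTorch 2.0.0). The following is a minimal sketch, not part of the repository: the function names `major_minor`, `installed_versions`, and `mismatches` are hypothetical, and only major and minor version numbers are compared.

```python
import sys

# Versions the README says the code was developed with.
EXPECTED = {"python": (3, 8), "cuda": (11, 7), "torch": (2, 0)}


def major_minor(version):
    """Turn a version string such as '2.0.0+cu117' into (2, 0)."""
    numbers = version.split("+")[0].split(".")
    return int(numbers[0]), int(numbers[1])


def installed_versions():
    """Collect (major, minor) versions of Python, PyTorch and CUDA, if present."""
    found = {"python": tuple(sys.version_info[:2])}
    try:
        import torch
    except ImportError:
        return found  # PyTorch is not installed in this interpreter
    found["torch"] = major_minor(torch.__version__)
    if torch.version.cuda is not None:  # None on CPU-only builds
        found["cuda"] = major_minor(torch.version.cuda)
    return found


def mismatches(found, expected=EXPECTED):
    """Return {name: (found, expected)} for every missing or differing entry."""
    return {
        name: (found.get(name), want)
        for name, want in expected.items()
        if found.get(name) != want
    }


if __name__ == "__main__":
    for name, (have, want) in mismatches(installed_versions()).items():
        print(f"{name}: found {have}, README mentions {want}")
```

A mismatch does not necessarily mean the code will fail; it only flags a difference from the setup the code was developed with.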

Prerequisites

  • Please download SMPL-X and place the model files in data/smpl_all_models/.

  • Please download all the processed data and place it in processed_data.

  • Install the ProtoMotion (MaskedMimic) environment and download the pretrained models by following the README in third-party/ProtoMotion_for_InterPose.

  • (Optional) If you would like to generate visualizations, please download Blender first.
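
Before running evaluation, a quick check that the directories named in the list above exist can save a failed run. This is a minimal sketch under the assumption that it is run from the repository root; `missing_prerequisites` is a hypothetical helper, not a script shipped with the repo, and it only checks that the directories exist, not that their contents are complete.

```python
from pathlib import Path

# Paths named in the prerequisites list, relative to the repository root.
REQUIRED_DIRS = [
    "data/smpl_all_models",
    "processed_data",
    "third-party/ProtoMotion_for_InterPose",
]


def missing_prerequisites(repo_root="."):
    """Return the required directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [d for d in REQUIRED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_prerequisites()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All prerequisite directories are present.")
```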

Evaluation: zero-shot generation on the OMOMO and BEHAVE datasets

bash scripts/eval_zero_shot_HOI_generation.sh results/masked_mimic_merged/last.ckpt

Here is an example visualization script:

bash scripts/visualization.sh

HOI-Agent: integrate LLM to enable zero-shot generation in 3D scenes

First, set OPENAI_API_KEY in run_HOI_agent.py. Then run the following script:

bash scripts/run_HOI_agent.sh results/masked_mimic_merged/last.ckpt
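
If you prefer not to write the key into a source file, one common alternative is to read it from an environment variable. The sketch below is an assumption about how this could be done, not the repository's method: `load_openai_api_key` is a hypothetical helper, and how run_HOI_agent.py actually consumes the key is not shown here.

```python
import os


def load_openai_api_key(env=None):
    """Return OPENAI_API_KEY from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before running the HOI-Agent script."
        )
    return key
```

With a helper like this, the assignment in run_HOI_agent.py could become `OPENAI_API_KEY = load_openai_api_key()`, assuming the variable is exported in the shell that launches the script.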

Citation

@article{zhang2025interpose,
  title={InterPose: Learning to Generate Human-Object Interactions from Large-Scale Web Videos},
  author={Zhang, Yangsong and Butt, Abdul Ahad and Varol, G{\"u}l and Laptev, Ivan},
  journal={arXiv},
  year={2025},
}

Related Repos

We adapted some code from other repos for data processing, learning, evaluation, etc. Please check out these useful repos:

https://github.com/lijiaman/chois_release
https://github.com/NVlabs/ProtoMotions/tree/main
https://github.com/lijiaman/omomo_release