
EasyHOI

[CVPR2025] EasyHOI: Unleashing the Power of Large Models for Reconstructing Hand-Object Interactions in the Wild

Install / Use

/learn @lym29/EasyHOI

EasyHOI: Unleashing the Power of Large Models for Reconstructing Hand-Object Interactions in the Wild

GitHub · arXiv · Project Page

![EasyHOI teaser clip 1](docs/teaser_gif/clip1.gif)
![EasyHOI teaser clip 2](docs/teaser_gif/clip2.gif)
![EasyHOI teaser clip 3](docs/teaser_gif/clip3.gif)
![EasyHOI teaser clip 4](docs/teaser_gif/clip4.gif)
![EasyHOI teaser clip 5](docs/teaser_gif/clip5.gif)

EasyHOI is a pipeline designed for reconstructing hand-object interactions from single-view images.


✅ TODO

  • [x] Provide the code for utilizing the Tripo3D API to improve reconstruction quality. - Completed on 2024-12-24.
  • [x] Resolve issues in segmentation. - Completed on 2025-01-02.
  • [ ] Integrate the code execution environments into one.
  • [ ] Complete a one-click demo.

📑 Table of Contents

  1. Installation
  2. Usage
  3. Acknowledgements

🛠️ Installation

Download MANO models from the official website and place the mano folder inside the ./assets directory. After setting up, the directory structure should look like this:

assets/
├── anchor/
├── mano/
│   ├── models/
│   ├── webuser/
│   ├── __init__.py
│   └── __LICENSE.txt
├── contact_zones.pkl
└── mano_backface_ids.pkl
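
Optionally, you can verify this layout before proceeding. The following is a minimal sketch (not an EasyHOI utility); the paths simply mirror the tree above:

```python
# Sanity check that the MANO assets are in place (illustrative helper,
# not part of the EasyHOI codebase). Run from the repository root.
import os

required = [
    "assets/anchor",
    "assets/mano/models",
    "assets/mano/webuser",
    "assets/contact_zones.pkl",
    "assets/mano_backface_ids.pkl",
]

missing = [p for p in required if not os.path.exists(p)]
if missing:
    raise FileNotFoundError(f"Missing MANO assets: {missing}")
print("All MANO assets found.")
```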

Create the environment for optimization:

conda create -n easyhoi python=3.9
conda activate easyhoi
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
conda install -c fvcore -c iopath -c conda-forge fvcore iopath
conda env update --file environment.yaml

Install PyTorch3D following the official instructions.

Install HaMeR and ViTPose:

cd third_party
git clone https://github.com/ViTAE-Transformer/ViTPose.git
cd ./hamer
pip install -e .[all]
cd ../ViTPose
pip install -v -e .

Additional Environments

Since I haven’t yet resolved the conflicts between the environments, you will need to create three additional conda environments, named afford_diff, lisa, and instantmesh. Please refer to the links below to set them up.

  • afford_diff: https://github.com/NVlabs/affordance_diffusion/blob/master/docs/install.md

  • lisa: https://github.com/dvlab-research/LISA

  • instantmesh: https://github.com/TencentARC/InstantMesh?tab=readme-ov-file

Thanks to the authors of these wonderful projects. I will resolve the environment conflicts as soon as possible and provide a more user-friendly demo.

🚀 Usage

Initial Reconstruction of the Hand and Object

Set the data directory by running the following command:

export DATA_DIR="./data"

Place your images in the $DATA_DIR/images folder. If you prefer a different path, ensure it contains a subfolder named images.
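
To catch path mistakes early, a small check can confirm the layout (an illustrative snippet, not a script shipped with the repo):

```python
# Verify that DATA_DIR contains a non-empty images/ subfolder before
# running the pipeline (illustrative helper, not part of EasyHOI).
import os

data_dir = os.environ.get("DATA_DIR", "./data")
img_dir = os.path.join(data_dir, "images")
assert os.path.isdir(img_dir), f"expected an 'images' subfolder at {img_dir}"
names = [n for n in os.listdir(img_dir) if n.lower().endswith((".jpg", ".jpeg", ".png"))]
print(f"{len(names)} image(s) found in {img_dir}")
```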

Step 1: Hand pose estimation; get the hand mask from HaMeR

conda activate easyhoi
python preprocess/recon_hand.py --data_dir $DATA_DIR

Step 2: Segment the hand and object masks from the image before inpainting

export TRANSFORMERS_CACHE="$HOME/.cache/huggingface/hub"  # set this to your own Hugging Face cache directory
conda activate lisa
CUDA_VISIBLE_DEVICES=0 python preprocess/lisa_ho_detect.py --seg_hand --skip --load_in_8bit --data_dir $DATA_DIR
CUDA_VISIBLE_DEVICES=0 python preprocess/lisa_ho_detect.py --skip --load_in_8bit --data_dir $DATA_DIR

Step 3: Inpaint

conda activate afford_diff
python preprocess/inpaint.py --data_dir $DATA_DIR --save_dir $DATA_DIR/obj_recon/ --img_folder images --inpaint --skip

Step 4: Segment the inpainted object to get the inpainted mask

conda activate easyhoi
python preprocess/seg_image.py --data_dir $DATA_DIR

Step 5: Reconstruct the object

Use InstantMesh

conda activate instantmesh
export HUGGINGFACE_HUB_CACHE="$HOME/.cache/huggingface/hub"  # set this to your own Hugging Face cache directory
python preprocess/instantmesh_gen.py preprocess/configs/instant-mesh-large.yaml $DATA_DIR

Use Tripo3D

To use Tripo3D for reconstruction, you need to generate an API key following the instructions in the Tripo AI Docs. Then replace the api_key in preprocess/tripo3d_gen.py with your own key. After updating the API key, execute the following command in your terminal:

python preprocess/tripo3d_gen.py --data_dir $DATA_DIR
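
If you would rather not hardcode the key, one option is to read it from an environment variable. The sketch below illustrates that pattern; the TRIPO_API_KEY name is an assumption, not an EasyHOI convention, and you would adapt preprocess/tripo3d_gen.py to read it wherever api_key is currently assigned:

```python
# Sketch: load the Tripo3D key from the environment instead of hardcoding it.
# TRIPO_API_KEY is a hypothetical variable name chosen for illustration.
import os

api_key = os.environ.get("TRIPO_API_KEY")
if not api_key:
    raise RuntimeError("Set TRIPO_API_KEY before running tripo3d_gen.py")
```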

Step 6: Fix the object mesh to obtain a watertight mesh

conda activate easyhoi
python preprocess/resample_mesh.py --data_dir $DATA_DIR [--resample]
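
To confirm the repaired meshes are in fact watertight, a quick check with trimesh can help. This is a sketch assuming trimesh is installed in the easyhoi environment; the glob pattern is an assumption about where resample_mesh.py writes its outputs:

```python
# Illustrative watertightness check using trimesh. The output location
# ($DATA_DIR/obj_recon) is an assumption; adjust the pattern to your setup.
import glob
import trimesh

for path in glob.glob("data/obj_recon/**/*.obj", recursive=True):
    mesh = trimesh.load(path, force="mesh")
    status = "watertight" if mesh.is_watertight else "NOT watertight"
    print(f"{path}: {status}")
```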

Optimization

conda activate easyhoi
python src/optim_easyhoi.py -cn optim_teaser        # use InstantMesh results
python src/optim_easyhoi.py -cn optim_teaser_tripo  # use Tripo3D results
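
Until the one-click demo from the TODO list lands, the preprocessing steps above can be chained from a single script with conda run, which executes each command in its own environment. This is a sketch assembled from the commands in this README, not an official EasyHOI entry point:

```python
# Sketch: run the documented preprocessing steps end to end via `conda run`,
# so each stage uses its own conda environment. Commands mirror Steps 1-6.
import os
import subprocess

DATA_DIR = os.environ.get("DATA_DIR", "./data")

STEPS = [
    # (conda environment, command)
    ("easyhoi", ["python", "preprocess/recon_hand.py", "--data_dir", DATA_DIR]),
    ("lisa", ["python", "preprocess/lisa_ho_detect.py", "--seg_hand",
              "--skip", "--load_in_8bit", "--data_dir", DATA_DIR]),
    ("lisa", ["python", "preprocess/lisa_ho_detect.py",
              "--skip", "--load_in_8bit", "--data_dir", DATA_DIR]),
    ("afford_diff", ["python", "preprocess/inpaint.py", "--data_dir", DATA_DIR,
                     "--save_dir", f"{DATA_DIR}/obj_recon/",
                     "--img_folder", "images", "--inpaint", "--skip"]),
    ("easyhoi", ["python", "preprocess/seg_image.py", "--data_dir", DATA_DIR]),
    ("instantmesh", ["python", "preprocess/instantmesh_gen.py",
                     "preprocess/configs/instant-mesh-large.yaml", DATA_DIR]),
    ("easyhoi", ["python", "preprocess/resample_mesh.py", "--data_dir", DATA_DIR]),
]

for env, cmd in STEPS:
    print(f"[{env}] {' '.join(cmd)}")
    subprocess.run(["conda", "run", "-n", env, "--no-capture-output", *cmd],
                   check=True)
```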


🙏 Acknowledgements

We would like to express our gratitude to the authors and contributors of the projects this pipeline builds on, including HaMeR, ViTPose, LISA, Affordance Diffusion, InstantMesh, and Tripo3D.

Citation

If you find our work useful, please consider citing us using the following BibTeX entry:

@inproceedings{liu2025easyhoi,
  title={EasyHOI: Unleashing the Power of Large Models for Reconstructing Hand-Object Interactions in the Wild},
  author={Liu, Yumeng and Long, Xiaoxiao and Yang, Zemin and Liu, Yuan and Habermann, Marc and Theobalt, Christian and Ma, Yuexin and Wang, Wenping},
  booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
  pages={7037--7047},
  year={2025}
}
