EasyHOI
[CVPR2025] EasyHOI: Unleashing the Power of Large Models for Reconstructing Hand-Object Interactions in the Wild
<div style="text-align: center;"> <img src="docs/teaser_gif/clip1.gif" alt="Description of GIF" style="width:100%;"> </div> <div style="text-align: center;"> <img src="docs/teaser_gif/clip2.gif" alt="Description of GIF" style="width:100%;"> </div> <div style="text-align: center;"> <img src="docs/teaser_gif/clip3.gif" alt="Description of GIF" style="width:100%;"> </div> <div style="text-align: center;"> <img src="docs/teaser_gif/clip4.gif" alt="Description of GIF" style="width:100%;"> </div> <div style="text-align: center;"> <img src="docs/teaser_gif/clip5.gif" alt="Description of GIF" style="width:100%;"> </div>

EasyHOI is a pipeline for reconstructing hand-object interactions from single-view images.
✅ TODO
- [x] Provide the code for utilizing the Tripo3D API to improve reconstruction quality - Completed on 2024-12-24.
- [x] Resolve issues in segmentation. - Completed on 2025-01-02
- [ ] Integrate the code execution environments into one.
- [ ] Complete a one-click demo.
🛠️ Installation
Download MANO models from the official website and place the mano folder inside the ./assets directory. After setting up, the directory structure should look like this:
assets/
├── anchor/
├── mano/
│ ├──models/
│ ├──webuser/
│ ├──__init__.py
│ ├──__LICENSE.txt
├── contact_zones.pkl
├── mano_backface_ids.pkl
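To catch setup mistakes early, the layout above can be verified with a short Python snippet. This is a hypothetical helper, not part of the repo; the paths simply mirror the tree shown above:

```python
from pathlib import Path

def missing_assets(root: Path) -> list:
    """Return the expected asset paths (from the tree above) that do not exist."""
    required = [
        root / "anchor",
        root / "mano" / "models",
        root / "mano" / "webuser",
        root / "contact_zones.pkl",
        root / "mano_backface_ids.pkl",
    ]
    return [p for p in required if not p.exists()]

if __name__ == "__main__":
    missing = missing_assets(Path("./assets"))
    print("All assets found." if not missing else f"Missing: {missing}")
```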
Create the environment for optimization:
conda create -n easyhoi python=3.9
conda activate easyhoi
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
conda install -c fvcore -c iopath -c conda-forge fvcore iopath
conda env update --file environment.yaml
Install PyTorch3D following the official instructions.
Install HaMeR and ViTPose:
cd third_party
git clone https://github.com/ViTAE-Transformer/ViTPose.git
cd ./hamer
pip install -e .[all]
cd ../ViTPose
pip install -v -e .
Additional Environments
Since I haven't resolved the conflicts between the environments yet, you will need to create three additional virtual environments named afford_diff, lisa, and instantmesh. Please refer to the links below to set them up:
- afford_diff: https://github.com/NVlabs/affordance_diffusion/blob/master/docs/install.md
- lisa: https://github.com/dvlab-research/LISA
- instantmesh: https://github.com/TencentARC/InstantMesh?tab=readme-ov-file
Thanks to the authors of these wonderful projects. I will resolve the environment conflicts as soon as possible and provide a more user-friendly demo.
🚀 Usage
Initial Reconstruction of the Hand and Object
Set the data directory by running the following command:
export DATA_DIR="./data"
Place your images in the $DATA_DIR/images folder. If you prefer a different path, ensure it contains a subfolder named images.
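As a quick sanity check, the expected input layout can be created and verified with a short snippet (a sketch that only mirrors the folder convention described above):

```python
import os
from pathlib import Path

# Sketch: $DATA_DIR (default ./data) must contain an `images` subfolder.
data_dir = Path(os.environ.get("DATA_DIR", "./data"))
images = data_dir / "images"
images.mkdir(parents=True, exist_ok=True)  # create it if missing
print(f"Put your input images in: {images.resolve()}")
```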
Step 1: Hand pose estimation; get the hand mask from HaMeR
conda activate easyhoi
python preprocess/recon_hand.py --data_dir $DATA_DIR
Step 2: Segment the hand mask and object mask from the image before inpainting
export TRANSFORMERS_CACHE="/path/to/your/huggingface/hub" # set this to your own Hugging Face cache directory
conda activate lisa
CUDA_VISIBLE_DEVICES=0 python preprocess/lisa_ho_detect.py --seg_hand --skip --load_in_8bit --data_dir $DATA_DIR
CUDA_VISIBLE_DEVICES=0 python preprocess/lisa_ho_detect.py --skip --load_in_8bit --data_dir $DATA_DIR
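The two masks from this step keep the hand and object regions separate for the later stages. A toy illustration (pure Python, not the repo's code) of excluding hand pixels from the object mask:

```python
# Toy illustration: given binary hand and object masks of the same shape,
# drop object-mask pixels that fall inside the hand region.
def exclude_hand(obj_mask, hand_mask):
    """Zero out object-mask pixels that overlap the hand mask."""
    return [[bool(o and not h) for o, h in zip(o_row, h_row)]
            for o_row, h_row in zip(obj_mask, hand_mask)]

obj = [[1, 1], [1, 0]]
hand = [[0, 1], [0, 0]]
print(exclude_hand(obj, hand))  # [[True, False], [True, False]]
```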
Step 3: Inpaint
conda activate afford_diff
python preprocess/inpaint.py --data_dir $DATA_DIR --save_dir $DATA_DIR/obj_recon/ --img_folder images --inpaint --skip
Step 4: Segment the inpainted object to get the inpainted mask
conda activate easyhoi
python preprocess/seg_image.py --data_dir $DATA_DIR
Step 5: Reconstruct the object
Use InstantMesh
conda activate instantmesh
export HUGGINGFACE_HUB_CACHE="/path/to/your/huggingface/hub" # set this to your own Hugging Face cache directory
python preprocess/instantmesh_gen.py preprocess/configs/instant-mesh-large.yaml $DATA_DIR
Use Tripo3D
To use Tripo3D for reconstruction, you need to generate an API key following the instructions in the Tripo AI Docs. Then replace the api_key in preprocess/tripo3d_gen.py with your own key.
After updating the API key, execute the following command in your terminal:
python preprocess/tripo3d_gen.py --data_dir $DATA_DIR
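Rather than hardcoding the key in the script, one option is to read it from an environment variable. This is a hypothetical helper, not something the repo provides, and the variable name `TRIPO_API_KEY` is my choice:

```python
import os

# Hypothetical alternative to editing preprocess/tripo3d_gen.py directly:
# read the key from an environment variable (TRIPO_API_KEY is an assumed name).
def get_tripo_api_key() -> str:
    key = os.environ.get("TRIPO_API_KEY", "")
    if not key:
        raise RuntimeError("Set TRIPO_API_KEY to your Tripo3D API key")
    return key
```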
Step 6: Fix the object mesh to obtain a watertight mesh
conda activate easyhoi
python preprocess/resample_mesh.py --data_dir $DATA_DIR [--resample]
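For intuition on what "watertight" means here: a triangle mesh is closed when every edge is shared by exactly two faces. A minimal pure-Python check (an illustration, not the repo's resampling code):

```python
from collections import Counter

def is_watertight(faces):
    """A closed triangle mesh has every edge shared by exactly two faces."""
    edge_counts = Counter()
    for a, b, c in faces:
        for e in ((a, b), (b, c), (c, a)):
            edge_counts[tuple(sorted(e))] += 1  # undirected edge
    return all(n == 2 for n in edge_counts.values())

# A tetrahedron (4 triangles over vertices 0..3) is a closed surface.
tet = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (2, 3, 0)]
print(is_watertight(tet))      # True
print(is_watertight(tet[:3]))  # False: boundary edges appear only once
```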
Optimization
conda activate easyhoi
python src/optim_easyhoi.py -cn optim_teaser # use InstantMesh results
python src/optim_easyhoi.py -cn optim_teaser_tripo # use Tripo3D results
🙏 Acknowledgements
We would like to express our gratitude to the authors and contributors of the projects used in this pipeline, including HaMeR, ViTPose, Affordance Diffusion, LISA, InstantMesh, and Tripo3D.
📖 Citation
If you find our work useful, please consider citing us using the following BibTeX entry:
@inproceedings{liu2025easyhoi,
title={EasyHOI: Unleashing the Power of Large Models for Reconstructing Hand-Object Interactions in the Wild},
author={Liu, Yumeng and Long, Xiaoxiao and Yang, Zemin and Liu, Yuan and Habermann, Marc and Theobalt, Christian and Ma, Yuexin and Wang, Wenping},
booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
pages={7037--7047},
year={2025}
}
