DIR

[ICCV 2023 Oral] Decoupled Iterative Refinement Framework for Interacting Hands Reconstruction from a Single RGB Image

<div align="center">
<h1>Decoupled Iterative Refinement Framework for Interacting Hands Reconstruction from a Single RGB Image</h1>
<div>
<a href='https://scholar.google.com/citations?user=TzpecsAAAAAJ' target='_blank'>Pengfei Ren<sup>1,2</sup></a>&emsp;
<a href='https://scholar.google.com/citations?user=v8TFZI4AAAAJ' target='_blank'>Chao Wen<sup>2</sup></a>&emsp;
<a href='https://scholar.google.com/citations?user=3hSD41oAAAAJ' target='_blank'>Xiaozheng Zheng<sup>1,2</sup></a>&emsp;
<a href='https://scholar.google.com/citations?&user=ECKq3aUAAAAJ' target='_blank'>Zhou Xue<sup>2</sup></a>
<br>
<a href='https://scholar.google.com/citations?user=dwhbTsEAAAAJ' target='_blank'>Haifeng Sun<sup>1</sup></a>&emsp;
<a href='https://scholar.google.com/citations?user=2W2h0SwAAAAJ' target='_blank'>Qi Qi<sup>1</sup></a>&emsp;
<a href='https://scholar.google.com/citations?user=H441DjwAAAAJ' target='_blank'>Jingyu Wang<sup>1</sup></a>&emsp;
<a href="https://dblp.org/pid/60/4951.html">Jianxin Liao<sup>1*</sup></a>
</div>
<div>
<sup>1</sup>Beijing University of Posts and Telecommunications &emsp; <sup>2</sup>PICO IDL ByteDance &emsp;
</div>
<div>
<sup>*</sup>Corresponding author
</div>
<div>
:star_struck: <strong>Accepted to ICCV 2023 as Oral</strong>
</div>

<strong>Our method, DIR, achieves accurate and robust reconstruction of interacting hands.</strong>

:open_book: For more visual results, check out our <a href="https://pengfeiren96.github.io/DIR/" target="_blank">project page</a>.

<h4 align="center"> <a href="https://pengfeiren96.github.io/DIR/" target='_blank'>[Project Page]</a> • <a href="https://arxiv.org/abs/2302.02410" target='_blank'>[arXiv]</a> </h4> </div>

:mega: Updates

[10/2023] Released the pre-trained models 👏!

[07/2023] DIR is accepted to ICCV 2023 (Oral) :partying_face:!

:love_you_gesture: Citation

If you find our work useful for your research, please consider citing the paper:

@inproceedings{ren2023decoupled,
    title={Decoupled Iterative Refinement Framework for Interacting Hands Reconstruction from a Single RGB Image},
    author={Ren, Pengfei and Wen, Chao and Zheng, Xiaozheng and Xue, Zhou and Sun, Haifeng and Qi, Qi and Wang, Jingyu and Liao, Jianxin},
    booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    year={2023}
}

:desktop_computer: Data Preparation

Download the necessary assets (misc.tar.gz) and extract them.
Download the InterHand2.6M dataset and extract it.
Process the dataset using the code provided by IntagHand:

python dataset/interhand.py --data_path PATH_OF_INTERHAND2.6M --save_path ./data/interhand2.6m/

:desktop_computer: Installation

Requirements

Python >= 3.8
PyTorch >= 1.10
pytorch3d >= 0.7.0
scikit-image==0.17.1
timm==0.6.11
trimesh==3.9.29
openmesh==1.1.3
pymeshlab==2021.7
chumpy
einops
imgaug
manopth
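
The version pins above can be checked against an existing environment before training. The following is a minimal sketch, not part of the DIR repository: the helper names (`version_tuple`, `satisfies`, `check_environment`) are hypothetical, the distribution names are assumed to match PyPI (for example `torch` for PyTorch), and the version parsing is deliberately simplified (it ignores local tags such as `+cu113` and does not implement full PEP 440 ordering).

```python
import re
from importlib import metadata

# Pins copied from the Requirements list; distribution names are assumptions.
REQUIREMENTS = {
    "torch": (">=", "1.10"),
    "pytorch3d": (">=", "0.7.0"),
    "scikit-image": ("==", "0.17.1"),
    "timm": ("==", "0.6.11"),
    "trimesh": ("==", "3.9.29"),
    "openmesh": ("==", "1.1.3"),
    "pymeshlab": ("==", "2021.7"),
}


def version_tuple(version):
    """Turn '1.11.0+cu113' into (1, 11): drop the local tag and trailing zeros."""
    public = version.split("+", 1)[0]
    parts = [int(n) for n in re.findall(r"\d+", public)]
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def satisfies(installed, op, required):
    """Compare two version strings; only '>=' and '==' are supported."""
    have, want = version_tuple(installed), version_tuple(required)
    return have >= want if op == ">=" else have == want


def check_environment(requirements=REQUIREMENTS):
    """Return {package: problem description} for every unmet requirement."""
    problems = {}
    for name, (op, required) in requirements.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if not satisfies(installed, op, required):
            problems[name] = f"found {installed}, need {op}{required}"
    return problems


if __name__ == "__main__":
    for name, problem in check_environment().items():
        print(f"{name}: {problem}")
```

Running the script inside the target environment prints one line per missing or mismatched package and prints nothing when every pin is met. Unpinned packages (chumpy, einops, imgaug, manopth) are left out because the list gives no version to compare against.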

Setup with Conda

# create and activate conda env
conda create -n dir python=3.8
conda activate dir
# install torch
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 --extra-index-url https://download.pytorch.org/whl/cu113
# install pytorch3d
pip install fvcore iopath
pip install --no-index --no-cache-dir pytorch3d -f https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/py38_cu113_pyt1110/download.html
# install other requirements
cd DIR
pip install -r ./requirements.txt
# install manopth
cd manopth
pip install -e .

:train: Training

python train.py

:running_woman: Evaluation

Download the pre-trained models from Google Drive, then run:

python apps/eval_interhand.py --data_path ./interhand2.6m/ --model ./checkpoint/xxx

To align on a different joint, set root_joint (0: wrist, 9: MCP).
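
To illustrate why the choice of alignment joint changes the reported numbers, here is a generic sketch of a root-aligned mean per-joint error. It is an assumption about how such a metric is commonly computed, not the code in `apps/eval_interhand.py`, which may differ in details. The function name `root_aligned_joint_error` is hypothetical, and joints are plain `(x, y, z)` tuples in millimetres.

```python
import math

WRIST, MCP = 0, 9  # joint ids named in the text


def root_aligned_joint_error(pred, gt, root_joint=WRIST):
    """Mean Euclidean distance per joint after moving root_joint to the origin."""

    def center(joints):
        root = joints[root_joint]
        return [tuple(c - r for c, r in zip(joint, root)) for joint in joints]

    pred_c, gt_c = center(pred), center(gt)
    return sum(math.dist(p, g) for p, g in zip(pred_c, gt_c)) / len(gt)


# Toy data: 21 joints spaced 1 mm apart along the x axis.
gt = [(float(i), 0.0, 0.0) for i in range(21)]

# A pure translation of the whole hand is removed by the alignment.
shifted = [(x + 5.0, y - 3.0, z + 2.0) for x, y, z in gt]
translation_error = root_aligned_joint_error(shifted, gt)

# One bad joint (id 3, off by 21 mm) is scored differently depending on
# whether the alignment uses a good joint or the bad joint itself.
one_bad = list(gt)
one_bad[3] = (3.0 + 21.0, 0.0, 0.0)
error_wrist_root = root_aligned_joint_error(one_bad, gt, root_joint=WRIST)
error_bad_root = root_aligned_joint_error(one_bad, gt, root_joint=3)
```

In this toy case the translation error is zero, aligning on the wrist charges the 21 mm offset to a single joint, and aligning on the faulty joint spreads the same offset over the other 20 joints. Results obtained with different `root_joint` settings are therefore not directly comparable.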

With root_joint set to 0 (wrist), you should get the following output:

joint mean error:
    left: 10.732769034802914 mm, right: 9.722338989377022 mm
    all: 10.227554012089968 mm
vert mean error:
    left: 10.479239746928215 mm, right: 9.52134095132351 mm
    all: 10.000290349125862 mm
pixel joint mean error:
    left: 6.329594612121582 mm, right: 5.843323707580566 mm
    all: 6.086459159851074 mm
pixel vert mean error:
    left: 6.235759735107422 mm, right: 5.768411636352539 mm
    all: 6.0020856857299805 mm
root error: 29.26051989197731 mm

(We fixed some minor bugs, so the performance is better than the values reported in the paper.)
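
When comparing your own run against the sample output, it helps to know how the `all` rows relate to the per-hand rows. In the numbers printed above, each `all` value equals the unweighted mean of the `left` and `right` values. The short check below confirms this for the four per-hand metrics; it only re-does arithmetic on the figures shown and makes no claim about how the evaluation script computes them internally.

```python
import math

# (left, right, all) triples copied from the sample evaluation output.
reported = {
    "joint mean error": (10.732769034802914, 9.722338989377022, 10.227554012089968),
    "vert mean error": (10.479239746928215, 9.52134095132351, 10.000290349125862),
    "pixel joint mean error": (6.329594612121582, 5.843323707580566, 6.086459159851074),
    "pixel vert mean error": (6.235759735107422, 5.768411636352539, 6.0020856857299805),
}

# For each metric, test whether `all` is the plain average of left and right.
consistent = {
    name: math.isclose((left + right) / 2, overall, rel_tol=1e-9)
    for name, (left, right, overall) in reported.items()
}

for name, ok in consistent.items():
    print(f"{name}: all == mean(left, right)? {ok}")
```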

:newspaper_roll: License

Distributed under the MIT License. See LICENSE for more information.

:raised_hands: Acknowledgements

The PyTorch implementation of MANO is based on manopth. We also use parts of the code from IntagHand. We thank the authors for their great work!
