GestureGAN
[ACM MM 2018 Oral] GestureGAN for Hand Gesture-to-Gesture Translation in the Wild
Contents
- GestureGAN for Controllable Image-to-Image Translation
- Installation
- Dataset Preparation
- Generating Images Using Pretrained Model
- Training New Models
- Testing
- Code Structure
- Evaluation
- Acknowledgments
- Related Projects
- Citation
- Contributions
- Collaborations
GestureGAN for the hand gesture-to-gesture translation task. Given an image and some novel hand skeletons, GestureGAN is able to generate the same person with different hand gestures.
GestureGAN for the cross-view image translation task. Given an image and some novel semantic maps, GestureGAN is able to generate the same scene from different viewpoints.
GestureGAN for Controllable Image-to-Image Translation
GestureGAN Framework

Comparison with State-of-the-Art Image-to-Image Translation Methods

Conference paper | Extended paper | Project page | Slides | Poster
GestureGAN for Hand Gesture-to-Gesture Translation in the Wild.<br> Hao Tang<sup>1</sup>, Wei Wang<sup>1,2</sup>, Dan Xu<sup>1,3</sup>, Yan Yan<sup>4</sup> and Nicu Sebe<sup>1</sup>. <br> <sup>1</sup>University of Trento, Italy, <sup>2</sup>EPFL, Switzerland, <sup>3</sup>University of Oxford, UK, <sup>4</sup>Texas State University, USA.<br> In ACM MM 2018 (Oral & Best Paper Candidate).<br> The repository offers the official implementation of our paper in PyTorch.
License
<a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png" /></a><br /> Copyright (C) 2018 University of Trento, Italy.
All rights reserved. Licensed under the CC BY-NC-SA 4.0 (Attribution-NonCommercial-ShareAlike 4.0 International)
The code is released for academic research use only. For commercial use, please contact bjdxtanghao@gmail.com.
Installation
Clone this repo.
git clone https://github.com/Ha0Tang/GestureGAN
cd GestureGAN/
This code requires PyTorch 0.4.1 and Python 3.6+. Please install dependencies by
pip install -r requirements.txt (for pip users)
or
./scripts/conda_deps.sh (for Conda users)
To reproduce the results reported in the paper, you would need two NVIDIA GeForce GTX 1080 Ti GPUs or two NVIDIA TITAN Xp GPUs.
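Before installing dependencies, it can help to confirm the environment matches the stated requirements (PyTorch 0.4.1, Python 3.6+). The sketch below is a hypothetical helper, not part of the repo; it parses version strings directly so the check itself needs no third-party packages.

```python
# Minimal environment check for the stated requirements
# (Python 3.6+ and PyTorch 0.4.1). Version strings are parsed
# rather than imported, so this runs without torch installed.

def meets_requirements(python_version, torch_version):
    """Return True if versions satisfy Python >= 3.6 and PyTorch == 0.4.1."""
    py = tuple(int(p) for p in python_version.split(".")[:2])
    pt = tuple(int(p) for p in torch_version.split(".")[:3])
    return py >= (3, 6) and pt == (0, 4, 1)

print(meets_requirements("3.6.9", "0.4.1"))   # True
print(meets_requirements("2.7.18", "0.4.1"))  # False
```

In a live environment you would pass `platform.python_version()` and `torch.__version__` to this helper.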
Dataset Preparation
For the hand gesture-to-gesture translation task, we use the NTU Hand Digit and Creative Senz3D datasets. For the cross-view image translation task, we use the Dayton and CVUSA datasets. These datasets must be downloaded beforehand; please download them from the respective webpages. In addition, we put a few sample images in this code repo. Please cite their papers if you use the data.
Preparing NTU Hand Digit Dataset. The dataset can be downloaded from this paper. After downloading it, we adopt OpenPose to generate hand skeletons and use them as training and testing data in our experiments. Note that we filter out failure cases in hand gesture estimation for training and testing. Please cite their papers if you use this dataset. Train/Test splits for the NTU Hand Digit dataset can be downloaded from here. Download the images and the corresponding extracted hand skeletons of this dataset:
bash ./datasets/download_gesturegan_dataset.sh ntu_image_skeleton
Then run the following MATLAB script to generate training and testing data:
cd datasets/
matlab -nodesktop -nosplash -r "prepare_ntu_data"
Preparing Creative Senz3D Dataset. The dataset can be downloaded here. After downloading it, we adopt OpenPose to generate hand skeletons and use them as training data in our experiments. Note that we filter out failure cases in hand gesture estimation for training and testing. Please cite their papers if you use this dataset. Train/Test splits for the Creative Senz3D dataset can be downloaded from here. Download the images and the corresponding extracted hand skeletons of this dataset:
bash ./datasets/download_gesturegan_dataset.sh senz3d_image_skeleton
Then run the following MATLAB script to generate training and testing data:
cd datasets/
matlab -nodesktop -nosplash -r "prepare_senz3d_data"
Preparing Dayton Dataset. The dataset can be downloaded here. In particular, you will need to download dayton.zip. Ground truth semantic maps are not available for this dataset, so we adopt RefineNet trained on the Cityscapes dataset to generate semantic maps and use them as training data in our experiments. Please cite their papers if you use this dataset. Train/Test splits for the Dayton dataset can be downloaded from here.
Preparing CVUSA Dataset. The dataset can be downloaded here, which is from this page. After unzipping the dataset, prepare the training and testing data as discussed in SelectionGAN. We also convert the semantic maps to color ones using this script. Since there are no semantic maps for the aerial images in this dataset, we use black images as aerial semantic maps for placeholder purposes.
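The label-to-color conversion mentioned above can be sketched as a simple palette lookup. This is a minimal illustration, not the repo's actual script, and the palette below is a hypothetical example rather than the real Cityscapes mapping.

```python
# Sketch of converting a label-ID semantic map to a color map.
# PALETTE is a hypothetical example mapping, not the repo's actual one.

PALETTE = {
    0: (0, 0, 0),        # unlabeled -> black
    1: (128, 64, 128),   # road
    2: (70, 70, 70),     # building
}

def colorize(label_map):
    """Map a 2-D grid of label IDs to RGB triples; unknown IDs fall back to black."""
    return [[PALETTE.get(label, (0, 0, 0)) for label in row] for row in label_map]

# A 2x2 label map; an all-zero map doubles as the black aerial placeholder.
print(colorize([[1, 2], [0, 9]]))
```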
Or you can directly download the prepared Dayton and CVUSA data from here.
Preparing Your Own Datasets. Each training sample in the dataset will contain {Ix, Iy, Cx, Cy}, where Ix is image x, Iy is image y, Cx is the controllable structure of image x, and Cy is the controllable structure of image y. Of course, you can use GestureGAN for your own datasets and tasks, such as landmark-guided facial expression translation and keypoint-guided person image generation.
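One way to assemble {Ix, Iy, Cx, Cy} samples for a custom dataset is to pair every ordered combination of images that share a subject with their extracted structure maps. The sketch below is an assumption about layout and naming (images as `.jpg`, structures as same-stem `.png` in a parallel directory), not the repo's actual data pipeline.

```python
# Hypothetical helper: build (Ix, Iy, Cx, Cy) path tuples by pairing
# images with their extracted structures (e.g. hand skeletons).
# Directory layout and file naming are assumptions for illustration.
import itertools
import os

def make_samples(image_dir, structure_dir):
    """Yield (Ix, Iy, Cx, Cy) tuples for every ordered pair of images
    whose stem has a matching structure file in structure_dir."""
    stems = sorted(
        os.path.splitext(f)[0]
        for f in os.listdir(image_dir)
        if os.path.exists(
            os.path.join(structure_dir, os.path.splitext(f)[0] + ".png")
        )
    )
    for x, y in itertools.permutations(stems, 2):
        yield (
            os.path.join(image_dir, x + ".jpg"),       # Ix
            os.path.join(image_dir, y + ".jpg"),       # Iy
            os.path.join(structure_dir, x + ".png"),   # Cx
            os.path.join(structure_dir, y + ".png"),   # Cy
        )
```

Images without a matching structure file are skipped, mirroring the filtering of pose-estimation failure cases described above.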
Generating Images Using Pretrained Model
Once the dataset is ready, result images can be generated using pretrained models.
- You can download a pretrained model (e.g. ntu_gesturegan_twocycle) with the following script:
bash ./scripts/download_gesturegan_model.sh ntu_gesturegan_twocycle
The pretrained model is saved at ./checkpoints/[type]_pretrained. Check here for all the available GestureGAN models.
- Generate images using the pretrained model.
python test.py --dataroot [path_to_dataset] \
--name [type]_pretrained \
--model [gesturegan_model] \
--which_model_netG resnet_9blocks \
--which_direction AtoB \
--dataset_mode aligned \
--norm instance \
--gpu_ids 0 \
--batchSize [BS] \
--loadSize [LS] \
--fineSize [FS] \
--no_flip
[path_to_dataset] is the path to the dataset. Dataset can be one of ntu, senz3d, dayton_a2g, dayton_g2a and cvusa. [type]_pretrained is the directory name of the checkpoint file downloaded in Step 1, which should be one of ntu_gesturegan_twocycle_pretrained, `senz3d_gesturegan_twocycle_pretra
