Hyperstyle
Official Implementation for "HyperStyle: StyleGAN Inversion with HyperNetworks for Real Image Editing" (CVPR 2022) https://arxiv.org/abs/2111.15666
HyperStyle: StyleGAN Inversion with HyperNetworks for Real Image Editing (CVPR 2022)
Yuval Alaluf*, Omer Tov*, Ron Mokady, Rinon Gal, Amit H. Bermano
*Denotes equal contribution

Abstract: The inversion of real images into StyleGAN's latent space is a well-studied problem. Nevertheless, applying existing approaches to real-world scenarios remains an open challenge, due to an inherent trade-off between reconstruction and editability: latent space regions which can accurately represent real images typically suffer from degraded semantic control. Recent work proposes to mitigate this trade-off by fine-tuning the generator to add the target image to well-behaved, editable regions of the latent space. While promising, this fine-tuning scheme is impractical for prevalent use as it requires a lengthy training phase for each new image. In this work, we introduce this approach into the realm of encoder-based inversion. We propose HyperStyle, a hypernetwork that learns to modulate StyleGAN's weights to faithfully express a given image in editable regions of the latent space. A naive modulation approach would require training a hypernetwork with over three billion parameters. Through careful network design, we reduce this to be in line with existing encoders. HyperStyle yields reconstructions comparable to those of optimization techniques with the near real-time inference capabilities of encoders. Lastly, we demonstrate HyperStyle's effectiveness on several applications beyond the inversion task, including the editing of out-of-domain images which were never seen during training.
<a href="https://arxiv.org/abs/2111.15666"><img src="https://img.shields.io/badge/arXiv-2111.15666-b31b1b.svg" height=22.5></a> <a href="https://opensource.org/licenses/MIT"><img src="https://img.shields.io/badge/License-MIT-yellow.svg" height=22.5></a>
<a href="https://youtu.be/_sbXmLY2jMw"><img src="https://img.shields.io/static/v1?label=CVPR 2022&message=5 Minute Video&color=red" height=22.5></a>
Inference Notebook: <a href="http://colab.research.google.com/github/yuval-alaluf/hyperstyle/blob/master/notebooks/inference_playground.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" height=20></a>
Animation Notebook: <a href="http://colab.research.google.com/github/yuval-alaluf/hyperstyle/blob/master/notebooks/animations_playground.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" height=20></a>
Domain Adaptation Notebook: <a href="http://colab.research.google.com/github/yuval-alaluf/hyperstyle/blob/master/notebooks/domain_adaptation_playground.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" height=20></a>
Description
Official Implementation of our HyperStyle paper for both training and evaluation. HyperStyle introduces a new approach for learning to efficiently modify a pretrained StyleGAN generator based on a given target image through the use of hypernetworks.
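To make the idea concrete, the core mechanism can be sketched as a toy example. This is illustrative only and not the repository's code: the function name, tensor shapes, and the per-channel update rule `w' = w * (1 + delta)` are assumptions about how a hypernetwork's predicted offsets might modulate a pretrained convolution weight.

```python
import numpy as np

def modulate_weights(weight, deltas):
    """Apply predicted per-output-channel offsets to a conv weight.

    weight: (out_ch, in_ch, k, k) pretrained generator weight
    deltas: (out_ch,) offsets predicted by the hypernetwork for this layer
    Returns the modulated weight w * (1 + delta), broadcast over each
    output channel's spatial and input dimensions.
    """
    return weight * (1.0 + deltas)[:, None, None, None]

rng = np.random.default_rng(0)
w = rng.standard_normal((8, 4, 3, 3))

# Predicting zero offsets leaves the pretrained generator unchanged,
# which is the natural starting point for refinement.
assert np.allclose(modulate_weights(w, np.zeros(8)), w)
```

Because the hypernetwork only predicts one offset per output channel rather than a value per weight, the number of predicted parameters stays far below the generator's full weight count.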
Table of Contents
- Getting Started
- Pretrained HyperStyle Models
- Training
- Inference
- Domain Adaptation
- Repository structure
- Related Works
- Credits
- Acknowledgments
- Citation
Getting Started
Prerequisites
- Linux or macOS
- NVIDIA GPU + CUDA CuDNN (CPU may be possible with some modifications, but is not inherently supported)
- Python 3
Installation
- Dependencies: We recommend running this repository using Anaconda.
All dependencies for defining the environment are provided in environment/hyperstyle_env.yaml.
Pretrained HyperStyle Models
In this repository, we provide pretrained HyperStyle models for various domains.
All models make use of a modified, pretrained e4e encoder for obtaining an initial inversion into the W latent space.
Please download the pretrained models from the following links.
| Path | Description |
| :--- | :---------- |
| Human Faces | HyperStyle trained on the FFHQ dataset. |
| Cars | HyperStyle trained on the Stanford Cars dataset. |
| Wild | HyperStyle trained on the AFHQ Wild dataset. |
Auxiliary Models
In addition, we provide various auxiliary models needed for training your own HyperStyle models from scratch.
These include the pretrained e4e encoders into W, pretrained StyleGAN2 generators, and models used for loss computation.
Pretrained W-Encoders
| Path | Description |
| :--- | :---------- |
| Faces W-Encoder | Pretrained e4e encoder trained on FFHQ into the W latent space. |
| Cars W-Encoder | Pretrained e4e encoder trained on Stanford Cars into the W latent space. |
| Wild W-Encoder | Pretrained e4e encoder trained on AFHQ Wild into the W latent space. |
StyleGAN2 Generators

| Path | Description |
| :--- | :---------- |
| FFHQ StyleGAN | StyleGAN2 model trained on FFHQ with 1024x1024 output resolution. |
| LSUN Car StyleGAN | StyleGAN2 model trained on LSUN Car with 512x384 output resolution. |
| AFHQ Wild StyleGAN | StyleGAN-ADA model trained on AFHQ Wild with 512x512 output resolution. |
| Toonify | Toonify generator from Doron Adler and Justin Pinkney converted to PyTorch using rosinality's conversion script, used in domain adaptation. |
| Pixar | Pixar generator from StyleGAN-NADA used in domain adaptation. |
Note: all StyleGAN models are converted from the official TensorFlow models to PyTorch using the conversion script from rosinality.
Other Utility Models

| Path | Description |
| :--- | :---------- |
| IR-SE50 Model | Pretrained IR-SE50 model taken from TreB1eN for use in our ID loss and encoder backbone on the human facial domain. |
| ResNet-34 Model | ResNet-34 model trained on ImageNet taken from torchvision for initializing our encoder backbone. |
| MoCov2 Model | Pretrained ResNet-50 model trained using MOCOv2 for computing MoCo-based loss on non-facial domains. The model is taken from the official implementation. |
| CurricularFace Backbone | Pretrained CurricularFace model taken from HuangYG123 for use in ID similarity metric computation. |
| MTCNN | Weights for MTCNN model taken from TreB1eN for use in ID similarity metric computation. (Unpack the tar.gz to extract the 3 model weights.) |
By default, we assume that all auxiliary models are downloaded and saved to the directory pretrained_models.
However, you may use your own paths by changing the necessary values in configs/paths_config.py.
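As an illustration of what such an override might look like, the sketch below resolves a checkpoint path, falling back to the default pretrained_models directory when no custom path is set. The key names, file names, and fallback naming convention are hypothetical placeholders, not the repository's actual entries; use the keys actually defined in the config file.

```python
import os

# Hypothetical path overrides; empty string means "use the default location".
model_paths = {
    'ir_se50': '',
    'stylegan_ffhq': '/my/models/stylegan2-ffhq.pt',
}

def resolve_model_path(name, default_dir='pretrained_models'):
    """Return the configured path for a model, or a default-directory fallback."""
    custom = model_paths.get(name, '')
    if custom:
        return custom
    # Assumed fallback convention: <default_dir>/<name>.pt
    return os.path.join(default_dir, name + '.pt')
```

A fallback of this kind keeps the default workflow (download everything into pretrained_models) working while still allowing per-model overrides.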
Training
Preparing your Data
In order to train HyperStyle on your own data, you should perform the following steps:
- Update configs/paths_config.py with the necessary data paths and model paths for training and inference.
dataset_paths = {
    'train_data': '/path/to/train/data',
'test_data': '/path/t
