ALAE
[CVPR2020] Adversarial Latent Autoencoders
Adversarial Latent Autoencoders<br> Stanislav Pidhorskyi, Donald Adjeroh, Gianfranco Doretto<br>
Abstract: Autoencoder networks are unsupervised approaches aiming at combining generative and representational properties by learning simultaneously an encoder-generator map. Although studied extensively, the issues of whether they have the same generative power of GANs, or learn disentangled representations, have not been fully addressed. We introduce an autoencoder that tackles these issues jointly, which we call Adversarial Latent Autoencoder (ALAE). It is a general architecture that can leverage recent improvements on GAN training procedures. We designed two autoencoders: one based on a MLP encoder, and another based on a StyleGAN generator, which we call StyleALAE. We verify the disentanglement properties of both architectures. We show that StyleALAE can not only generate 1024x1024 face images with comparable quality of StyleGAN, but at the same resolution can also produce face reconstructions and manipulations based on real images. This makes ALAE the first autoencoder able to compare with, and go beyond the capabilities of a generator-only type of architecture.
Citation
- Stanislav Pidhorskyi, Donald A. Adjeroh, and Gianfranco Doretto. Adversarial Latent Autoencoders. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2020. [to appear]
@InProceedings{pidhorskyi2020adversarial,
author = {Pidhorskyi, Stanislav and Adjeroh, Donald A and Doretto, Gianfranco},
booktitle = {Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR)},
title = {Adversarial Latent Autoencoders},
year = {2020},
note = {[to appear]},
}
<h4 align="center">preprint on arXiv: <a href="https://arxiv.org/abs/2004.04467">2004.04467</a></h4>
To run the demo
To run the demo, you need a CUDA-capable GPU, PyTorch >= v1.3.1, and CUDA/cuDNN drivers installed. Install the required packages:
pip install -r requirements.txt
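The prerequisites listed above can be checked before launching the demo. The following is a minimal sketch and is not part of the repository: `parse_version` and `check_environment` are hypothetical helper names, and the only PyTorch calls it relies on are `torch.__version__` and `torch.cuda.is_available()`.

```python
import re

MIN_TORCH = (1, 3, 1)  # minimum PyTorch version stated in this README


def parse_version(text):
    """Turn a version string such as '1.13.0+cu117' into a tuple of ints."""
    parts = []
    # Drop local build tags ('+cu117') and keep at most major.minor.patch.
    for piece in text.split("+")[0].split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


def check_environment():
    """Return 'OK' or a short description of the first problem found."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed"
    if parse_version(torch.__version__) < MIN_TORCH:
        return "PyTorch %s is older than 1.3.1" % torch.__version__
    if not torch.cuda.is_available():
        return "PyTorch cannot see a CUDA device"
    return "OK"


if __name__ == "__main__":
    print(check_environment())
```

The check only reports what PyTorch itself can see; it does not verify the cuDNN installation separately.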
Download pre-trained models:
python training_artifacts/download_all.py
Run the demo:
python interactive_demo.py
You can specify which yaml config to use. Configs are located here: https://github.com/podgorskiy/ALAE/tree/master/configs.
By default, the demo uses the config for the FFHQ dataset.
You can change the config with the -c parameter. To run on CelebA-HQ at 256x256 resolution, run:
python interactive_demo.py -c celeba-hq256
However, for configs other than FFHQ, you need to obtain new principal direction vectors for the attributes.
Repository organization
Running scripts
The code in the repository is organized in such a way that all scripts must be run from the root of the repository. If you use an IDE (e.g. PyCharm or Visual Studio Code), just set Working Directory to point to the root of the repository.
If you want to run from the command line, you also need to set the PYTHONPATH variable to point to the root of the repository.
For example, if the repository is cloned to the ~/ALAE directory, run:
$ cd ~/ALAE
$ export PYTHONPATH=$PYTHONPATH:$(pwd)
Now you can run scripts as follows:
$ python style_mixing/stylemix.py
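The same setup can be reproduced from Python, for instance when launching scripts from another tool. This is a sketch under assumptions, not code from the repository: `env_with_repo_on_path` and `run_from_root` are hypothetical helper names, and the clone location is whatever path you pass in.

```python
import os
import subprocess
import sys


def env_with_repo_on_path(repo_root, base_env=None):
    """Return a copy of the environment with repo_root appended to PYTHONPATH."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("PYTHONPATH", "")
    # Mirrors: export PYTHONPATH=$PYTHONPATH:$(pwd)
    env["PYTHONPATH"] = existing + os.pathsep + repo_root if existing else repo_root
    return env


def run_from_root(repo_root, script, *args):
    """Run a repository script with the repository root as working directory."""
    repo_root = os.path.abspath(os.path.expanduser(repo_root))
    cmd = [sys.executable, script, *args]
    return subprocess.run(cmd, cwd=repo_root, env=env_with_repo_on_path(repo_root))
```

With the repository cloned to ~/ALAE, calling `run_from_root("~/ALAE", "style_mixing/stylemix.py")` would correspond to the shell commands above.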
Repository structure
| Path | Description
| :--- | :----------
| ALAE | Repository root folder
| ├ configs | Folder with yaml config files.
| │ ├ bedroom.yaml | Config file for LSUN bedroom dataset at 256x256 resolution.
| │ ├ celeba.yaml | Config file for CelebA dataset at 128x128 resolution.
| │ ├ celeba-hq256.yaml | Config file for CelebA-HQ dataset at 256x256 resolution.
| │ ├ celeba_ablation_nostyle.yaml | Config file for CelebA 128x128 dataset for ablation study (no styles).
| │ ├ celeba_ablation_separate.yaml | Config file for CelebA 128x128 dataset for ablation study (separate encoder and discriminator).
| │ ├ celeba_ablation_z_reg.yaml | Config file for CelebA 128x128 dataset for ablation study (regress in Z space, not W).
| │ ├ ffhq.yaml | Config file for FFHQ dataset at 1024x1024 resolution.
| │ ├ mnist.yaml | Config file for MNIST dataset using Style architecture.
| │ └ mnist_fc.yaml | Config file for MNIST dataset using only fully connected layers (Permutation Invariant MNIST).
| ├ dataset_preparation | Folder with scripts for dataset preparation.
| │ ├ prepare_celeba_hq_tfrec.py | To prepare TFRecords for CelebA-HQ dataset at 256x256 resolution.
| │ ├ prepare_celeba_tfrec.py | To prepare TFRecords for CelebA dataset at 128x128 resolution.
| │ ├ prepare_mnist_tfrec.py | To prepare TFRecords for MNIST dataset.
| │ ├ split_tfrecords_bedroom.py | To split official TFRecords from StyleGAN paper for LSUN bedroom dataset.
| │ └ split_tfrecords_ffhq.py | To split official TFRecords from StyleGAN paper for FFHQ dataset.
| ├ dataset_samples | Folder with sample inputs for different datasets. Used for figures and for test inputs during training.
| ├ make_figures | Scripts for making various figures.
| ├ metrics | Scripts for computing metrics.
| ├ principal_directions | Scripts for computing principal direction vectors for various attributes. For interactive demo.
| ├ style_mixing | Sample inputs and script for producing style-mixing figures.
| ├ training_artifacts | Default place for saving checkpoints/sample outputs/plots.
| │ └ download_all.py | Script for downloading all pretrained models.
| ├ interactive_demo.py | Runnable script for interactive demo.
| ├ train_alae.py | Runnable script for training.
| ├ train_alae_separate.py | Runnable script for training for ablation study (separate encoder and discriminator).
| ├ checkpointer.py | Module for saving/restoring model weights, optimizer state and loss history.
| ├ custom_adam.py | Customized Adam optimizer for learning rate equalization and zero second beta.
| ├ dataloader.py | Module with dataset classes, loaders, iterators, etc.
| ├ defaults.py | Definition for config variables with default values.
| ├ launcher.py | Helper for running multi-GPU, multiprocess training. Sets up config and logging.
| ├ lod_driver.py | Helper class for managing growing/stabilizing network.
| ├ lreq.py | Custom Linear, Conv2d and ConvTranspose2d modules for learning rate equalization.
| ├ model.py | Module with high-level model definition.
| ├ model_separate.py | Same as above, but for ablation study.
| ├ net.py | Definition of all network blocks for multiple architectures.
| ├ registry.py | Registry of network blocks for selecting from config file.
| ├ scheduler.py | Custom schedulers with warm start and aggregating several optimizers.
| ├ tracker.py | Module for plotting losses.
| └ utils.py | Decorator for async call, decorator for caching, registry for network blocks.
Configs
This codebase uses yacs to handle configurations.
Most of the runnable scripts accept a -c parameter that specifies the config file to use.
For example, to make reconstruction figures, you can run:
python make_figures/make_recon_figure_paged.py
python make_figures/make_recon_figure_paged.py -c celeba
python make_figures/make_recon_figure_paged.py -c celeba-hq256
python make_figures/make_recon_figure_paged.py -c bedroom
The default config is ffhq.
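To illustrate how a short name passed with -c could map to a file in the configs folder, here is a minimal sketch. It is an assumption about the behavior, not the repository's actual launcher code: `resolve_config`, `parse_args`, and the `--config` long option are hypothetical, and the name-to-path rule (append `.yaml`, look in `configs/`) is inferred from the commands above.

```python
import argparse
import os

CONFIG_DIR = "configs"    # folder listed in the repository structure table
DEFAULT_CONFIG = "ffhq"   # default named in this README


def resolve_config(name, config_dir=CONFIG_DIR):
    """Map a short name such as 'celeba-hq256' to 'configs/celeba-hq256.yaml'."""
    if not os.path.splitext(name)[1]:
        name += ".yaml"          # assumed: short names omit the extension
    if os.path.dirname(name):
        return name              # an explicit path is used unchanged
    return os.path.join(config_dir, name)


def parse_args(argv=None):
    """Parse a -c option with the default config described above."""
    parser = argparse.ArgumentParser(description="Hypothetical -c handling")
    parser.add_argument("-c", "--config", default=DEFAULT_CONFIG,
                        help="config name or path")
    return parser.parse_args(argv)


if __name__ == "__main__":
    print(resolve_config(parse_args().config))
```

The resolved path is the kind of value that yacs accepts in `CfgNode.merge_from_file`, which overlays the yaml file on top of default values.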
Datasets
Training is done using TFRecords. TFRecords are read using DareBlopy, which allows using them with PyTorch.
In config files as well as in all preparation scripts, it is assumed that all datasets are in /data/datasets/. You can either change the path in the config files or create a symlink to where you store the datasets.
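The symlink option can be sketched as follows. This is an illustrative helper, not part of the repository: `link_datasets` is a hypothetical name, and the storage location you pass in is your own. Creating a link under /data typically requires write permission there.

```python
import os


def link_datasets(actual_dir, expected_dir="/data/datasets"):
    """Make expected_dir a symlink to actual_dir; return False if it already exists."""
    actual_dir = os.path.abspath(os.path.expanduser(actual_dir))
    expected_dir = os.path.normpath(expected_dir)  # drops a trailing slash
    if not os.path.isdir(actual_dir):
        raise FileNotFoundError("no such dataset directory: " + actual_dir)
    if os.path.lexists(expected_dir):
        return False  # leave an existing directory or link untouched
    parent = os.path.dirname(expected_dir)
    if parent:
        os.makedirs(parent, exist_ok=True)
    os.symlink(actual_dir, expected_dir, target_is_directory=True)
    return True
```

For example, `link_datasets("~/storage/datasets")` would make the default path point at a directory in your home folder, so the config files can stay unchanged.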
The official way of generating CelebA-HQ can be challenging. Please refer to this page: https://github.com/suvojit-0x55aa/celebA-HQ-dataset-download You can get the pre-generated dataset from: https:
