Pix2latent
Code for: Transforming and Projecting Images into Class-conditional Generative Networks
Install / Use
/learn @minyoungg/Pix2latentREADME
pix2latent: framework for inverting images into generative models
Framework for inverting images. Codebase used in:
Transforming and Projecting Images into Class-conditional Generative Networks
project page | paper
Minyoung Huh Richard Zhang Jun-Yan Zhu Sylvain Paris Aaron Hertzmann
MIT CSAIL Adobe Research
ECCV 2020 (oral)

@inproceedings{huh2020ganprojection,
title = {Transforming and Projecting Images to Class-conditional Generative Networks},
author = {Minyoung Huh and Richard Zhang, Jun-Yan Zhu and Sylvain Paris and Aaron Hertzmann},
booktitle = {ECCV},
year = {2020}
}
NOTE [8/25/20] The codebase has been renamed from GAN-Transform-and-Project to pix2latent, and also refactored to make it easier to use and extend to any generative model beyond BigGAN. To access the original codebase refer to the legacy branch.
Example results
<b>All results below are without fine-tuning.</b>
<p align="center"><b> BigGAN (z-space) - ImageNet (256x256) </b></p>


Prerequisites
The code was developed on
- Ubuntu 18.04
- Python 3.7
- PyTorch 1.4.0
Getting Started
-
<b>Install PyTorch</b>
Install the correct PyTorch version for your machine -
<b>Install the python dependencies</b>
Install the remaining dependencies viapip install -r requirements.txt -
<b>Install pix2latent</b>
git clone https://github.com/minyoungg/pix2latent cd pix2latent pip install .
Examples
We provide several demo codes in ./examples/ for both BigGAN and StyleGAN2. Note that the codebase has been tuned and developed on BigGAN.
> cd examples
> python invert_biggan_adam.py --num_samples 4
Using the make_video flag will save the optimization trajectory as a video.
> python invert_biggan_adam.py --make_video --num_samples 4
(slow) To optimize with CMA-ES or BasinCMA, we use PyCMA. Note that the PyCMA version of CMA-ES has a predefined number of samples to jointly evaluate (18 for BigGAN) and (22 for StyleGAN2).
> python invert_biggan_cma.py
> python invert_biggan_basincma.py
(fast) Alternatively CMA-ES in Nevergrad provides sample parallelization so you can set your own number of samples. Although this runs faster, we have observed the performance to be slightly worse. (warning: performance depends on num_samples).
> python invert_biggan_nevergrad.py --ng_method CMA --num_samples 4
> python invert_biggan_hybrid_nevergrad.py --ng_method CMA --num_samples 4
Same applies to StyleGAN2. See ./examples/ for extensive list of examples.
Template pseudocode
import torch, torch.nn as nn
import pix2latent.VariableManger
from pix2latent.optimizer import GradientOptimizer
# load your favorite model
class Generator(nn.Module):
...
def forward(self, z):
...
return im
model = Generator()
# define your loss objective .. or use the predefined loss functions in pix2latent.loss_functions
loss_fn = lambda out, target: (target - out).abs().mean()
# tell the optimizer what the input-output relationship is
vm = VariableManager()
vm.register(variable_name='z', shape=(128,), var_type='input')
vm.register(variable_name='target', shape(3, 256, 256), var_type='output')
# setup optimizer
opt = GradientOptimizer(model, vm, loss_fn)
# optimize
vars, out, loss = opt.optimize(num_samples=1, grad_steps=500)
detailed usage
pix2latent
| Command | Description |
| --- | --- |
| pix2latent.loss_function | predefined loss functions |
| pix2latent.distribution | distribution functions used to initialize variables |
pix2latent.VariableManger
class variable for managing variables. variable manager instance is initialized by
var_man = VariableManager()
| Method | Description |
| --- | --- |
| var_man.register(...) | registers variable. this variable is created when initialize is called |
| var_man.unregister(...) | removes a variable that is already registered |
| var_man.edit_variable(...) | edits existing variable |
| var_man.initialize(...) | initializes variable from defined specification |
pix2latent.optimizer
| Command | Description |
| --- | --- |
| pix2latent.optimizer.GradientOptimizer | gradient-based optimizer. defaults to optimizer defined in pix2latent.VariableManager|
| pix2latent.optimizer.CMAOptimizer | uses CMA optimization to search over latent variables z|
| pix2latent.optimizer.BasinCMAOptimizer | uses BasinCMA optimization. a combination of CMA and gradient-based optimization|
| pix2latent.optimizer.NevergradOptimizer | uses Nevergrad library for optimization. supports most gradient-free optimization method implemented in Nevergrad |
| pix2latent.optimizer.HybridNevergradOptimizer | uses hybrid optimization by alternating gradient and gradient-free optimization provided by Nevergrad|
pix2latent.transform
| Command | Description |
| --- | --- |
| pix2latent.SpatialTransform | spatial transformation function, used to optimize for image scale and position |
| pix2latent.TransformBasinCMAOptimizer | BasinCMA-like optimization method used to search for image transformation |
pix2latent.util
| Command | Description |
| --- | --- |
| pix2latent.util.image | utility for image pre and post processing |
| pix2latent.util.video | utility for video (e.g. saving videos) |
| pix2latent.util.misc | miscellaneous functions |
| pix2latent.util.function_hooks | function hooks that can be attached to variables in the optimization loop. (e.g. Clamp, Perturb) |
pix2latent.model
| Command | Description |
| --- | --- |
| pix2latent.model.BigGAN | BigGAN model wrapper. Uses implementation by huggingface using the official weights|
| pix2latent.model.StyleGAN2 | StyleGAN2 model wrapper. Uses PyTorch implementation by rosinality using the official weights|
pix2latent.edit
| Command | Description |
| --- | --- |
| pix2latent.edit.BigGANLatentEditor | BigGAN editor. Simple interface to edit class and latent variables using oversimplified version of GANSpace |
Related Skills
node-connect
339.5kDiagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps
frontend-design
83.9kCreate distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.
openai-whisper-api
339.5kTranscribe audio via OpenAI Audio Transcriptions API (Whisper).
commit-push-pr
83.9kCommit, push, and open a PR
