Uvcgan2

UVCGAN v2: An Improved Cycle-Consistent GAN for Unpaired Image-to-Image Translation

Generate Convert Improve

Install / Use

/learn @LS4GAN/Uvcgan2

About this skill

Quality Score

0/100

README

UVCGAN v2: An Improved Cycle-Consistent GAN for Unpaired Image-to-Image Translation

Samples of Male to Female (Celeba-HQ), Wildlife to Cat (AFHQ), and Cat to Dog (AFHQ) translations obtained with UVCGANv2

This package provides reference implementation of the UVCGAN v2: An Improved Cycle-Consistent GAN for Unpaired Image-to-Image Translation [paper][uvcgan2_paper].

uvcgan2 builds upon the CycleGAN method for unpaired image-to-image transfer and improves its performance by modifying the generator, discriminator, and the training procedure.

This README file provides brief instructions about how to set up the uvcgan2 package and reproduce the paper results. To further facilitate the reproducibility we share the pre-trained models (c.f. section Pre-trained models)

The code of uvcgan2 is based on [pytorch-CycleGAN-and-pix2pix][cyclegan_repo] and [uvcgan][uvcgan_repo]. Please refer to the LICENSE section for the proper copyright attribution.

UPDATE (2023-09-22): Changed the arxiv preprint title:

from: ~"Rethinking CycleGAN: Improving Quality of GANs for Unpaired Image-to-Image Translation"~
to: "UVCGAN v2: An Improved Cycle-Consistent GAN for Unpaired Image-to-Image Translation Overview"

Applying UVCGANv2 to Your Dataset

This README file mainly describes the reproduction of the Rethinking CycleGAN [paper][uvcgan2_paper] results. If you would like to apply the uvcgan2 to some other dataset, please check out our accompanying repository [uvcgan4slats][uvcgan4slats]. It describes an application of uvcgan to a generic scientific dataset.

In short, the procedure to adapt the uvcgan2 to your problem is as follows:

Arrange your dataset to the format, similar to CelebA-HQ and AFHQ. For reference, the format of the CelebA-HQ directory is:

    CelebA-HQ/          # Name of the dataset
        train/
            male/       # Name of the first domain
            female/     # Name of the second domain
        val/
            male/
            female/

where the directories named male/ and female/ store the corresponding images. Arrange your dataset into a similar form, but choose appropriate names for the dataset directory and data domains.

Next, take an existing training script as a starting point. For instance, this one should work

scripts/celeba_hq/train_m2f_translation.py

The script contains a training configuration in the args_dict dictionary. The dictionary format should be rather self-explanatory. Modify the following parameters of the args_dict:

Modify data configuration to match your dataset.
Modify outdir parameter and set it to the path, where you want the output to be saved.
Modify transfer parameter and set it to None. Alternatively, check our [uvcgan4slats][uvcgan4slats] repository, if you want to pretrain the generators on a pretext task.

Use the instructions below to perform the model evaluation.

Installation & Requirements

Requirements

uvcgan2 models were trained under the official pytorch container pytorch/pytorch:1.12.1-cuda11.3-cudnn8-runtime. A similar training environment can be constructed with conda

conda env create -f contrib/conda_env.yaml

The created conda environment can be activated with

conda activate uvcgan2

Installation

To install the uvcgan2 package one can simply run the following command

python3 setup.py develop --user

from the uvcgan2 source tree.

Environment Setup

By default, uvcgan2 will try to read datasets from the ./data directory and will save trained models under the ./outdir directory. If you would like to change this default behavior, set the two environment variables UVCGAN2_DATA and UVCGAN2_OUTDIR to the desired paths.

For instance, on UNIX-like system (Linux, MacOS) these variables can be set with:

export UVCGAN2_DATA=PATH_WHERE_DATA_IS_SAVED
export UVCGAN2_OUTDIR=PATH_TO_SAVE_MODELS_TO

UVCGANv2 Reproduction

To reproduce the results of the paper, the following workflow is suggested:

Download datasets (selfie2anime, celeba, celeba_hq, afhq).
Pre-process high-quality datasets.
Pre-train generators on an Inpainting pretext task.
Train CycleGAN models.
Generate translated images and evaluate KID/FID scores.

0. Pre-trained models

We provide pre-trained generators that were used to obtain the Rethinking CycleGAN [paper][uvcgan2_paper] results. They can be found on [Zenodo][pretrained_models].

uvcgan2 supplies a script ./scripts/download_model.sh to download the pre-trained models, e.g.

./scripts/download_model.sh afhq_cat2dog

The downloaded models will be unpacked under the ${UVCGAN_OUTDIR} with the default path as ./outdir.

1. Download Datasets

uvcgan2 provides a script (scripts/download_dataset.sh) to download and unpack various CycleGAN datasets.

NOTE: As of June 2023, the CelebA datasets (male2female and glasses) need to be recreated manually. Please refer to celeba4cyclegan for instructions on how to do that.

For example, one can use the following commands to download selfie2anime, CelebA male2female, CelebA eyeglasses, CelebA-HQ, and AFHQ datasets:

./scripts/download_dataset.sh selfie2anime
./scripts/download_dataset.sh male2female
./scripts/download_dataset.sh glasses
./scripts/download_dataset.sh celeba_all    # Low-resolution CelebA
./scripts/download_dataset.sh celeba_hq
./scripts/download_dataset.sh afhq

The downloaded datasets will be unpacked under the UVCGAN2_DATA directory (or ./data if UVCGAN2_DATA is unset).

2. Pre-processing High-Quality Datasets

The images of the high-quality datasets CelebA-HQ and AFHQ have sizes of 1024x1024 and 512x512 pixels correspondingly. For the training and evaluation, however, we have relied on images of size 256x256. The script scripts/downsize_right.py can be used to properly resize the images:

python3 ./scripts/downsize_right.py -s 256 256 -i lanczos "${UVCGAN2_DATA:-./data}/afhq/"      "${UVCGAN2_DATA:-./data}/afhq_resized_lanczos"
python3 ./scripts/downsize_right.py -s 256 256 -i lanczos "${UVCGAN2_DATA:-./data}/celeba_hq/" "${UVCGAN2_DATA:-./data}/celeba_hq_resized_lanczos"

3. Generator Pre-training

Once the datasets are ready, the next step is to pre-train generators on the Inpainting pretext task. uvcgan2 provides pre-training scripts for all the datasets:

scripts/afhq/pretrain_afhq.py
scripts/anime2selfie/pretrain_anime2selfie.py
scripts/celeba/pretrain_celeba.py
scripts/celeba_hq/pretrain_celebahq.py

These scripts can be simply run like

python3 scripts/afhq/pretrain_afhq.py

Optionally, they accept some command line arguments. For instance, the batch size can be adjusted by:

python3 scripts/afhq/pretrain_afhq.py --batch-size 8

More details can be found by looking over the scripts. Each of them contains a training configuration, which should be self-explanatory.

When the training is finished, the pre-trained generators will be saved under the ${UVCGAN2_OUTDIR} directory.

4. Image-to-Image Translation Training

For each of the translation directions, we provide a corresponding image translation training script:

scripts/afhq/train_cat2dog_translation.py
scripts/afhq/train_wild2cat_translation.py
scripts/afhq/train_wild2dog_translation.py
scripts/anime2selfie/train_anime2selfie_translation.py
scripts/celeba/train_celeba_glasses_translation.py
scripts/celeba/train_celeba_male2female_translation.py
scripts/celeba_hq/train_m2f_translation.py

Similar to the pre-training scripts, they can be simply run by

python3 scripts/afhq/train_cat2dog_translation.py

The trained models will be saved under the "${UVCGAN_OUTDIR}" directory.

5. Evaluation of the trained models

5.1 Image Translation

uvcgan2 provides a script scripts/translate_images.py to perform a batch translation of the images via one of the trained models. The script can be run as

python3 scripts/translate_images.py PATH_TO_TRAINED_MODEL --split SPLIT

where SPLIT is the split (train, val or test) of the data to translate. Due to how the datasets are constructed, one should use test split for the anime2selfie and CelebA datasets, and val split for the CelebA-HQ and AFHQ datasets.

The translated images will be saved under PATH_TO_TRAINED_MODEL/evals/final/images_eval-SPLIT.

5.2 Evaluation of the Quality of Translation

Rethinking CycleGAN paper describes two ways to evaluate the quality of translation:

Consistent protocol. Uniform across all datasets.
Ad-hoc protocols for CelebA-HQ and AFHQ.

5.2.1 Consistent Evaluation of the Quality of Translation

The consistent evaluation protocol relies on torch_fidelity (commit 5f7c5b5ccc4128bd79be2fdd8e75f118aa8fdc7c) to calculate KID/FID metrics of the translated images.

A helper script scripts/eval_fid.py is provided to facilitate such a calculation. It can be run with

python3 scripts/eval_fid.py `PATH_TO_TRAINED_MODEL/evals/final/images_eval-SPLIT` --kid-size KID_SIZE

where KID_SIZE is the parameter of the KID calculation algorithm. Its value depends on the dataset and should be set to match the Rethinking CycleGAN paper (c.f. Se

Related Skills

node-connect

354.5k

Diagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps

frontend-design

112.4k

Create distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.

openai-whisper-api

354.5k

Transcribe audio via OpenAI Audio Transcriptions API (Whisper).

qqbot-media

354.5k

QQBot 富媒体收发能力。使用 <qqmedia> 标签，系统根据文件扩展名自动识别类型（图片/语音/视频/文件）。