Uvcgan2
UVCGAN v2: An Improved Cycle-Consistent GAN for Unpaired Image-to-Image Translation
<p align="center"> <img src="https://github.com/LS4GAN/gallery/blob/main/uvcgan2/animations/male2female_v2.webp" width="95%" title="male to female translation with CelebA-HQ"> </p>
<p align="center"> <img src="https://github.com/LS4GAN/gallery/blob/main/uvcgan2/animations/wild2cat_v2.webp" width="95%" title="wild to cat translation with AFHQ"> </p>
<p align="center"> <img src="https://github.com/LS4GAN/gallery/blob/main/uvcgan2/animations/cat2dog_v2.webp" width="95%" title="cat to dog translation with AFHQ"> </p>

Samples of Male to Female (CelebA-HQ), Wildlife to Cat (AFHQ), and Cat to Dog (AFHQ) translations obtained with UVCGANv2.
This package provides a reference implementation of the "UVCGAN v2: An Improved Cycle-Consistent GAN for Unpaired Image-to-Image Translation" [paper][uvcgan2_paper].
uvcgan2 builds upon the CycleGAN method for unpaired image-to-image translation
and improves its performance by modifying the generator, discriminator, and the
training procedure.
This README file provides brief instructions on how to set up the uvcgan2
package and reproduce the paper results. To further facilitate
reproducibility, we share the pre-trained models
(cf. section Pre-trained models).
The code of uvcgan2 is based on [pytorch-CycleGAN-and-pix2pix][cyclegan_repo]
and [uvcgan][uvcgan_repo]. Please refer to the LICENSE section for the proper
copyright attribution.
UPDATE (2023-09-22): Changed the arxiv preprint title:
- from: ~"Rethinking CycleGAN: Improving Quality of GANs for Unpaired Image-to-Image Translation"~
- to: "UVCGAN v2: An Improved Cycle-Consistent GAN for Unpaired Image-to-Image Translation"
Applying UVCGANv2 to Your Dataset
This README file mainly describes how to reproduce the results of the Rethinking CycleGAN
[paper][uvcgan2_paper]. If you would like to apply uvcgan2 to
another dataset, please check out our accompanying repository
[uvcgan4slats][uvcgan4slats]. It describes an application of uvcgan to a
generic scientific dataset.
In short, the procedure to adapt uvcgan2 to your problem is as follows:
- Arrange your dataset in a format similar to that of CelebA-HQ and AFHQ. For reference, the CelebA-HQ directory is laid out as:

      CelebA-HQ/          # Name of the dataset
          train/
              male/       # Name of the first domain
              female/     # Name of the second domain
          val/
              male/
              female/

  where the directories named male/ and female/ store the corresponding
  images. Arrange your dataset into a similar form, but choose appropriate
  names for the dataset directory and data domains.
- Next, take an existing training script as a starting point. For instance, this one should work:

      scripts/celeba_hq/train_m2f_translation.py

  The script contains a training configuration in the args_dict
  dictionary. The dictionary format should be largely self-explanatory.
  Adjust the following parameters of args_dict:
  - Modify the data configuration to match your dataset.
  - Modify the outdir parameter and set it to the path where you want the output to be saved.
  - Modify the transfer parameter and set it to None. Alternatively, check our [uvcgan4slats][uvcgan4slats] repository if you want to pretrain the generators on a pretext task.
- Use the instructions below to perform the model evaluation.
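The directory layout from the first step can be produced with a few lines of Python. The following is a minimal sketch, not part of uvcgan2: the function name arrange_dataset, the validation fraction, and the list of image suffixes are all assumptions made for illustration. It copies images from one source directory per domain into the train/ and val/ structure shown above.

```python
import random
import shutil
from pathlib import Path

# Assumption: which file suffixes count as images. Adjust for your data.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def arrange_dataset(sources, root, val_fraction=0.1, seed=0):
    """Copy images into root/{train,val}/<domain>/.

    sources maps a domain name (e.g. "male") to a directory of images.
    This helper is hypothetical and is not provided by uvcgan2.
    """
    rng = random.Random(seed)
    root = Path(root)

    for domain, src_dir in sources.items():
        # Sort first so that the shuffle is reproducible for a given seed.
        images = sorted(
            p for p in Path(src_dir).iterdir()
            if p.suffix.lower() in IMAGE_SUFFIXES
        )
        rng.shuffle(images)
        n_val = int(len(images) * val_fraction)

        for index, image in enumerate(images):
            split = "val" if index < n_val else "train"
            dst_dir = root / split / domain
            dst_dir.mkdir(parents=True, exist_ok=True)
            shutil.copy2(image, dst_dir / image.name)
```

For example, arrange_dataset({"domain_a": "raw/a", "domain_b": "raw/b"}, "data/my_dataset") would create data/my_dataset/train/domain_a/, data/my_dataset/val/domain_a/, and so on. The domain and dataset names here are placeholders.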
Installation & Requirements
Requirements
uvcgan2 models were trained under the official PyTorch container
pytorch/pytorch:1.12.1-cuda11.3-cudnn8-runtime. A similar training
environment can be constructed with conda:
conda env create -f contrib/conda_env.yaml
The created conda environment can be activated with
conda activate uvcgan2
Installation
To install the uvcgan2 package, run the following command from the root of the uvcgan2 source tree:
python3 setup.py develop --user
Environment Setup
By default, uvcgan2 will try to read datasets from the ./data directory
and will save trained models under the ./outdir directory. If you would
like to change this default behavior, set the two environment variables
UVCGAN2_DATA and UVCGAN2_OUTDIR to the desired paths.
For instance, on a UNIX-like system (Linux, macOS) these variables can be set with:
export UVCGAN2_DATA=PATH_WHERE_DATA_IS_SAVED
export UVCGAN2_OUTDIR=PATH_TO_SAVE_MODELS_TO
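When writing your own helper scripts around uvcgan2, it can be handy to resolve the same two locations. The sketch below mirrors the behavior described above (environment variable if set, otherwise the default). The helper name resolve_dir is an assumption, and the package's internal lookup may differ in detail.

```python
import os
from pathlib import Path


def resolve_dir(env_var, default):
    """Return the path stored in env_var, or default when it is unset or empty."""
    # Hypothetical helper: an empty value is treated the same as an unset one.
    return Path(os.environ.get(env_var) or default)


data_dir = resolve_dir("UVCGAN2_DATA", "./data")
out_dir = resolve_dir("UVCGAN2_OUTDIR", "./outdir")
print(f"datasets: {data_dir}")
print(f"models:   {out_dir}")
```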
UVCGANv2 Reproduction
To reproduce the results of the paper, the following workflow is suggested:
- Download datasets (selfie2anime, celeba, celeba_hq, afhq).
- Pre-process high-quality datasets.
- Pre-train generators on an Inpainting pretext task.
- Train CycleGAN models.
- Generate translated images and evaluate KID/FID scores.
0. Pre-trained models
We provide pre-trained generators that were used to obtain the Rethinking CycleGAN [paper][uvcgan2_paper] results.
They can be found on [Zenodo][pretrained_models].
uvcgan2 supplies a script ./scripts/download_model.sh to download
the pre-trained models, e.g.
./scripts/download_model.sh afhq_cat2dog
The downloaded models will be unpacked under the ${UVCGAN2_OUTDIR} directory (./outdir by default).
1. Download Datasets
uvcgan2 provides a script (scripts/download_dataset.sh) to download and
unpack various CycleGAN datasets.
NOTE: As of June 2023, the CelebA datasets (male2female and glasses)
need to be recreated manually. Please refer to
celeba4cyclegan for instructions
on how to do that.
For example, one can use the following commands to download selfie2anime,
CelebA male2female, CelebA eyeglasses, CelebA-HQ, and AFHQ datasets:
./scripts/download_dataset.sh selfie2anime
./scripts/download_dataset.sh male2female
./scripts/download_dataset.sh glasses
./scripts/download_dataset.sh celeba_all # Low-resolution CelebA
./scripts/download_dataset.sh celeba_hq
./scripts/download_dataset.sh afhq
The downloaded datasets will be unpacked under the UVCGAN2_DATA directory
(or ./data if UVCGAN2_DATA is unset).
2. Pre-processing High-Quality Datasets
The images of the high-quality datasets CelebA-HQ and AFHQ have sizes
of 1024x1024 and 512x512 pixels, respectively. For training and
evaluation, however, we relied on images of size 256x256. The script
scripts/downsize_right.py can be used to properly resize the images:
python3 ./scripts/downsize_right.py -s 256 256 -i lanczos "${UVCGAN2_DATA:-./data}/afhq/" "${UVCGAN2_DATA:-./data}/afhq_resized_lanczos"
python3 ./scripts/downsize_right.py -s 256 256 -i lanczos "${UVCGAN2_DATA:-./data}/celeba_hq/" "${UVCGAN2_DATA:-./data}/celeba_hq_resized_lanczos"
3. Generator Pre-training
Once the datasets are ready, the next step is to pre-train generators on the
Inpainting pretext task. uvcgan2 provides pre-training scripts for all
the datasets:
scripts/afhq/pretrain_afhq.py
scripts/anime2selfie/pretrain_anime2selfie.py
scripts/celeba/pretrain_celeba.py
scripts/celeba_hq/pretrain_celebahq.py
Each script can be run directly, for example:
python3 scripts/afhq/pretrain_afhq.py
Optionally, they accept some command line arguments. For instance, the batch size can be adjusted with:
python3 scripts/afhq/pretrain_afhq.py --batch-size 8
More details can be found in the scripts themselves. Each of them contains a training configuration, which should be largely self-explanatory.
When the training is finished, the pre-trained generators will be saved under
the ${UVCGAN2_OUTDIR} directory.
4. Image-to-Image Translation Training
For each of the translation directions, we provide a corresponding image translation training script:
scripts/afhq/train_cat2dog_translation.py
scripts/afhq/train_wild2cat_translation.py
scripts/afhq/train_wild2dog_translation.py
scripts/anime2selfie/train_anime2selfie_translation.py
scripts/celeba/train_celeba_glasses_translation.py
scripts/celeba/train_celeba_male2female_translation.py
scripts/celeba_hq/train_m2f_translation.py
Similar to the pre-training scripts, they can be run directly, for example:
python3 scripts/afhq/train_cat2dog_translation.py
The trained models will be saved under the ${UVCGAN2_OUTDIR} directory.
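Since the workflow pre-trains the generators before the translation training, the two stages can be chained in a small driver. The following is a sketch under assumptions: it is run from the root of the source tree, the helper names build_commands and run_all are invented for illustration, and the --batch-size flag is passed only to the pre-training script, where it is documented above. By default it only prints the commands.

```python
import subprocess
import sys

PRETRAIN_SCRIPT = "scripts/afhq/pretrain_afhq.py"
TRANSLATION_SCRIPTS = [
    "scripts/afhq/train_cat2dog_translation.py",
    "scripts/afhq/train_wild2cat_translation.py",
    "scripts/afhq/train_wild2dog_translation.py",
]


def build_commands(pretrain_batch_size=None):
    """Return the AFHQ commands in the order they should be run."""
    pretrain = [sys.executable, PRETRAIN_SCRIPT]
    if pretrain_batch_size is not None:
        pretrain += ["--batch-size", str(pretrain_batch_size)]
    return [pretrain] + [[sys.executable, s] for s in TRANSLATION_SCRIPTS]


def run_all(commands, dry_run=True):
    """Print each command; execute it as well when dry_run is False."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # check=True stops the chain as soon as one stage fails.
            subprocess.run(command, check=True)


# Dry run: shows what would be executed without starting any training.
run_all(build_commands(pretrain_batch_size=8))
```

Calling run_all(build_commands(), dry_run=False) would launch the actual training runs one after another.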
5. Evaluation of the trained models
5.1 Image Translation
uvcgan2 provides a script scripts/translate_images.py to perform a batch
translation of the images via one of the trained models. The script can
be run as
python3 scripts/translate_images.py PATH_TO_TRAINED_MODEL --split SPLIT
where SPLIT is the split (train, val, or test) of the data to translate.
Due to how the datasets are constructed, one should use the test split for the
anime2selfie and CelebA datasets, and the val split for the CelebA-HQ
and AFHQ datasets.
The translated images will be saved under
PATH_TO_TRAINED_MODEL/evals/final/images_eval-SPLIT.
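The split recommendation and the output location described above can be captured in a small lookup. This is an illustrative sketch only: the dictionary keys and the function name translated_images_dir are assumptions chosen for this example, not identifiers used by uvcgan2.

```python
from pathlib import Path

# Split recommended in the text above for each dataset family.
# The keys are hypothetical labels chosen for this example.
EVAL_SPLIT = {
    "anime2selfie": "test",
    "celeba": "test",
    "celeba_hq": "val",
    "afhq": "val",
}


def translated_images_dir(model_dir, dataset):
    """Return (split, directory holding the translated images)."""
    split = EVAL_SPLIT[dataset]
    return split, Path(model_dir) / "evals" / "final" / f"images_eval-{split}"


split, images = translated_images_dir("PATH_TO_TRAINED_MODEL", "afhq")
print(split, images)
```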
5.2 Evaluation of the Quality of Translation
The Rethinking CycleGAN paper describes two ways to evaluate the quality of
translation:
- Consistent protocol. Uniform across all datasets.
- Ad-hoc protocols for CelebA-HQ and AFHQ.
5.2.1 Consistent Evaluation of the Quality of Translation
The consistent evaluation protocol relies on torch_fidelity (commit 5f7c5b5ccc4128bd79be2fdd8e75f118aa8fdc7c) to calculate KID/FID metrics of the translated images.
A helper script scripts/eval_fid.py is provided to facilitate such
a calculation. It can be run with
python3 scripts/eval_fid.py PATH_TO_TRAINED_MODEL/evals/final/images_eval-SPLIT --kid-size KID_SIZE
where KID_SIZE is the parameter of the KID calculation algorithm. Its value
depends on the dataset and should be set to match the Rethinking CycleGAN
paper (c.f. Se