VolumeGAN
CVPR 2022 VolumeGAN - 3D-aware Image Synthesis via Learning Structural and Textural Representations
Figure: Framework of VolumeGAN.
3D-aware Image Synthesis via Learning Structural and Textural Representations
Yinghao Xu, Sida Peng, Ceyuan Yang, Yujun Shen, Bolei Zhou
Computer Vision and Pattern Recognition (CVPR), 2022
[Paper] [Project Page] [Demo]
This paper aims at achieving high-fidelity 3D-aware image synthesis. We propose a novel framework, termed VolumeGAN, for synthesizing images under different camera views by explicitly learning a structural representation and a textural representation. We first learn a feature volume to represent the underlying structure, which is then converted to a feature field using a NeRF-like model. The feature field is further accumulated into a 2D feature map as the textural representation, followed by a neural renderer for appearance synthesis. Such a design enables independent control of the shape and the appearance. Extensive experiments on a wide range of datasets show that our approach achieves substantially higher image quality and better 3D control than the previous methods.
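The accumulation step described above (feature field to 2D feature map) can be illustrated with the standard NeRF-style compositing rule applied to feature vectors instead of colors. This is a minimal sketch for a single ray in plain Python, not the authors' implementation; the function name `accumulate_along_ray` and the toy values are assumptions made for illustration.

```python
import math


def accumulate_along_ray(densities, features, deltas):
    """Composite per-sample feature vectors along one ray into one vector.

    densities: non-negative density at each sample, ordered near to far
    features:  per-sample feature vectors (lists of equal length)
    deltas:    distance covered by each sample along the ray
    """
    dim = len(features[0])
    out = [0.0] * dim
    transmittance = 1.0  # fraction of the ray not yet absorbed
    for sigma, feat, delta in zip(densities, features, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this sample
        weight = transmittance * alpha          # contribution of this sample
        for k in range(dim):
            out[k] += weight * feat[k]
        transmittance *= 1.0 - alpha
    return out


# Toy ray with three samples and 2-channel features (values are made up).
pixel_feature = accumulate_along_ray(
    densities=[0.0, 5.0, 1.0],
    features=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    deltas=[0.5, 0.5, 0.5],
)
print(pixel_feature)
```

Repeating this for every pixel's ray yields a 2D feature map, which a neural renderer would then turn into an image.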
Usage
Setup
This repository is based on Hammer, where you can find detailed instructions on environmental setup.
Test Demo
python render.py \
--work_dir ${WORK_DIR} \
--checkpoint ${MODEL_PATH} \
--num ${NUM} \
--seed ${SEED} \
--render_mode ${RENDER_MODE} \
--generate_html ${SAVE_HTML} \
volumegan-ffhq
where
WORK_DIR refers to the path to save the results.
MODEL_PATH refers to the path of the pretrained model, regarding which we provide
NUM refers to the number of samples to synthesize.
SEED refers to the random seed used for sampling.
RENDER_MODE refers to the type of the rendered results, including video and shape.
SAVE_HTML controls whether to save images as an HTML for better visualization when rendering videos.
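As an illustration of how the placeholders map onto the command, the sketch below assembles the argument list in Python. It uses only the flags shown above; the helper name `build_render_command`, the example paths, and the `true`/`false` spelling for `--generate_html` are assumptions, so check `python render.py volumegan-ffhq --help` for the accepted values.

```python
import shlex


def build_render_command(work_dir, checkpoint, num, seed, render_mode, save_html):
    """Return the render.py invocation as an argument list (hypothetical helper)."""
    return [
        "python", "render.py",
        "--work_dir", str(work_dir),
        "--checkpoint", str(checkpoint),
        "--num", str(num),
        "--seed", str(seed),
        "--render_mode", render_mode,
        # Assumption: the flag accepts "true"/"false"; the README does not say.
        "--generate_html", "true" if save_html else "false",
        "volumegan-ffhq",
    ]


# Example values are placeholders, not paths shipped with the repository.
cmd = build_render_command(
    work_dir="work_dirs/render_demo",
    checkpoint="checkpoints/volumegan_ffhq.pth",
    num=4,
    seed=0,
    render_mode="video",
    save_html=True,
)
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` from the repository root once the paths point at real files.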
Training
For example, users can use the following command to train VolumeGAN on FFHQ at a resolution of 256x256:
./scripts/training_demos/volumegan_ffhq256.sh \
${NUM_GPUS} \
${DATA_PATH} \
[OPTIONS]
where
NUM_GPUS refers to the number of GPUs used for training.
DATA_PATH refers to the path to the dataset (zip format is strongly recommended).
[OPTIONS] refers to any additional option to pass. Detailed instructions on available options can be found via python train.py volumegan-ffhq --help.
NOTE: This demo script uses volumegan_ffhq256 as the default job_name, which identifies the experiment. Concretely, a directory named job_name will be created under the root working directory, which is set to work_dirs/ by default. To prevent overwriting previous experiments, an exception is raised to interrupt the training if the job_name directory already exists. Please use the --job_name=${JOB_NAME} option to specify a new job name.
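The overwrite guard described in the note can be pictured with the following minimal sketch. It mimics the described behavior only and is not the code used by Hammer; the helper name `prepare_job_dir` and the second job name are hypothetical.

```python
import os
import tempfile


def prepare_job_dir(job_name, root="work_dirs"):
    """Create root/job_name, refusing to reuse an existing directory."""
    job_dir = os.path.join(root, job_name)
    if os.path.exists(job_dir):
        raise FileExistsError(
            f"{job_dir} already exists; pass a different --job_name"
        )
    os.makedirs(job_dir)
    return job_dir


# Use a temporary root so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as root:
    prepare_job_dir("volumegan_ffhq256", root=root)
    try:
        prepare_job_dir("volumegan_ffhq256", root=root)  # same name again
        reused = True
    except FileExistsError:
        reused = False
    second = prepare_job_dir("volumegan_ffhq256_run2", root=root)

print(reused, os.path.basename(second))
```

Picking a fresh job name per run keeps each experiment's outputs in its own directory.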
Evaluation
Users can use the following command to evaluate a trained model:
./scripts/test_metrics.sh \
${NUM_GPUS} \
${DATA_PATH} \
${MODEL_PATH} \
fid \
--G_kwargs '{"ps_kwargs":{"perturb_mode":"none"}}' \
[OPTIONS]
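Nested JSON is easy to mis-quote on a shell command line. Assuming `--G_kwargs` is parsed as standard JSON (an assumption, since the parser is not shown here), the sketch below builds the value with `json.dumps` and quotes it for the shell with `shlex.quote`.

```python
import json
import shlex

# Generator options from the evaluation command above.
g_kwargs = {"ps_kwargs": {"perturb_mode": "none"}}

# Compact separators avoid spaces inside the argument.
payload = json.dumps(g_kwargs, separators=(",", ":"))

# shlex.quote wraps the payload so the shell passes it through unchanged.
argument = shlex.quote(payload)
print("--G_kwargs", argument)
# prints: --G_kwargs '{"ps_kwargs":{"perturb_mode":"none"}}'
```

Building the string programmatically helps keep the braces and quotes balanced when more options are added.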
BibTeX
@inproceedings{xu2021volumegan,
title = {3D-aware Image Synthesis via Learning Structural and Textural Representations},
author = {Xu, Yinghao and Peng, Sida and Yang, Ceyuan and Shen, Yujun and Zhou, Bolei},
booktitle = {CVPR},
year = {2022}
}
