CPS

GaussianVolume Fitting 🧊

Implementation of the Candidate Pool Strategy (CPS) from <a href="https://arxiv.org/abs/2403.12957">GVGEN: Text-to-3D Generation with Volumetric Representation</a> in PyTorch.

CPS constructs volumetric 3DGS representations (GaussianVolumes) whose distribution is better suited to diffusion model training.

🐉 Procedure

Step 1 : Environment Preparation

Follow the instructions from gaussian splatting to install the packages required for 3DGS fitting, and those from objaverse-rendering to prepare the multi-view image dataset.

Step 2 : Multi-view Images Rendering

Render multi-view images of each object with the command provided in objaverse-rendering. Place the rendered images in the datas folder so that it has the following structure:

datas
├── obj_id_1
│   ├── 000.png
│   ├── 000.npy
│   ├── 001.png
│   ├── 001.npy
│   └── ...
│
├── obj_id_2
│   └── ...
│
└── obj_id_n
    └── ...
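
Before fitting, it can help to confirm that the folder really has this layout. The following is a minimal sketch, assuming only what the tree above shows: every view has a .png and a .npy file sharing the same stem. The function name find_unpaired_views is hypothetical and is not part of this repository.

```python
from pathlib import Path


def find_unpaired_views(root):
    """Map each object folder name to the view stems that lack a .png or .npy partner."""
    problems = {}
    for obj_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        pngs = {p.stem for p in obj_dir.glob("*.png")}
        npys = {p.stem for p in obj_dir.glob("*.npy")}
        # Symmetric difference: stems present in one set but not the other.
        unpaired = sorted(pngs ^ npys)
        if unpaired:
            problems[obj_dir.name] = unpaired
    return problems
```

Calling find_unpaired_views("datas") returns an empty dict when every view is complete; otherwise it lists the affected object folders and view stems.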

For better multi-view renderings, we provide our own blender_script.py, which you can use to replace the corresponding script in objaverse-rendering.

Step 3 : GaussianVolume Fitting

For example, to fit a single object:

obj_id=f1722ab650ad4d8dbe6fc4bf44e33d38
python train.py \
    -w 1 \
    --sh_degree 0 \
    -s datas/${obj_id} \
    -m output/${obj_id} \
    --prepare_data #
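
To fit every object in the datas folder rather than a single one, the command above can be wrapped in a small driver. This is a sketch under stated assumptions: it reuses exactly the flags shown above without interpreting them, and the helper names build_fit_command and fit_all are hypothetical, not part of this repository.

```python
import subprocess
import sys
from pathlib import Path


def build_fit_command(obj_id, data_root="datas", out_root="output", prepare_data=True):
    """Return the Step 3 argument list for one object id."""
    cmd = [
        sys.executable, "train.py",
        "-w", "1",
        "--sh_degree", "0",
        "-s", f"{data_root}/{obj_id}",
        "-m", f"{out_root}/{obj_id}",
    ]
    if prepare_data:
        cmd.append("--prepare_data")
    return cmd


def fit_all(data_root="datas", out_root="output", dry_run=True):
    """Run Step 3 for each object folder; with dry_run=True, only print the commands."""
    for obj_dir in sorted(p for p in Path(data_root).iterdir() if p.is_dir()):
        cmd = build_fit_command(obj_dir.name, data_root, out_root)
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the loop at the first failed fit.
            subprocess.run(cmd, check=True)
```

Running fit_all() first with the default dry_run=True prints the commands so they can be inspected before any training starts.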

Step 4 : Checking

To render images of the fitted result, run the Step 3 command without the --prepare_data option, then run:

python render.py \
    -m output/${obj_id}
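
When many objects are processed, a quick way to see which ones can already be rendered is to compare the two folders. The sketch below assumes only that a fitted object has a folder at output/<obj_id>, as implied by the -m argument; it does not inspect the folder's contents, and the function name split_by_output is hypothetical.

```python
from pathlib import Path


def split_by_output(data_root="datas", out_root="output"):
    """Return (fitted, pending) object ids, judged only by whether out_root/<obj_id> exists."""
    obj_ids = sorted(p.name for p in Path(data_root).iterdir() if p.is_dir())
    fitted = [o for o in obj_ids if (Path(out_root) / o).is_dir()]
    pending = [o for o in obj_ids if o not in fitted]
    return fitted, pending
```

The ids in the first list can be passed to render.py through -m output/<obj_id>; those in the second list still need Step 3.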

License

The majority of this project is licensed under the MIT License. Portions of the project are available under the separate licenses of the referenced projects, as detailed in the corresponding files.

BibTeX

@misc{he2024gvgentextto3dgenerationvolumetric,
      title={GVGEN: Text-to-3D Generation with Volumetric Representation}, 
      author={Xianglong He and Junyi Chen and Sida Peng and Di Huang and Yangguang Li and Xiaoshui Huang and Chun Yuan and Wanli Ouyang and Tong He},
      year={2024},
      eprint={2403.12957},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2403.12957}, 
}
