CPS
Implementation of the Candidate Pool Strategy (CPS) from "GVGEN: Text-to-3D Generation with Volumetric Representation" in PyTorch.
GaussianVolume Fitting 🧊
Implementation of the Candidate Pool Strategy (CPS) from <a href="https://arxiv.org/abs/2403.12957">GVGEN: Text-to-3D Generation with Volumetric Representation</a> in PyTorch.
<p align="center"> <img src="assets/cps.png" width=95%> </p>
The purpose of CPS is to construct volumetric 3DGS representations (GaussianVolumes) with a better-behaved distribution, which aids diffusion model training.
🐉 Procedure
- Step 1 : Environment Preparation
Follow the instructions from gaussian splatting to prepare the packages required for 3DGS fitting, and those from objaverse-rendering to prepare the multi-view image dataset.
- Step 2 : Multi-view Images Rendering
Render multi-view images of an object using the command provided in objaverse-rendering.
Put the rendered multi-view images into the datas folder so that it has the following structure:
datas
├── obj_id_1
│   ├── 000.png
│   ├── 000.npy
│   ├── 001.png
│   ├── 001.npy
│   └── ...
│
├── obj_id_2
│   └── ...
│
└── obj_id_n
    └── ...
For better multi-view renderings, we provide our own blender_script.py, which you can use to replace the corresponding script in objaverse-rendering.
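Before fitting, it can help to confirm that the datas folder matches the layout shown above. The following is a minimal sketch, not part of this repository: the helper names check_object_dir and check_dataset are hypothetical, and it assumes only what the tree shows, namely that every view NNN.png sits next to an NNN.npy file with the same stem. It does not inspect the contents of either file.

```python
from pathlib import Path
from typing import List


def check_object_dir(obj_dir: Path) -> List[str]:
    """Return a list of layout problems for one object folder (empty if none)."""
    # Compare file stems so that 000.png is paired with 000.npy.
    pngs = {p.stem for p in obj_dir.glob("*.png")}
    npys = {p.stem for p in obj_dir.glob("*.npy")}

    problems = []
    if not pngs:
        problems.append(f"{obj_dir.name}: no .png views found")
    for stem in sorted(pngs - npys):
        problems.append(f"{obj_dir.name}: {stem}.png has no matching {stem}.npy")
    for stem in sorted(npys - pngs):
        problems.append(f"{obj_dir.name}: {stem}.npy has no matching {stem}.png")
    return problems


def check_dataset(root: Path) -> List[str]:
    """Check every object folder directly under the dataset root."""
    problems = []
    for obj_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        problems.extend(check_object_dir(obj_dir))
    return problems
```

Calling check_dataset(Path("datas")) returns an empty list when every object folder is consistent, and one message per unpaired file otherwise.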
- Step 3 : GaussianVolume Fitting
For example:
obj_id=f1722ab650ad4d8dbe6fc4bf44e33d38
python train.py \
    -w 1 \
    --sh_degree 0 \
    -s datas/${obj_id} \
    -m output/${obj_id} \
    --prepare_data
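To fit many objects in one go, the command above can be generated once per folder under datas. This is a hedged sketch rather than a script shipped with the project: build_train_command and fit_all are hypothetical helpers, and the flags are copied verbatim from the example command without any assumption about what else train.py accepts. By default it only builds the commands (dry_run=True) so they can be inspected before anything is launched.

```python
import subprocess
from pathlib import Path
from typing import List


def build_train_command(obj_id: str,
                        data_root: str = "datas",
                        out_root: str = "output",
                        prepare_data: bool = True) -> List[str]:
    """Build the Step 3 command for one object as an argument list."""
    cmd = [
        "python", "train.py",
        "-w", "1",
        "--sh_degree", "0",
        "-s", f"{data_root}/{obj_id}",
        "-m", f"{out_root}/{obj_id}",
    ]
    if prepare_data:
        cmd.append("--prepare_data")
    return cmd


def fit_all(data_root: str = "datas",
            out_root: str = "output",
            prepare_data: bool = True,
            dry_run: bool = True) -> List[List[str]]:
    """Build (and optionally run) one training command per object folder."""
    commands = []
    for obj_dir in sorted(p for p in Path(data_root).iterdir() if p.is_dir()):
        cmd = build_train_command(obj_dir.name, data_root, out_root, prepare_data)
        commands.append(cmd)
        if not dry_run:
            # check=True stops the batch at the first failing object.
            subprocess.run(cmd, check=True)
    return commands
```

Run fit_all(dry_run=False) from the repository root to launch the fits one after another; the objects are processed in sorted folder order.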
- Step 4 : Checking
If you want to render images, omit the --prepare_data option in Step 3, then run:
python render.py \
-m output/${obj_id}
License
The majority of this project is licensed under the MIT License. Portions of the project are available under the separate licenses of the referenced projects, as detailed in the corresponding files.
BibTeX
@misc{he2024gvgentextto3dgenerationvolumetric,
title={GVGEN: Text-to-3D Generation with Volumetric Representation},
author={Xianglong He and Junyi Chen and Sida Peng and Di Huang and Yangguang Li and Xiaoshui Huang and Chun Yuan and Wanli Ouyang and Tong He},
year={2024},
eprint={2403.12957},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2403.12957},
}
