PVSM
Official code release for the PVSM paper: "From Rays to Projections: Better Inputs for Feed-Forward View Synthesis"
Preprint 2025
Installation
conda create -n pvsm python=3.11
conda activate pvsm
# Install torch and torchvision according to your environment configuration (CUDA version, OS)
pip install -r requirements.txt
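For example, on a Linux machine with a CUDA 12.1 driver, an install along the following lines should work; the CUDA tag is an assumption, so pick the wheel index that matches your setup:
# Example only: install a CUDA 12.1 build of PyTorch (adjust the index URL to your CUDA/CPU setup)
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121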
There is a known issue with the current release of gsplat (1.5.3), so please install gsplat from source for now:
# Install gsplat from source
pip install git+https://github.com/nerfstudio-project/gsplat.git
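To make sure the source build is the one that gets imported (rather than a previously installed wheel), a quick sanity check is to print the installed version:
# Print the installed gsplat version; a source install typically reports a newer or dev version string
python -c "import gsplat; print(gsplat.__version__)"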
Quick Start
Download Checkpoints
- Download the DINOv3 ViT-B checkpoint and place it under metric_checkpoints/.
- Download our pre-trained model checkpoints:
After downloading, organize your checkpoints directory as follows:
metric_checkpoints/
├── pvsm_finetuned_full.pt # Our trained full 24-layer model
├── pvsm_finetuned_small.pt # Our trained smaller 12-layer model
├── dinov3-vitb16-pretrain-lvd1689m # DINOv3 Checkpoint
│ ├── config.json
│ ├── LICENSE.md
│ ├── model.safetensors
│ ├── preprocessor_config.json
│ └── README.md
├── imagenet-vgg-verydeep-19.mat # (Optional) for training
└── map-anything # (Optional) for dataset generation
├── config.json
├── model.safetensors
└── README.md
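If you pull the DINOv3 weights from the Hugging Face Hub, a download along the following lines should reproduce the dinov3-vitb16-pretrain-lvd1689m folder above; the repository id is an assumption, so use whichever source you actually have access to:
# Assumed repository id; adjust if your DINOv3 checkpoint is hosted elsewhere
huggingface-cli download facebook/dinov3-vitb16-pretrain-lvd1689m --local-dir metric_checkpoints/dinov3-vitb16-pretrain-lvd1689m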
Interactive Demo
For a quick interactive demo, please follow the instructions and unzip the downloaded example data (22.3 MB) on your local machine.
To launch the interactive web-based demo:
torchrun --nproc_per_node 1 --standalone viser_demo.py --config-name runs/pvsm_finetuned_small
The demo will start a web server. Open your browser and navigate to the displayed URL to interact with the model.
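If the demo runs on a remote machine, one option is to forward the viewer port over SSH and open it in a local browser; the port number below is an assumption, so use whatever port the demo prints at startup:
# Forward remote port 8080 to localhost:8080 (adjust the port to match the demo's output)
ssh -N -L 8080:localhost:8080 <user>@<remote-host>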
System Requirements:
- Small model: ~2.5GB VRAM
- Full model: ~3.0GB VRAM
Note: renderings in the gsplat web demo are compressed for streaming, so on-screen quality may appear lower than the model's actual output.
Running Inference
To run inference on a dataset:
python inference.py --config-name runs/pvsm_finetuned_small
Or for the full model:
python inference.py --config-name runs/pvsm_finetuned_full
Training
To train the model:
torchrun --nproc_per_node <num_gpus> train.py --config-name runs/pvsm_finetuned_small
Configuration:
- Training configurations are located in configs/runs/
- Model configurations are in configs/model/
- Dataset configurations are in configs/dataset/
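Since the scripts take a --config-name argument, the configuration appears to be managed with Hydra; if so, individual options can typically be overridden from the command line. The group and key names below are hypothetical, so check the files under configs/ for the actual names:
# Hypothetical example: swap the dataset config group and override a training key at launch time
torchrun --nproc_per_node <num_gpus> train.py --config-name runs/pvsm_finetuned_small dataset=<your_dataset> train.batch_size=4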
API Keys:
Before training, create configs/api_keys.yaml with your WandB API key:
wandb: YOUR_WANDB_KEY
You can use configs/api_keys_example.yaml as a template.
Citation
If you find this work useful in your research, please consider citing:
@article{wu_pvsm_2026,
title={From Rays to Projections: Better Inputs for Feed-Forward View Synthesis},
author={Wu, Zirui and Jiang, Zeren and Oswald, Martin R. and Song, Jie},
journal={arXiv preprint arXiv:2601.05116},
year={2026}
}
License
This project is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. See LICENSE.md for details.
Acknowledgement
This work is built upon the LVSM codebase.
