Less3Depend (ICLR 2026)

This repository contains the PyTorch implementation of the paper "The Less You Depend, The More You Learn: Synthesizing Novel Views from Sparse, Unposed Images without Any 3D Knowledge".

<div> <a href="https://pku-vcl-geometry.github.io/Less3Depend/"><strong>Project Page</strong></a> | <a href="https://arxiv.org/abs/2506.09885"><strong>Paper</strong></a> </div>

1. Preparation

Environment Setup

Create and activate conda environment:

```bash
conda create -n lvsm python=3.11
conda activate lvsm
pip install -r requirements.txt
```

Recommended: a GPU with compute capability ≥ 8.0. We used 8×A100 GPUs in our experiments.

Dataset Setup

Update (26/01/04): We now also provide our preprocessed version of the DL3DV dataset in pixelSplat-style format on Hugging Face!

We use the RealEstate10K dataset from pixelSplat and follow LVSM for preprocessing.

Download and unzip the RealEstate10K .torch chunks. For our scaling experiments, we split the dataset into four sizes, each containing the number of chunks listed below:

| Size   | Chunks | Scenes |
|--------|--------|--------|
| Little | 76     | 1,202  |
| Medium | 304    | 4,121  |
| Large  | 1,216  | 16,449 |
| Full   | 4,866  | 66,033 |
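Each smaller split can be taken as a prefix of the chunk list. A minimal sketch of that selection, where the size-to-chunk mapping comes from the table above but the file layout and the `select_chunks` helper are illustrative assumptions, not repository code:

```python
from pathlib import Path

# Number of .torch chunks per split size, from the table above.
SPLIT_CHUNKS = {"little": 76, "medium": 304, "large": 1216, "full": 4866}

def select_chunks(chunk_dir, size):
    """Return a sorted prefix of .torch chunk files for the requested split size."""
    n = SPLIT_CHUNKS[size]
    chunks = sorted(Path(chunk_dir).glob("*.torch"))
    if len(chunks) < n:
        raise ValueError(f"expected at least {n} chunks, found {len(chunks)}")
    return chunks[:n]
```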

Process the dataset following LVSM:

```bash
# process training split
python process_data.py --base_path datasets/re10k --output_dir datasets/re10k-full_processed --mode train --num_processes 80

# process test split
python process_data.py --base_path datasets/re10k --output_dir datasets/re10k-full_processed --mode test --num_processes 80
```

2. Evaluation


Download the pre-trained model from Hugging Face.

Run evaluation:

```bash
# fast inference, compute metrics only
torchrun --nproc_per_node 8 --nnodes 1 --rdzv_id 18640 --rdzv_backend c10d --rdzv_endpoint localhost:29511 -m src.inference_fast --config config/eval/uplvsm_x224.yaml

# complete inference
torchrun --nproc_per_node 8 --nnodes 1 --rdzv_id 18640 --rdzv_backend c10d --rdzv_endpoint localhost:29511 -m src.inference --config config/eval/uplvsm_x224.yaml
```
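The metrics-only path reports standard novel-view-synthesis metrics such as PSNR. For reference, PSNR between a rendered and a ground-truth image can be computed as follows; this is an illustrative NumPy version, not the repository's metric code:

```python
import numpy as np

def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio between two images with values in [0, max_val]."""
    mse = np.mean((pred.astype(np.float64) - target.astype(np.float64)) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```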

✅ Download the uplvsm model with 518×518 resolution from Hugging Face, and run evaluation:

```bash
# fast inference, compute metrics only
torchrun --nproc_per_node 8 --nnodes 1 --rdzv_id 18640 --rdzv_backend c10d --rdzv_endpoint localhost:29511 -m src.inference_fast --config config/eval/uplvsm_x518.yaml

# complete inference
torchrun --nproc_per_node 8 --nnodes 1 --rdzv_id 18640 --rdzv_backend c10d --rdzv_endpoint localhost:29511 -m src.inference --config config/eval/uplvsm_x518.yaml
```

3. Training

```bash
# pretraining on 224×224 resolution
torchrun --nproc_per_node 8 --nnodes 1 --rdzv_id 18640 --rdzv_backend c10d --rdzv_endpoint localhost:29511 -m src.train --config config/uplvsm_x224.yaml

# finetuning on 518×518 resolution
torchrun --nproc_per_node 8 --nnodes 1 --rdzv_id 18640 --rdzv_backend c10d --rdzv_endpoint localhost:29511 -m src.train --config config/uplvsm_x518.yaml
```
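Finetuning at the higher resolution mainly changes the number of image tokens the transformer sees per view. Assuming a ViT-style patch size of 14 (an assumption for illustration only; both 224 and 518 are divisible by 14, but check the configs for the actual value), the token counts work out as:

```python
def num_patch_tokens(resolution, patch_size=14):
    """Tokens per square view when the image is split into non-overlapping patches."""
    assert resolution % patch_size == 0, "resolution must be divisible by the patch size"
    side = resolution // patch_size
    return side * side

# 224 // 14 = 16, so 16 * 16 = 256 tokens per view
# 518 // 14 = 37, so 37 * 37 = 1369 tokens per view
```

The roughly 5× growth in tokens per view is why the higher resolution is introduced as a finetuning stage rather than trained from scratch.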

📄 Acknowledgments

Our implementation builds upon LVSM. We also recommend RayZer, Pensieve and X-Factor for self-supervised scene reconstruction.

If you find this work useful for your research, please consider citing:

```bibtex
@misc{wang2025less3depend,
    title={The Less You Depend, The More You Learn: Synthesizing Novel Views from Sparse, Unposed Images without Any 3D Knowledge},
    author={Haoru Wang and Kai Ye and Yangyan Li and Wenzheng Chen and Baoquan Chen},
    year={2025},
    eprint={2506.09885},
    archivePrefix={arXiv},
    primaryClass={cs.CV},
    url={https://arxiv.org/abs/2506.09885},
}
```
