# SPIdepth

[CVPR 2025] Strengthened Pose Information for self-supervised monocular depth estimation. SPIdepth strengthens the pose network to improve depth prediction accuracy, achieving state-of-the-art results on benchmarks such as KITTI, Cityscapes, and Make3D.
[![Paper](https://img.shields.io/badge/Paper-Arxiv-red)](https://arxiv.org/abs/2404.12501)
## Training
To train on KITTI, run:

```shell
python train.py ./args_files/hisfog/kitti/cvnXt_H_320x1024.txt
```
For instructions on downloading the KITTI dataset, see the Monodepth2 repository.
To finetune on KITTI, run:

```shell
python ./finetune/train_ft_SQLdepth.py ./conf/cvnXt.txt ./finetune/txt_args/train/inc_kitti.txt
```
To train on Cityscapes, run:

```shell
python train.py ./args_files/args_cityscapes_train.txt
```
To finetune on Cityscapes, run:

```shell
python train.py ./args_files/args_cityscapes_finetune.txt
```
To prepare the Cityscapes dataset, please refer to SfMLearner's `prepare_train_data.py` script. We used the following command:

```shell
python prepare_train_data.py \
    --img_height 512 \
    --img_width 1024 \
    --dataset_dir <path_to_downloaded_cityscapes_data> \
    --dataset_name cityscapes \
    --dump_root <your_preprocessed_cityscapes_path> \
    --seq_length 3 \
    --num_threads 8
```
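The SfMLearner preprocessing above dumps each training sample as `--seq_length` frames concatenated horizontally into a single image. A minimal sketch for splitting such a dump back into individual frames, assuming that layout (the helper name is ours, not part of this repository):

```python
import numpy as np

def split_sequence(seq_img, seq_length=3):
    """Split a horizontally concatenated frame sequence of shape
    (H, W * seq_length, C) back into a list of (H, W, C) frames."""
    h, total_w, _ = seq_img.shape
    assert total_w % seq_length == 0, "width must be divisible by seq_length"
    w = total_w // seq_length
    return [seq_img[:, i * w:(i + 1) * w] for i in range(seq_length)]
```

With the command above (`--img_width 1024`, `--seq_length 3`), each dumped image is 3072 pixels wide and splits into three 1024-pixel frames.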
## Pretrained weights and evaluation
You can download weights for some pretrained models here:
To evaluate a model on KITTI, run:

```shell
python evaluate_depth_config.py args_files/hisfog/kitti/cvnXt_H_320x1024.txt
```

Make sure you have first run `export_gt_depth.py` to extract the ground truth files.
To evaluate a model on Cityscapes, run:

```shell
python ./tools/evaluate_depth_cityscapes_config.py args_files/args_cvnXt_H_cityscapes_finetune_eval.txt
```
The ground truth depth files can be found HERE. Download the archive and unzip it into `splits/cityscapes`.
## Inference with your own images
```shell
python test_simple_SQL_config.py ./conf/cvnXt.txt
```

In `./conf/cvnXt.txt`, you can set `--image_path` to a single image or a directory of images.
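The network predicts a disparity map, and a common way to inspect one is to rescale it to 8-bit for display, clipping the brightest few percent for contrast (as Monodepth2-style visualizations do). A minimal sketch of that normalization, assuming you have loaded the prediction as a NumPy array (the helper name is ours):

```python
import numpy as np

def normalize_disp(disp):
    """Rescale a raw disparity map to uint8 [0, 255] for visualization,
    clipping values above the 95th percentile to improve contrast."""
    disp = np.squeeze(disp).astype(np.float64)
    vmin, vmax = disp.min(), np.percentile(disp, 95)
    disp = np.clip((disp - vmin) / max(vmax - vmin, 1e-8), 0.0, 1.0)
    return (disp * 255).astype(np.uint8)
```

The resulting array can be saved directly as a grayscale image or passed through a colormap.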
## Citation
If you find this project useful for your research, please consider citing:

```bibtex
@InProceedings{Lavreniuk_2025_CVPR,
    author    = {Lavreniuk, Mykola and Lavreniuk, Alla},
    title     = {SPIdepth: Strengthened Pose Information for Self-supervised Monocular Depth Estimation},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR) Workshops},
    month     = {June},
    year      = {2025},
    pages     = {874-884}
}
```
## Acknowledgement
This project is built on top of SQLdepth, and we are grateful for their outstanding contributions.
