SwinDepth
"SwinDepth: Unsupervised Depth Estimation using Monocular Sequences via Swin Transformer and Densely Cascaded Network" (ICRA 2023)
This is the PyTorch implementation of the paper "SwinDepth: Unsupervised Depth Estimation using Monocular Sequences via Swin Transformer and Densely Cascaded Network" (ICRA 2023). [paper]
We provide pre-trained weights and evaluation code for simple visualization of depth estimation results on the KITTI dataset.
Download the pre-trained weights from here and place them in the "./checkpoints/best" folder.
Setup
conda create -n ht_dcmnet python=3.8.5
conda activate ht_dcmnet
conda install pytorch torchvision cudatoolkit=11.1 -c pytorch -c nvidia
pip install -r requirements.txt
Our experiments were run with PyTorch 1.9.0, CUDA 11.2, Python 3.8.5, and Ubuntu 18.04. We used 4 NVIDIA RTX 3090 GPUs for training, but the code also runs on GPUs with less memory if you reduce batch_size. Simple visualization needs about 3GB of GPU memory and also works on CPU only.
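To compare a local environment against the versions listed above, a minimal sketch like the following can help. It is not part of the repository; the function name describe_environment is hypothetical, and it only relies on the standard library plus torch.__version__ and torch.cuda.is_available() when PyTorch is installed.

```python
import importlib.util
import platform


def describe_environment():
    """Report the Python version and, if PyTorch is importable, its version and CUDA status.

    Hypothetical helper for illustration; not part of the SwinDepth code base.
    """
    info = {
        "python": platform.python_version(),
        "torch": None,            # stays None when PyTorch is not installed
        "cuda_available": False,  # stays False on CPU-only setups
    }
    if importlib.util.find_spec("torch") is not None:
        import torch

        info["torch"] = torch.__version__
        info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If cuda_available prints False, the --no_cuda flag described under Simple Prediction is the relevant option.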
Simple Prediction
You can simply visualize the depth estimation results on some images from KITTI with:
python test_simple.py --image_path=./test_images/
To check depth estimation results on other KITTI images or on your own data, add the images to the "test_images" folder. To run the code without a GPU, pass the --no_cuda flag.
KITTI Dataset
You can download the entire raw KITTI dataset by running:
wget -i splits/kitti_archives_to_download.txt -P /YOUR/DATA/PATH/
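For faster load times during training, KITTI images are converted from .png to .jpg with the command below. It requires GNU parallel and ImageMagick's convert, and it deletes each original .png after conversion: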
find /YOUR/DATA/PATH/ -name '*.png' | parallel 'convert -quality 92 -sampling-factor 2x2,1x1,1x1 {.}.png {.}.jpg && rm {}'
The commands above result in the following layout under data_path:
/YOUR/DATA/PATH
  |----2011_09_26
      |----2011_09_26_drive_0001_sync
          |-----.......
          |----image_02
              |-----data
                  |-----0000000000.jpg
                  |-----.......
              |-----timestamps.txt
          |-----.......
      |----.........
  |----2011_09_28
  |----.........
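Before training, it can be worth confirming that the download and conversion produced the layout shown above. The following is a minimal sketch using only the standard library; the helper names count_kitti_frames and find_unconverted_pngs are hypothetical and not part of the repository, and the camera folder name image_02 is taken from the tree above.

```python
from pathlib import Path


def count_kitti_frames(data_path, camera="image_02"):
    """Return {drive_folder_name: number of .jpg frames} for <date>/<drive>/<camera>/data."""
    counts = {}
    root = Path(data_path)
    for date_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for drive_dir in sorted(p for p in date_dir.iterdir() if p.is_dir()):
            frames_dir = drive_dir / camera / "data"
            if frames_dir.is_dir():
                counts[drive_dir.name] = len(list(frames_dir.glob("*.jpg")))
    return counts


def find_unconverted_pngs(data_path):
    """Return any .png files still present, i.e. images the conversion step missed."""
    return sorted(Path(data_path).rglob("*.png"))
```

Calling count_kitti_frames("/YOUR/DATA/PATH") should list every downloaded drive with a non-zero frame count, and find_unconverted_pngs("/YOUR/DATA/PATH") should return an empty list once the conversion has finished.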
Training
Training requires a Swin Transformer encoder pre-trained on the ImageNet-1k dataset.
You can either download the ImageNet-pretrained encoder weights here, named '104checkpoint.pth', or train the Swin Transformer yourself with the official PyTorch code. Then place the pretrained weights in the ./checkpoints/imagenet folder.
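A quick way to confirm the encoder weights are where the step above puts them is a small path check. This is a sketch only: find_imagenet_checkpoint is a hypothetical helper, and the default folder and file name are simply the ones mentioned above. How train.py actually locates the file is not shown here.

```python
from pathlib import Path


def find_imagenet_checkpoint(checkpoint_dir="./checkpoints/imagenet",
                             filename="104checkpoint.pth"):
    """Return the path to the pretrained encoder weights, or raise if they are missing."""
    path = Path(checkpoint_dir) / filename
    if not path.is_file():
        raise FileNotFoundError(
            f"Expected ImageNet-pretrained encoder weights at {path}"
        )
    return path
```

Running find_imagenet_checkpoint() from the repository root before launching training surfaces a misplaced file early.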
The depth estimation network is trained by running:
python train.py --data_path=/YOUR/DATA/PATH --log_dir=./checkpoints --model_name=ht_dcmnet --num_epochs=40 --batch_size=12
Evaluation
Before evaluation, you should prepare ground truth depth maps by running:
python export_gt_depth.py --data_path /YOUR/DATA/PATH --split eigen
The following example command evaluates the best weights:
python evaluate_depth.py --data_path=/YOUR/DATA/PATH --load_weights_folder ./checkpoints/best/
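For readers unfamiliar with depth evaluation, the sketch below computes error metrics commonly reported on the KITTI Eigen split (absolute relative error, squared relative error, RMSE, log RMSE, and the delta < 1.25 accuracy). It assumes evaluate_depth.py follows the Monodepth2 conventions listed in the references, which this README does not state explicitly. The function depth_errors is hypothetical and works on plain lists of positive depths rather than on the repository's data structures.

```python
import math


def depth_errors(gt, pred):
    """Compute common depth metrics from paired ground-truth and predicted depths (all > 0).

    Illustrative only; not the repository's evaluation code.
    """
    if not gt or len(gt) != len(pred):
        raise ValueError("gt and pred must be non-empty and the same length")
    n = len(gt)
    pairs = list(zip(gt, pred))
    abs_rel = sum(abs(g - p) / g for g, p in pairs) / n
    sq_rel = sum((g - p) ** 2 / g for g, p in pairs) / n
    rmse = math.sqrt(sum((g - p) ** 2 for g, p in pairs) / n)
    rmse_log = math.sqrt(sum((math.log(g) - math.log(p)) ** 2 for g, p in pairs) / n)
    # Fraction of pixels whose prediction is within a factor of 1.25 of the ground truth.
    a1 = sum(max(g / p, p / g) < 1.25 for g, p in pairs) / n
    return {"abs_rel": abs_rel, "sq_rel": sq_rel, "rmse": rmse,
            "rmse_log": rmse_log, "a1": a1}
```

Lower is better for the four error terms, and higher is better for a1.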
Reference
- Monodepth2 - https://github.com/nianticlabs/monodepth2
- timm - https://github.com/rwightman/pytorch-image-models
- mmsegmentation - https://github.com/open-mmlab/mmsegmentation
