# S²M²

Official implementation of "S²M²: Scalable Stereo Matching Model for Reliable Depth Estimation", ICCV 2025.
## 🤗 Notice

This repository and its contents are not related to any official Samsung Electronics product.
All resources are provided solely for non-commercial research and educational purposes.
## ✨ Key Features

### 🧩 Model
- Scalable stereo matching architecture
- State-of-the-art performance on ETH3D (1st), Middlebury V3 (1st), and Booster (1st)
- Joint estimation of disparity, occlusion, and confidence (see the sketch after this list)
- Supports negative disparity estimation
- Optimal under the pinhole camera model with ideal stereo rectification (vertical disparity < 2px)
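For a concrete picture of the joint outputs, here is a minimal consumption sketch. The checkpoint path, output order, and thresholds are illustrative assumptions, not the repository's actual API; see the demo scripts below for the real entry points.

```python
# Hedged sketch: consuming the three joint outputs.
import torch

# Hypothetical TorchScript checkpoint path (TorchScript export is supported, see below).
model = torch.jit.load("weights/pretrain_weights/s2m2_L.pt").eval().cuda()

left = torch.rand(1, 3, 544, 960, device="cuda")   # rectified left image, NCHW in [0, 1]
right = torch.rand(1, 3, 544, 960, device="cuda")  # rectified right image

with torch.no_grad():
    disparity, occlusion, confidence = model(left, right)  # assumed output order

# Keep only pixels the model itself trusts: visible and high-confidence.
valid = (occlusion < 0.5) & (confidence > 0.8)                 # thresholds are assumptions
reliable_disparity = torch.where(valid, disparity, torch.nan)  # NaN out unreliable pixels
```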
### ⚙️ Code
- ✅ FP16 / FP32 inference
- ✅ TorchScript/ONNX/TensorRT export
- ❌ Training pipeline (not included)
**Note:** The publicly released model weights differ from the version used for the Middlebury and ETH3D benchmarks but are identical to the one used for the Booster benchmark. This implementation replaces the dynamic attention-based refinement module with a UNet for stable ONNX export. It also includes an additional M variant and extended training data with transparent objects.
## 🚀 Performance

Detailed benchmark results and visualizations are available on the Project Page.

Inference speed (FPS) was measured on an **NVIDIA RTX 5090** with float16 and `refine_iter=3`.
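If you want to reproduce FPS numbers on your own GPU, a standard timing loop looks like the sketch below; `model` and the warm-up/run counts are assumptions.

```python
# Hedged sketch of a typical GPU timing loop for float16 inference.
import time
import torch

def measure_fps(model, left, right, warmup=10, runs=100):
    """Time forward passes only; assumes model and inputs are already on the GPU."""
    with torch.no_grad(), torch.autocast("cuda", dtype=torch.float16):
        for _ in range(warmup):              # warm-up: lazy init, kernel autotuning
            model(left, right)
        torch.cuda.synchronize()             # drain queued kernels before timing
        start = time.perf_counter()
        for _ in range(runs):
            model(left, right)
        torch.cuda.synchronize()             # GPU work is async; sync before stopping
    return runs / (time.perf_counter() - start)
```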
## 🔧 Installation

We recommend Python 3.10, PyTorch 2.9, CUDA 12.9, cuDNN 9.1.0, and TensorRT 10.13.3 with Anaconda.
```bash
git clone https://github.com/junhong-3dv/s2m2
cd s2m2
conda env create -n s2m2 -f environment.yml
conda activate s2m2
pip install -e .
```
If the environment setup via the `.yml` file doesn't work smoothly, you can manually install the main dependencies with:
```bash
pip install torch torchvision opencv-python open3d onnx onnxruntime-gpu onnxscript tensorrt-cu12==10.13.3.9 --extra-index-url https://pypi.nvidia.com/tensorrt-cu12-libs
```
That should cover most of the required packages for running the demo.
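As a quick sanity check that the install succeeded, the snippet below prints the versions of the main packages; adjust the expected TensorRT version if you installed a different one.

```python
# Minimal environment check for the packages used by the demos.
import cv2
import onnx
import onnxruntime
import open3d as o3d
import tensorrt
import torch

print("PyTorch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("OpenCV:", cv2.__version__, "| Open3D:", o3d.__version__)
print("ONNX:", onnx.__version__, "| ONNX Runtime:", onnxruntime.__version__)
print("ONNX Runtime providers:", onnxruntime.get_available_providers())
print("TensorRT:", tensorrt.__version__)  # expected: 10.13.3.x
```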
## 🚀 Pre-trained Models and Inference

### 1. Download Pre-trained Models

Create a directory for weights and download the desired models from the links below.

```bash
mkdir weights
mkdir weights/pretrain_weights
```
| Model | Download | Model Size |
| :---: | :---: | :---: |
| S | Download | 26.5M |
| M | Download | 80.4M |
| L | Download | 181M |
| XL | Download | 406M |
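After downloading, you can sanity-check a checkpoint before running the demos. The file name below is hypothetical, and the checkpoint layout (plain `state_dict` vs. wrapped dict) may differ from what the release actually ships:

```python
# Hedged sketch: verify a downloaded checkpoint loads and has a plausible size.
import torch

# Hypothetical file name; use whichever model you downloaded.
ckpt = torch.load("weights/pretrain_weights/s2m2_XL.pth", map_location="cpu")
state = ckpt.get("state_dict", ckpt) if isinstance(ckpt, dict) else ckpt
n_params = sum(v.numel() for v in state.values() if torch.is_tensor(v))
print(f"tensors: {len(state)} | parameters: {n_params / 1e6:.1f}M")  # XL should be ~406M
```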
### 2. Run Basic Demo

To generate a result for a single input, run `demo/visualize_2d_simple.py`:

```bash
python ./demo/visualize_2d_simple.py --model_type XL --num_refine 3
```
| arg | default | type | help |
| :---: | :---: | :---: | :---: |
| `--model_type` | `'XL'` | str | select model type: [S, M, L, XL] |
| `--num_refine` | `3` | int | number of local iterative refinement steps |
| `--torch_compile` | `False` | store_true | apply `torch.compile` |
| `--allow_negative` | `False` | store_true | allow negative disparity |
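The 2D demo renders the predicted disparity as a color image. If you want to build your own visualization, a typical colorization step looks like the sketch below; the percentile normalization and TURBO colormap are our assumptions, not necessarily what the demo uses:

```python
# Hedged sketch: turn a disparity map into a color image for inspection.
import cv2
import numpy as np

def colorize_disparity(disp: np.ndarray, valid: np.ndarray | None = None) -> np.ndarray:
    if valid is None:
        valid = np.isfinite(disp)
    lo, hi = np.percentile(disp[valid], [2, 98])       # robust range, ignores outliers
    norm = np.clip((disp - lo) / max(hi - lo, 1e-6), 0, 1)
    color = cv2.applyColorMap((norm * 255).astype(np.uint8), cv2.COLORMAP_TURBO)
    color[~valid] = 0                                  # black out invalid pixels
    return color
```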
### 3. Run 3D Visualization Demo

To visualize the 3D output interactively, run `demo/visualize_3d_booster.py` or `demo/visualize_3d_middlebury.py`:

```bash
python ./demo/visualize_3d_booster.py --model_type L
```
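Both 3D demos boil down to the standard pinhole-model recipe: convert disparity to depth via Z = f·B/d, back-project to 3D, and render with Open3D. The sketch below is illustrative; function and parameter names are ours, not the demo's actual code:

```python
# Hedged sketch: disparity -> point cloud under the pinhole camera model.
import numpy as np
import open3d as o3d

def disparity_to_pointcloud(disp, rgb, fx, fy, cx, cy, baseline):
    h, w = disp.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = disp > 0
    z = fx * baseline / disp[valid]          # pinhole model: Z = f * B / d
    x = (u[valid] - cx) * z / fx             # back-project to camera coordinates
    y = (v[valid] - cy) * z / fy
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(np.stack([x, y, z], axis=-1))
    pcd.colors = o3d.utility.Vector3dVector(rgb[valid] / 255.0)
    return pcd

# o3d.visualization.draw_geometries([pcd])   # opens the interactive viewer
```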
For `visualize_3d_middlebury.py --model_type XL`, the result should look like the image below.

<p align="center"> <img src="fig/result_bicycle2.png" width="90%"> </p>

If you fail to reproduce this result, please let us know.

## 🚀 Model Optimization (ONNX / TensorRT)
### 1. Export to ONNX (OPSET=18)

Use `export_onnx.py` to convert the model to ONNX:

```bash
python demo/export_onnx.py --model_type $MODEL_TYPE --img_width $IMG_WIDTH --img_height $IMG_HEIGHT
```
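Before building an engine, it is worth sanity-checking the exported graph with ONNX Runtime. The file name below is hypothetical; input names and shapes are read from the session rather than hard-coded:

```python
# Hedged sketch: run the exported ONNX model once with random inputs.
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession(
    "s2m2_XL.onnx",  # hypothetical file name
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
# Build random inputs from the graph's declared shapes (dynamic dims -> 1).
feeds = {
    i.name: np.random.rand(*[d if isinstance(d, int) else 1 for d in i.shape]).astype(np.float32)
    for i in sess.get_inputs()
}
for meta, out in zip(sess.get_outputs(), sess.run(None, feeds)):
    print(meta.name, out.shape)
```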
### 2. Export to TensorRT (tested with 10.13.3)

Use `export_tensorrt.py` to build a TensorRT engine:

```bash
python demo/export_tensorrt.py --model_type $MODEL_TYPE --img_width $IMG_WIDTH --img_height $IMG_HEIGHT --precision $PRECISION
```

or simply in the terminal:

```bash
trtexec --onnx={onnx_file_path} --saveEngine=./{trt_file_path} --fp16 --precisionConstraints=obey --layerPrecisions=node_linalg_vector_norm_2:fp32
```

The `--layerPrecisions` flag forces the vector-norm node to run in FP32, which keeps that reduction numerically stable while the rest of the engine runs in FP16.

Supported TensorRT precisions: `fp32`, `tf32`, `fp16`.
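Once an engine is built, it can be run from Python with the TensorRT 10 API; the sketch below uses torch tensors as device buffers. The engine path is hypothetical, and I/O shapes and dtypes are queried from the engine rather than assumed:

```python
# Hedged sketch: deserialize a TensorRT engine and run one inference.
import tensorrt as trt
import torch

logger = trt.Logger(trt.Logger.WARNING)
runtime = trt.Runtime(logger)
with open("s2m2_XL_fp16.engine", "rb") as f:               # hypothetical path
    engine = runtime.deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

# Allocate one GPU buffer per I/O tensor and bind it by name (TRT 10 style).
buffers = {}
for i in range(engine.num_io_tensors):
    name = engine.get_tensor_name(i)
    shape = tuple(engine.get_tensor_shape(name))
    dtype = torch.float16 if engine.get_tensor_dtype(name) == trt.DataType.HALF else torch.float32
    buffers[name] = torch.empty(shape, dtype=dtype, device="cuda")
    context.set_tensor_address(name, buffers[name].data_ptr())

stream = torch.cuda.Stream()
context.execute_async_v3(stream.cuda_stream)               # enqueue inference
stream.synchronize()                                       # outputs now live in `buffers`
```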
## 📜 Citation

If you find our work useful for your research, please consider citing our paper:

```bibtex
@inproceedings{min2025s2m2,
  title={{S\textsuperscript{2}M\textsuperscript{2}}: Scalable Stereo Matching Model for Reliable Depth Estimation},
  author={Junhong Min and Youngpil Jeon and Jimin Kim and Minyong Choi},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2025}
}
```
