MP-SfM: Monocular Surface Priors for Robust Structure-from-Motion (CVPR 2025)


<p align="center"> <h1 align="center"><ins>MP-SfM</ins> 🏗️<br>Monocular Surface Priors for Robust Structure-from-Motion</h1> <p align="center"> <a href="https://www.linkedin.com/in/zador-pataki-a297b5196/">Zador&nbsp;Pataki</a> · <a href="https://psarlin.com/">Paul-Edouard&nbsp;Sarlin</a> · <a href="https://demuc.de/">Johannes&nbsp;Schönberger</a> · <a href="https://www.microsoft.com/en-us/research/people/mapoll/">Marc&nbsp;Pollefeys</a> </p> <h2 align="center"> <p>CVPR 2025</p> <a href="https://arxiv.org/pdf/2504.20040" align="center">Paper</a> | <!-- | --> <!-- <a href="missing" align="center">Demo 🤗</a> | --> <!-- <a href="missing" align="center">Colab</a> | --> <a href="https://www.youtube.com/watch?v=Kl4l5fXBUkM&ab_channel=ZadorPataki" align="center">Video</a> </h2> <!-- </p> <p align="center"> <a href=""><img src="assets/teaser.png" alt="example" width=75%></a> <br> <em> </em> </p> --> <p align="center"> <a href=""><img src="assets/rec.gif" alt="example" width=100%></a> <br> <em> MP-SfM augments Structure-from-Motion with monocular depth and normal priors for reliable 3D reconstruction despite extreme viewpoint changes with little visual overlap. </em> </p>

MP-SfM is a Structure-from-Motion pipeline that integrates monocular depth and normal predictions into classical multi-view reconstruction. This hybrid approach improves robustness in difficult scenarios such as low parallax, high symmetry, and sparse viewpoints, while maintaining strong performance in standard conditions. This repository includes code, pretrained models, and instructions for reproducing our results.

Quick Start

Setup

<!-- [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](missing) -->

We provide the Python package mpsfm. First clone the repository and install the dependencies.

git clone --recursive https://github.com/cvg/mpsfm && cd mpsfm

Build pyceres and pycolmap (from our fork) from source, then install the required packages:

pip install -r requirements.txt
python -m pip install -e .
<details> <summary><b>[Optional - click to expand]</b></summary>
  • For faster inference with the transformer-based models, install xformers

  • For faster inference with the MASt3R matcher, compile the cuda kernels for RoPE as recommended by the authors:

    DIR=$PWD
    cd third_party/mast3r/dust3r/croco/models/curope/
    python setup.py build_ext --inplace
    cd $DIR
    
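  Whether these optional extras are present can be checked at runtime without importing them. This is a small illustrative sketch, not something MP-SfM requires:

  ```python
  # Check whether optional speed-up packages are importable,
  # without actually importing them (avoids import side effects).
  import importlib.util

  def has_package(name: str) -> bool:
      """Return True if `name` can be imported in this environment."""
      return importlib.util.find_spec(name) is not None

  print("xformers available:", has_package("xformers"))
  ```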
</details> <details> <summary><b>[Docker - click to expand]</b></summary>

After cloning the repository with --recursive, you can pull the Docker image with all system dependencies preinstalled:

docker pull mpsfm/mpsfm:latest

To test the Docker image, run:

docker run --gpus all -it --rm \
  --shm-size=8g \
  -v $(pwd):/mpsfm \
  -w /mpsfm mpsfm/mpsfm:latest
  • --shm-size=8g: avoid PyTorch DataLoader crashes
  • -v $(pwd):/mpsfm: mount your local directory
  • -w /mpsfm: set working directory inside container

Inside the container, finish by installing the Python package:

pip install -e .

Finally, run the following optional steps. Note: ml-depth-pro was omitted from the requirements.txt file during the Docker build.

# optional MASt3R speed up
DIR=$PWD
cd third_party/mast3r/dust3r/croco/models/curope/
python setup.py build_ext --inplace
cd $DIR

# optional depthpro install 
cd third_party/ml-depth-pro/
pip install -e . --no-deps
cd $DIR
</details>

Execution

Our demo notebook demonstrates a minimal usage example. It shows how to run the MP-SfM pipeline, and how to visualize the reconstruction with its multiple output modalities.

<p align="center"> <a href=""><img src="assets/demo.gif" alt="example" width=70%></a> <br> <em> Visualizing MP-SfM sparse and dense reconstruction outputs in the demo. </em> </p>

Alternatively, run the reconstruction from the command line:

# Use the default config ⚙️
# --conf:            see the config dir "configs" for other curated options
# --data_dir:        hosts SfM inputs and outputs when other options aren't specified
# --intrinsics_path: path to the intrinsics file
# --images_dir:      images directory
# --cache_dir:       extraction outputs: depths, normals, matches, etc.
# --extract:         pass ["sky", "features", "matches", "depth", "normals"] to force re-extraction
python reconstruct.py \
    --conf sp-lg_m3dv2 \
    --data_dir local/example \
    --intrinsics_path local/example/intrinsics.yaml \
    --images_dir local/example/images \
    --cache_dir local/example/cache_dir \
    --extract \
    --verbose 0

# Or simply run this and let argparse take care of the default inputs
python reconstruct.py 

The script will reconstruct the scene in local/example, and output the reconstruction into local/example/sfm_outputs.

  • Extraction: Some configurations cache only a subset of prior outputs (for example, only the normals from Metric3Dv2). If you later use a prior pipeline that requires all outputs, force re-extraction with --extract.

  • Verbosity: Change the verbosity level of the pipeline using --verbose. 0 provides clean output. 1 offers minimal debugging output, including function benchmarking and a 3D visualization (3d.html) saved in your --data_dir at the end of the process. 2 saves a visualization after every 5 registered images, pauses the pipeline, and provides additional debugging outputs. 3 provides full debugging outputs.

<details> <summary><b>[Run with your own data - click to expand]</b></summary>

Check out our example data directory.

  • Images: Place all of your images in a single folder: either a folder named "images" inside your --data_dir, or any folder you point to via --images_dir

  • Camera Intrinsics: Create a single .yaml file storing all camera intrinsics. Place it in your --data_dir as intrinsics.yaml, or point to it via --intrinsics_path. Follow the structure presented in intrinsics.yaml, or see the description below:

    <details> <summary><b>[Intrinsics file example - click to expand]</b></summary>

    Single Camera:

    # .yaml setup when images have shared intrinsics
    1:
      params: [604.32447211, 604.666982, 696.5, 396.5] # fx, fy, cx, cy
      images: all
      # or specify the images belonging to this camera
      # images :
      #   - indoor_DSC03018.JPG
      #   - indoor_DSC03200.JPG
      #   - indoor_DSC03081.JPG
      #   - indoor_DSC03194.JPG
      #   - indoor_DSC03127.JPG
      #   - indoor_DSC03131.JPG
      #   - indoor_DSC03218.JPG
    

    Multiple cameras:

    # .yaml setup when images have different intrinsics
    # camera 1
    1:
      params: [fx1, fy1, cx1, cy1]
      images:
        - im11.jpg
        - im12.jpg
        ...
    # camera 2
    2:
      params: [fx2, fy2, cx2, cy2]
      images:
        - im21.jpg
        - im22.jpg
        ...
    
    </details>
</details>
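Before launching a run, it can be worth sanity-checking the intrinsics file. The sketch below assumes the structure shown above (a mapping from camera id to params and images); check_intrinsics is a hypothetical helper for illustration, not part of mpsfm:

```python
# Hypothetical sanity check for an intrinsics.yaml (not part of mpsfm).
# Assumes the documented structure: {cam_id: {params: [fx, fy, cx, cy], images: ...}}.
import yaml  # pip install pyyaml

def check_intrinsics(path):
    with open(path) as f:
        cams = yaml.safe_load(f)
    for cam_id, cam in cams.items():
        fx, fy, cx, cy = cam["params"]  # raises if not exactly 4 values
        if fx <= 0 or fy <= 0:
            raise ValueError(f"camera {cam_id}: focal lengths must be positive")
        images = cam.get("images")
        if images != "all" and not isinstance(images, list):
            raise ValueError(f"camera {cam_id}: 'images' must be 'all' or a list of filenames")
    return cams
```

Calling check_intrinsics("local/example/intrinsics.yaml") before reconstruct.py catches malformed entries early, with a clearer error than a failed reconstruction.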

Configuration

<p align="center"> <a href=""><img src="assets/pipeline.png" alt="example" width=100%></a> <br> <em> We extend COLMAP’s incremental mapping pipeline with monocular priors for which we provide easily adjustable hyperparameters via configs. </em> </p>

We expose fine-grained control over all hyperparameters via OmegaConf configurations, with sensible default values defined in MpsfmMapper.default_conf. Run the following Python snippet to display a human-readable overview of all adjustable parameters. Note: we import all default COLMAP hyperparameters but only use a subset.

from mpsfm.sfm.mapper import MpsfmMapper
from mpsfm.utils.tools import summarize_cfg

print(summarize_cfg(MpsfmMapper.default_conf))

See our configuration directory for all of our curated configuration setups. Each .yaml file overrides the default configuration; the exception is the default setup sp-lg_m3dv2, which is empty. Configuration setups can also import others via defaults: (see example). This matters because the hyperparameters in some setups (see defaults) are carefully grouped and should be overridden together.
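The layering behaves like a recursive dictionary merge, where values in the importing file win over the imported defaults. Here is a plain-Python sketch of that semantics (illustrative only; the actual merging is done by OmegaConf):

```python
# Illustrative deep merge: the later (user) config overrides the earlier
# (defaults) config, recursing into nested sections rather than replacing them.
def deep_merge(base: dict, override: dict) -> dict:
    out = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(out.get(key), dict):
            out[key] = deep_merge(out[key], value)
        else:
            out[key] = value
    return out

defaults = {"reconstruction": {"image": {"depth": {"depth_uncertainty": 0.1}}}}
user = {"reconstruction": {"image": {"depth": {"depth_uncertainty": 0.2}}}}
merged = deep_merge(defaults, user)
print(merged["reconstruction"]["image"]["depth"]["depth_uncertainty"])  # 0.2
```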

Here, we provide an example configuration file detailing all of the important configurations.

<details> <summary><b>[Click to expand]</b></summary>
# Untested config created to demonstrate how to write config files

# import default configs to make sure depth estimators are used with correct uncertainties
defaults: 
  - defaults/depthpro # in this example we use depthpro

reconstruction:
  image:
    depth:
      depth_uncertainty: 0.2 # we can override the default uncertainty in defaults/depthpro.yaml (not recommended)
    normals: 
      flip_consistency: true # use flip consistency check for normals (see defaults in mpsfm/sfm/scene/image/normals.py)

extractors:
  # use dsine normals instead of metric3dv2 (default set in mpsfm/extraction/base.py)
  # use "-fc" variant because we need flipped estimates for the "flip_consistency" check
  normals: DSINE-kappa-fc 
  matcher: roma_outdoor #change matcher
# for dense matchers we can use any c
</details>