pySLAM

pySLAM is a hybrid Python/C++ Visual SLAM pipeline supporting monocular, stereo, and RGB-D cameras. It provides a broad set of modern local and global feature extractors, multiple loop-closure strategies, a volumetric reconstruction module, integrated depth-prediction models, and semantic segmentation capabilities for enhanced scene understanding.

<p align="center"><img src="./images/pyslam-logo.png" height="160"></p>

pySLAM v2.10.4

Author: Luigi Freda

pySLAM is a hybrid Python/C++ implementation of a Visual SLAM (Simultaneous Localization And Mapping) pipeline that supports monocular, stereo, and RGB-D cameras. It provides the following features in a single Python environment:

  • A wide range of classical and modern local features with a convenient interface for their integration.
  • Multiple loop closing methods, including descriptor aggregators such as visual Bag of Words (BoW, iBoW) and Vector of Locally Aggregated Descriptors (VLAD), and modern global descriptors (image-wise descriptors such as SAD, NetVLAD, HDC-Delf, CosPlace, EigenPlaces, MegaLoc).
  • A volumetric reconstruction pipeline that integrates depth and color images into a volume to produce dense reconstructions. It supports several voxel grid models (with semantic support), TSDF with voxel hashing, and incremental Gaussian Splatting.
  • Integration of depth prediction models within the SLAM pipeline. These include DepthPro, DepthAnythingV2, DepthAnythingV3, RAFT-Stereo, CREStereo, etc.
  • A suite of segmentation models for semantic understanding of the scene, such as DeepLabv3, Segformer, CLIP, DETIC, EOV-SEG, ODISE, RFDETR, YOLO, etc.
  • Additional tools for VO (Visual Odometry) and SLAM, with built-in support for both g2o and GTSAM, along with custom Python bindings for features not available in the original libraries.
  • A modular sparse-SLAM core, implemented in both Python and C++ (with custom pybind11 bindings), allowing users to switch between high-performance/speed and high-flexibility modes. The Python and C++ implementations are interoperable: maps saved by one can be loaded by the other. Further details here.
  • A modular pipeline for end-to-end inference of 3D scenes from multiple images. It supports models such as DUSt3R, MASt3R, MV-DUSt3R, VGGT, Robust VGGT, DepthFromAnythingV3, and Fast3R. Further details here.
  • Built-in support for over 10 dataset types.
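To make the descriptor-aggregation idea in the loop closing bullet more concrete, here is a minimal, pure-Python sketch of VLAD. It is not pySLAM's implementation or API: the function name `vlad`, the toy `centroids`, and the toy `descriptors` are all illustrative assumptions, and a real pipeline would learn the centroids with k-means over high-dimensional local descriptors.

```python
import math

def vlad(descriptors, centroids):
    """Aggregate local descriptors into one L2-normalized VLAD vector.

    Hypothetical helper for illustration only (not part of pySLAM).
    """
    k, dim = len(centroids), len(centroids[0])
    residual_sums = [[0.0] * dim for _ in range(k)]
    for d in descriptors:
        # Hard-assign the descriptor to its nearest centroid
        # (squared Euclidean distance).
        nearest = min(
            range(k),
            key=lambda i: sum((x - c) ** 2 for x, c in zip(d, centroids[i])),
        )
        # Accumulate the residual (descriptor minus centroid) for that cluster.
        for j in range(dim):
            residual_sums[nearest][j] += d[j] - centroids[nearest][j]
    # Concatenate the per-cluster residual sums and L2-normalize the result.
    flat = [v for row in residual_sums for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat

# Toy example: two 2-D centroids (assumed given) and three local descriptors.
centroids = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 0.0], [0.0, 1.0], [9.0, 10.0]]
signature = vlad(descriptors, centroids)
print(len(signature))  # k * dim entries
```

The resulting vector has a fixed length of `k * dim` regardless of how many local features an image contains, so two images can be compared for loop-closure candidates with a single dot product or distance between their signatures.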

pySLAM serves as a flexible baseline framework for experimenting with VO/SLAM techniques, local features, descriptor aggregators, global descriptors, volumetric integration, depth prediction, and semantic mapping. It lets you explore, prototype, and develop VO/SLAM pipelines in both Python and C++. pySLAM is a research framework and a work in progress.

Enjoy it!

<p align="center"> <img src="./images/pyslam.gif" alt="pySLAM - Stereo mapping example" height="320"> </p>
<p align="center"> <img src="./images/depth-prediction.png" alt="pySLAM - Depth prediction" height="160"> <img src="./images/dense-reconstruction-with-depth-prediction.png" alt="pySLAM - Depth prediction and 3D Reconstruction" height="160"> </p>
<p align="center"> <img src="./images/semantic_mapping.png" alt="pySLAM - Semantic Mapping" height="160"> </p>
<p align="center"> <img src="./images/dense-reconstruction-composition.gif" alt="pySLAM - Dense reconstruction - Gaussian Splatting" height="320"> </p>

See the demo video for release v2.10.0

<p align="center"> <a href="https://www.youtube.com/watch?v=jzwKByzyqzg" target="_blank" rel="noopener noreferrer"> <img src="https://img.youtube.com/vi/jzwKByzyqzg/0.jpg" alt="▶ Video: pySLAM demo v2.10.0" height="300"/> </a> </p>

Overview

├── cpp         # Pybind11 C++ bindings to SLAM utilities
│   ├── hamming     # SIMD-optimized Hamming distance calculator for uint8 binary descriptors with zero-copy Python bindings.
│   ├── glutils     # OpenGL utilities for drawing points, cameras, etc.
│   ├── solvers     # PnP and Sim3 solvers for camera pose estimation
│   ├── volumetric  # Volumetric mapping with parallel block-based voxel hashing, templates, carving, and semantics support.
│   └── trajectory  # Trajectory alignment helpers
├── data       # Sample input/output data
├── docs       # Documentation files
├── pyslam     # Core Python package
│   ├── dense
│   ├── depth_estimation
│   ├── evaluation
│   ├── io
│   ├── local_features
│   ├── loop_closing
│   ├── scene_from_views # Unified 3D scene reconstruction from multiple views
│   ├── semantics
│       ├── cpp  # C++ core for semant