DADApy
Distance-based Analysis of DAta-manifolds in python
DADApy is a Python package for the characterization of manifolds in high-dimensional spaces.
Homepage
For more details and tutorials, visit the homepage at: https://dadapy.readthedocs.io/
Quick Example
import numpy as np
from dadapy.data import Data
# Generate a simple 3D gaussian dataset
X = np.random.normal(0, 1, (1000, 3))
# initialize the "Data" class with the set of coordinates
data = Data(X)
# compute distances up to the 100th nearest neighbor
data.compute_distances(maxk=100)
# compute the intrinsic dimension using the 2NN estimator
# (named id_2nn to avoid shadowing the Python builtin `id`)
id_2nn, id_error, id_distance = data.compute_id_2NN()
# compute the intrinsic dimension up to the 64th nearest neighbor using Gride
id_list, id_error_list, id_distance_list = data.return_id_scaling_gride(range_max=64)
# compute the density using PAk, a point adaptive kNN estimator
log_den, log_den_error = data.compute_density_PAk()
# find the peaks of the density profile through the ADP algorithm
cluster_assignment = data.compute_clustering_ADP()
# compute the neighborhood overlap with another dataset
X2 = np.random.normal(0, 1, (1000, 5))
overlap_x2 = data.return_data_overlap(X2)
# compute the neighborhood overlap with a set of labels
labels = np.repeat(np.arange(10), 100)
overlap_labels = data.return_label_overlap(labels)
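To give an intuition for what the 2NN estimate in the example measures, here is a minimal NumPy-only sketch of the Two-NN idea: the ratio between each point's second and first nearest-neighbor distances is used to fit the intrinsic dimension. The function name `two_nn_id` is hypothetical, and the maximum-likelihood formula below is a simplified illustration; it is not necessarily how `compute_id_2NN` is implemented in DADApy.

```python
import numpy as np


def two_nn_id(X):
    """Simplified Two-NN intrinsic dimension estimate (illustrative sketch).

    Assumes no duplicate points, so the nearest-neighbor distance is nonzero.
    """
    # Brute-force pairwise Euclidean distances (fine for a few thousand points)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)  # exclude each point's distance to itself

    # Distances to the first and second nearest neighbor of every point
    two_smallest = np.sort(np.partition(dist, 1, axis=1)[:, :2], axis=1)
    r1, r2 = two_smallest[:, 0], two_smallest[:, 1]

    # mu = r2 / r1 follows a Pareto law with exponent equal to the
    # intrinsic dimension d; the maximum-likelihood estimate is N / sum(log mu)
    mu = r2 / r1
    return len(mu) / np.sum(np.log(mu))


rng = np.random.default_rng(0)
X = rng.normal(0, 1, (1000, 3))
print("estimated intrinsic dimension:", two_nn_id(X))
```

For the 3D Gaussian used above, the estimate should land close to 3, with fluctuations that shrink as the number of points grows.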
Currently implemented algorithms
- Intrinsic dimension estimators
  - Two-NN estimator: Facco et al., Scientific Reports (2017)
  - Gride estimator: Denti et al., Scientific Reports (2022)
  - I3D estimator (for both continuous and discrete spaces): Macocco et al., Physical Review Letters (2023)
  - BID estimator: Acevedo et al., Nature Communications Physics (2025)
- Density estimators
  - kNN estimator
  - k*NN estimator (kNN with an adaptive choice of k)
  - PAk estimator: Rodriguez et al., JCTC (2018)
  - point-adaptive mean-shift gradient estimator: Carli et al., arXiv (2024)
  - BMTI estimator: Carli et al., arXiv (2024)
- Density peaks clustering methods
  - Density peaks clustering: Rodriguez and Laio, Science (2014)
  - Advanced density peaks clustering: d’Errico et al., Information Sciences (2021)
  - k-peak clustering: Sormani, Rodriguez and Laio, JCTC (2020)
- Manifold comparison tools
  - Neighbourhood overlap: Doimo et al., NeurIPS (2020)
  - Information imbalance: Glielmo et al., PNAS Nexus (2022)
- Feature selection and weighting tool
  - Differentiable Information Imbalance: Wild et al., Nature Communications (2025)
- Causal analysis tools
  - Imbalance Gain: Del Tatto et al., PNAS (2024)
  - Community causal graph: Allione et al., arXiv (2025)
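As a rough illustration of one of the manifold comparison tools listed above, the neighbourhood overlap between two representations of the same points can be sketched as the average fraction of k nearest neighbors they share. The helper names `knn_indices` and `neighbourhood_overlap` and the default `k=30` are assumptions made for this sketch; it uses only NumPy and is not DADApy's own implementation.

```python
import numpy as np


def knn_indices(X, k):
    """Indices of the k nearest neighbors of every row of X (brute force)."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    return np.argsort(dist, axis=1)[:, :k]


def neighbourhood_overlap(XA, XB, k=30):
    """Average fraction of shared k nearest neighbors between XA and XB.

    XA and XB must describe the same points in the same order,
    possibly with a different number of features.
    """
    nn_a = knn_indices(XA, k)
    nn_b = knn_indices(XB, k)
    shared = [len(np.intersect1d(a, b)) for a, b in zip(nn_a, nn_b)]
    return np.mean(shared) / k


rng = np.random.default_rng(0)
X = rng.normal(0, 1, (500, 3))
X_unrelated = rng.normal(0, 1, (500, 5))

print("overlap with a rescaled copy:", neighbourhood_overlap(X, 2 * X))
print("overlap with unrelated data:", neighbourhood_overlap(X, X_unrelated))
```

A rescaled copy keeps every neighborhood intact, so its overlap is maximal, while two unrelated datasets share neighbors only by chance.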
Installation
The package is compatible with Python versions 3.8, 3.9, 3.10, 3.11, and 3.12.
The methods of the classes DiffImbalance and CausalGraph are only compatible with Python>=3.9.
We currently only support Unix-based systems, including Linux and macOS.
For Windows machines, we suggest using the Windows Subsystem for Linux (WSL).
The package requires numpy, scipy, scikit-learn, jax, and jaxlib, plus matplotlib for the visualizations.
The package contains Cython-generated C extensions that are automatically compiled during installation.
The latest release is available through pip:
pip install dadapy
To install the latest development version, install it directly from GitHub with pip:
pip install git+https://github.com/sissa-data-science/DADApy
Alternatively, if you'd like to modify the implementation of some functions locally, you can clone the repository and install the package with:
git clone https://github.com/sissa-data-science/DADApy.git
cd DADApy
python setup.py build_ext --inplace
pip install .
The methods of the classes DiffImbalance and CausalGraph can be run on GPU, using a suitable installation of JAX on a GPU platform. The code has been tested using JAX v0.4.30 with CUDA 12, which can be installed with:
pip install --upgrade "jax[cuda12_pip]==0.4.30" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
For more information on installing the JAX library on GPUs, see the official JAX repository.
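After installing JAX, one way to check whether a GPU is actually visible is to list the devices JAX reports. The helper name `available_jax_devices` is hypothetical; the sketch relies only on `jax.devices()` and returns an empty list when JAX is not installed.

```python
def available_jax_devices():
    """Return the devices JAX can see, e.g. ['cpu:0'] or ['gpu:0']."""
    try:
        import jax
    except ImportError:
        # JAX is not installed in this environment
        return []
    # Each device exposes its platform name and an integer id
    return [f"{device.platform}:{device.id}" for device in jax.devices()]


print(available_jax_devices())
```

If only `cpu` entries appear on a machine with a GPU, the installed `jaxlib` build most likely lacks CUDA support.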
Citing DADApy
A description of the package is available in the Patterns article referenced below.
Please consider citing it if you found this package useful for your research:
@article{dadapy,
title = {DADApy: Distance-based analysis of data-manifolds in Python},
journal = {Patterns},
pages = {100589},
year = {2022},
issn = {2666-3899},
doi = {https://doi.org/10.1016/j.patter.2022.100589},
url = {https://www.sciencedirect.com/science/article/pii/S2666389922002070},
author = {Aldo Glielmo and Iuri Macocco and Diego Doimo and Matteo Carli and Claudio Zeni and Romina Wild and Maria d’Errico and Alex Rodriguez and Alessandro Laio},
}
