Distancia
=========

The Distancia package is a comprehensive Python library designed to compute a wide variety of distance metrics between vectors, sets, matrices, or sequences. It includes implementations of several well-known distance metrics, each providing a unique measure of dissimilarity or similarity between data points.
.. meta::
   :description: Distancia is a comprehensive Python package that provides a wide range of distance metrics and similarity measures, making it easy to calculate and compare the proximity between various types of data. This documentation provides an in-depth guide to the package, including installation instructions, usage examples, and detailed descriptions of each available metric.
   :keywords: data science, machine learning, deep learning, neural network, graph, text classification, distance, Cython, Markov chain, file similarity, image classification, NLP, loss functions, distancia
   :keywords lang=en: machine learning, image processing, optimization, text similarity, NLP, search engine, document ranking
Welcome to Distancia's documentation!
=====================================
Distancia is a comprehensive Python package that provides a wide range of distance metrics and similarity measures, making it easy to calculate and compare the proximity between various types of data. This documentation provides an in-depth guide to the package, including installation instructions, usage examples, and detailed descriptions of each available metric.
The documentation is divided into the following sections:
.. note::

   The code examples in this documentation are written for Python 3.x. The Python code in this package has been optimized with static typing via Cython.
Getting Started
---------------
Distancia is designed to be simple and intuitive, yet powerful and flexible. Whether you are working with numerical data, strings, or other types of data, Distancia provides the tools you need to measure the distance or similarity between objects.
For a quick introduction, check out the quickstart_ guide. If you want to dive straight into the code, head over to the Euclidean_ page.
.. _quickstart: https://distancia.readthedocs.io/en/latest/quickstart.html
.. _Euclidean: https://distancia.readthedocs.io/en/latest/Euclidean.html
.. note::

   If you find any issues or have suggestions for improvements, feel free to contribute!
Installation
------------
You can install the distancia package with pip:
.. code-block:: bash

   pip install distancia
By default, this will install the core functionality of the package, suitable for users who only need basic distance metrics.
Optional Dependencies
~~~~~~~~~~~~~~~~~~~~~

The Distancia package also supports optional modules that enable additional features. You can install these extras depending on your needs:
**With pandas support:** install additional support for working with tabular data:

.. code-block:: bash

   pip install distancia[pandas]
**With all supported extras:** install all optional dependencies for maximum functionality:

.. code-block:: bash

   pip install distancia[all]
This modular installation allows you to keep your setup lightweight or include everything for full capabilities.
Quickstart
----------
Here are some common examples of how to use Distancia:
.. code-block:: python

   from distancia import Euclidean

   point1 = [1, 2, 3]
   point2 = [4, 5, 6]

   # Create an instance of Euclidean
   euclidean = Euclidean()

   # Calculate the Euclidean distance
   distance = euclidean.compute(point1, point2)
   print(f"Euclidean Distance: {distance:.3f}")
.. code-block:: text

   Euclidean Distance: 5.196
.. code-block:: python

   from distancia import Levenshtein

   string1 = "kitten"
   string2 = "sitting"

   distance = Levenshtein().compute(string1, string2)
   print(f"Levenshtein Distance: {distance}")
.. code-block:: text

   Levenshtein Distance: 3
For a complete list and detailed explanations of each metric, see the next section.
Available measurement types
---------------------------
.. _Vector Distance Measures: https://distancia.readthedocs.io/en/latest/vectorDistance.html
.. _Matrix Distance Measures: https://distancia.readthedocs.io/en/latest/matrixDistance.html
.. _Text Distance Measures: https://distancia.readthedocs.io/en/latest/textDistance.html
.. _Time Series Distance Measures: https://distancia.readthedocs.io/en/latest/timeDistance.html
.. _Loss Function-Based Distance Measures: https://distancia.readthedocs.io/en/latest/lossFunction.html
.. _Graph Distance Measures: https://distancia.readthedocs.io/en/latest/graphDistance.html
.. _Markov Chain Distance Measures: https://distancia.readthedocs.io/en/latest/markovChainDistance.html
.. _Image Distance Measures: https://distancia.readthedocs.io/en/latest/imageDistance.html
.. _Audio Distance Measures: https://distancia.readthedocs.io/en/latest/soundDistance.html
.. _File Distance Measures: https://distancia.readthedocs.io/en/latest/fileDistance.html
`Vector Distance Measures`_
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Distance measures between vectors are essential in machine learning, classification, and information retrieval. Here are five of the most commonly used:
`Euclidean Distance`_
   The Euclidean distance is the square root of the sum of the squared differences between the coordinates of two vectors. It is ideal for measuring similarity in geometric spaces.
.. _Euclidean Distance: https://distancia.readthedocs.io/en/latest/Euclidean.html
`Manhattan Distance`_
   Also known as the L1 distance, it is defined as the sum of the absolute differences between the coordinates of the vectors. It is well suited to discrete spaces and grid-based environments.
.. _Manhattan Distance: https://distancia.readthedocs.io/en/latest/Manhattan.html
`Cosine Distance`_
   It measures the angle between two vectors rather than their absolute distance. Commonly used in natural language processing and information retrieval (e.g., search engines).
.. _Cosine Distance: https://distancia.readthedocs.io/en/latest/Cosine.html
`Jaccard Distance`_
   Based on the ratio of the intersection to the union of sets, it is effective for comparing sets of words, tags, or recommended items.
.. _Jaccard Distance: https://distancia.readthedocs.io/en/latest/Jaccard.html
`Hamming Distance`_
   It counts the number of differing positions between two character or binary sequences. It is widely used in error detection and bioinformatics.
.. _Hamming Distance: https://distancia.readthedocs.io/en/latest/Hamming.html
.. note::

   These distance measures are widely used in various algorithms, including clustering, supervised classification, and search engines.
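The five measures above can be sketched directly from their definitions in plain Python. This is a minimal illustration of the underlying formulas, independent of the Distancia API:

```python
import math

def euclidean(a, b):
    # Square root of the sum of squared coordinate differences
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    # Sum of absolute coordinate differences (L1)
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_distance(a, b):
    # 1 minus the cosine of the angle between the two vectors
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

def jaccard_distance(a, b):
    # 1 minus |intersection| / |union| of the two sets
    a, b = set(a), set(b)
    return 1.0 - len(a & b) / len(a | b)

def hamming(a, b):
    # Number of positions at which equal-length sequences differ
    return sum(x != y for x, y in zip(a, b))

print(euclidean([1, 2, 3], [4, 5, 6]))  # ~5.196, matching the quickstart
print(hamming("karolin", "kathrin"))    # 3
```

The Distancia classes wrap these same formulas behind a common `compute` interface, as shown in the quickstart.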
`Matrix Distance Measures`_
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Distance measures between matrices are widely used in machine learning, image processing, and numerical analysis. Below are five of the most commonly used:
`Frobenius Norm`_
   The Frobenius norm is the square root of the sum of the squared elements of the difference between two matrices. It generalizes the Euclidean distance to matrices and is commonly used in optimization problems.
.. _Frobenius Norm: https://distancia.readthedocs.io/en/latest/Frobenius.html
`Spectral Norm`_
   Defined as the largest singular value of the difference between two matrices, the spectral norm is useful for analyzing stability in numerical methods.
.. _Spectral Norm: https://distancia.readthedocs.io/en/latest/SpectralNormDistance.html
`Trace Norm (Nuclear Norm)`_
   This norm is the sum of the singular values of the difference between matrices. It is often used in low-rank approximation and compressed sensing.
.. _Trace Norm (Nuclear Norm): https://distancia.readthedocs.io/en/latest/NuclearNorm.html
`Mahalanobis Distance`_
   A statistical distance measure that accounts for correlations between features, making it effective in multivariate anomaly detection and classification.
.. _Mahalanobis Distance: https://distancia.readthedocs.io/en/latest/Mahalanobis.html
`Wasserstein Distance (Earth Mover’s Distance)`_
   This metric quantifies the optimal transport cost between two probability distributions, making it highly relevant in image processing and deep learning.
.. _Wasserstein Distance (Earth Mover’s Distance): https://distancia.readthedocs.io/en/latest/Wasserstein.html
.. note::

   These distance measures are widely applied in fields such as computer vision, data clustering, and signal processing.
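As an illustration of the first measure above, the Frobenius norm of a matrix difference reduces to a few lines of plain Python. This is a sketch from the definition, not the Distancia implementation:

```python
import math

def frobenius_distance(A, B):
    # Square root of the sum of squared element-wise differences
    # between two matrices of identical shape (lists of rows)
    return math.sqrt(sum((a - b) ** 2
                         for row_a, row_b in zip(A, B)
                         for a, b in zip(row_a, row_b)))

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[1.0, 0.0], [0.0, 4.0]]
print(frobenius_distance(A, B))  # sqrt(2^2 + 3^2) = sqrt(13) ~ 3.606
```

Flattening both matrices and taking the Euclidean distance between the resulting vectors gives the same value, which is exactly why the Frobenius norm is described as the matrix generalization of Euclidean distance.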
`Text Distance Measures`_
~~~~~~~~~~~~~~~~~~~~~~~~~
Distance measures between texts are crucial in natural language processing (NLP), search engines, and text similarity tasks. Below are five of the most commonly used:
`Levenshtein Distance (Edit Distance)`_
   The minimum number of single-character edits (insertions, deletions, or substitutions) required to transform one string into another. Used in spell checkers and DNA sequence analysis.
.. _Levenshtein Distance (Edit Distance): https://distancia.readthedocs.io/en/latest/Levenshtein.html
`Jaccard Similarity`_
   Measures the overlap between two sets of words or character n-grams, computed as the ratio of their intersection to their union. Useful in document comparison and keyword matching.
.. _Jaccard Similarity: https://distancia.readthedocs.io/en/latest/Jaccard.html
`Cosine Similarity`_
   Computes the cosine of the angle between two text vectors, often based on TF-IDF or word embeddings. Commonly used in search engines and document ranking.
.. _Cosine Similarity: https://distancia.readthedocs.io/en/latest/Cosine.html
`Damerau-Levenshtein Distance`_
   An extension of the Levenshtein distance that also considers transpositions (swapping adjacent characters). More robust for typographical error detection.
.. _Damerau-Levenshtein Distance: https://distancia.readthedocs.io/en/latest/DamerauLevenshtein.html
`BLEU Score (Bilingual Evaluation Understudy)`_
   Measures the similarity between a candidate text and reference texts using n-gram precision. Widely used in machine translation and text summarization.
.. _BLEU Score (Bilingual Evaluation Understudy): https://distancia.readthedocs.io/en/latest/BLEU.html
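The edit-distance definition above translates directly into a short dynamic-programming routine. The following is a plain-Python sketch of the classic algorithm, independent of the Cython-optimized Distancia implementation:

```python
def levenshtein(s, t):
    # prev[j] holds the edit distance between the current prefix of s
    # and t[:j]; each pass extends the prefix of s by one character.
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        curr = [i]
        for j, ct in enumerate(t, 1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution / match
        prev = curr
    return prev[-1]

print(levenshtein("kitten", "sitting"))  # 3, matching the quickstart output
```

Damerau-Levenshtein extends this recurrence with one extra case that charges a single edit for swapping two adjacent characters.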
