Torchmetrics

Machine learning metrics for distributed, scalable PyTorch applications.

<div align="center"> <img src="docs/source/_static/images/logo.png" width="400px">


<p align="center"> <a href="#what-is-torchmetrics">What is Torchmetrics</a> • <a href="#implementing-your-own-module-metric">Implementing a metric</a> • <a href="#build-in-metrics">Built-in metrics</a> • <a href="https://lightning.ai/docs/torchmetrics/stable/">Docs</a> • <a href="#community">Community</a> • <a href="#license">License</a> </p>


</div>

Installation

Simple installation from PyPI

pip install torchmetrics
<details> <summary>Other installations</summary>

Install using conda

conda install -c conda-forge torchmetrics

Install using uv

uv add torchmetrics

Pip from source

# with git
pip install git+https://github.com/Lightning-AI/torchmetrics.git@release/stable

Pip from archive

pip install https://github.com/Lightning-AI/torchmetrics/archive/refs/heads/release/stable.zip

Extra dependencies for specialized metrics:

pip install torchmetrics[audio]
pip install torchmetrics[image]
pip install torchmetrics[text]
pip install torchmetrics[all]  # install all of the above

Install latest developer version

pip install https://github.com/Lightning-AI/torchmetrics/archive/master.zip
</details>

What is TorchMetrics

TorchMetrics is a collection of 100+ PyTorch metric implementations and an easy-to-use API for creating custom metrics. It offers:

  • A standardized interface to increase reproducibility
  • Reduced boilerplate
  • Automatic accumulation over batches
  • Metrics optimized for distributed training
  • Automatic synchronization between multiple devices

You can use TorchMetrics with any PyTorch model or with PyTorch Lightning to enjoy additional features such as:

  • Module metrics are automatically placed on the correct device.
  • Native support for logging metrics in Lightning to reduce even more boilerplate.

Using TorchMetrics

Module metrics

Module-based metrics hold internal metric states (similar to the parameters of a PyTorch module) that automate accumulation and synchronization across devices. They provide:

  • Automatic accumulation over multiple batches
  • Automatic synchronization between multiple devices
  • Metric arithmetic

Module metrics run on CPU, a single GPU, or multiple GPUs.

For the single GPU/CPU case:

import torch

# import our library
import torchmetrics

# initialize metric
metric = torchmetrics.classification.Accuracy(task="multiclass", num_classes=5)

# move the metric to the device where you want computations to take place
device = "cuda" if torch.cuda.is_available() else "cpu"
metric.to(device)

n_batches = 10
for i in range(n_batches):
    # simulate a classification problem
    preds = torch.randn(10, 5).softmax(dim=-1).to(device)
    target = torch.randint(5, (10,)).to(device)

    # metric on current batch
    acc = metric(preds, target)
    print(f"Accuracy on batch {i}: {acc}")

# metric on all batches using custom accumulation
acc = metric.compute()
print(f"Accuracy on all data: {acc}")

Module metric usage remains the same when using multiple GPUs or multiple nodes.

<details> <summary>Example using DDP</summary> <!--phmdoctest-mark.skip-->
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from torch import nn
from torch.nn.parallel import DistributedDataParallel as DDP
import torchmetrics


def metric_ddp(rank, world_size):
    os.environ["MASTER_ADDR"] = "localhost"
    os.environ["MASTER_PORT"] = "12355"

    # create default process group
    dist.init_process_group("gloo", rank=rank, world_size=world_size)

    # initialize metric
    metric = torchmetrics.classification.Accuracy(task="multiclass", num_classes=5)

    # define a model and append your metric to it
    # this allows metric states to be placed on correct accelerators when
    # .to(device) is called on the model
    model = nn.Linear(10, 10)
    model.metric = metric
    model = model.to(rank)

    # initialize DDP
    model = DDP(model, device_ids=[rank])

    n_epochs = 5
    # this shows iteration over multiple training epochs
    for n in range(n_epochs):
        # this will be replaced by a DataLoader with a DistributedSampler
        n_batches = 10
        for i in range(n_batches):
            # simulate a classification problem
            # inputs should live on the same device as the metric states
            preds = torch.randn(10, 5).softmax(dim=-1).to(rank)
            target = torch.randint(5, (10,)).to(rank)

            # metric on current batch
            acc = metric(preds, target)
            if rank == 0:  # print only for rank 0
                print(f"Accuracy on batch {i}: {acc}")

        # metric on all batches and all accelerators using custom accumulation
        # accuracy is same across both accelerators
        acc = metric.compute()
        print(f"Accuracy on all data: {acc}, accelerator rank: {rank}")

        # reset the internal state so that the metric is ready for new data
        metric.reset()

    # cleanup
    dist.destroy_process_group()


if __name__ == "__main__":
    world_size = 2  # number of gpus to parallelize over
    mp.spawn(metric_ddp, args=(world_size,), nprocs=world_size, join=True)
</details>

Implementing your own Module metric

Implementing your own metric is as easy as subclassing a torch.nn.Module. Subclass torchmetrics.Metric and implement the update and compute methods:

import torch
from torchmetrics import Metric


class MyAccuracy(Metric):
    def __init__(self):
        # remember to call super
        super().__init__()
        # call `self.add_state` for every internal state needed for the metric's computations
        # dist_reduce_fx indicates the function that should be used to reduce
        # state from multiple processes
        self.add_state("correct", default=torch.tensor(0), dist_reduce_fx="sum")
        self.add_state("total", default=torch.tensor(0), dist_reduce_fx="sum")

    def update(self, preds: torch.Tensor, target: torch.Tensor) -> None:
        # extract predicted class index for computing accuracy
        preds = preds.argmax(dim=-1)
        assert preds.shape == target.shape
        # update metric states
        self.correct += torch.sum(preds == target)
        self.total += target.numel()

    def compute(self) -> torch.Tensor:
        # compute final result
        return self.correct.float() / self.total.float()
