Torchmetrics
Machine learning metrics for distributed, scalable PyTorch applications.
<p align="center"> <a href="#what-is-torchmetrics">What is Torchmetrics</a> • <a href="#implementing-your-own-module-metric">Implementing a metric</a> • <a href="#build-in-metrics">Built-in metrics</a> • <a href="https://lightning.ai/docs/torchmetrics/stable/">Docs</a> • <a href="#community">Community</a> • <a href="#license">License</a> </p>
</div>
Looking for GPUs?
Over 340,000 developers use Lightning Cloud - purpose-built for PyTorch and PyTorch Lightning.
- GPUs from $0.19.
- Clusters: frontier-grade training/inference clusters.
- AI Studio (vibe train): workspaces where AI helps you debug, tune and vibe train.
- AI Studio (vibe deploy): workspaces where AI helps you optimize and deploy models.
- Notebooks: Persistent GPU workspaces where AI helps you code and analyze.
- Inference: Deploy models as inference APIs.
Installation
Simple installation from PyPI
pip install torchmetrics
<details>
<summary>Other installations</summary>
Install using conda
conda install -c conda-forge torchmetrics
Install using uv
uv add torchmetrics
Pip from source
# with git
pip install git+https://github.com/Lightning-AI/torchmetrics.git@release/stable
Pip from archive
pip install https://github.com/Lightning-AI/torchmetrics/archive/refs/heads/release/stable.zip
Extra dependencies for specialized metrics:
pip install torchmetrics[audio]
pip install torchmetrics[image]
pip install torchmetrics[text]
pip install torchmetrics[all] # install all of the above
Install latest developer version
pip install https://github.com/Lightning-AI/torchmetrics/archive/master.zip
</details>
What is TorchMetrics
TorchMetrics is a collection of 100+ PyTorch metrics implementations and an easy-to-use API to create custom metrics. It offers:
- A standardized interface to increase reproducibility
- Reduced boilerplate
- Automatic accumulation over batches
- Metrics optimized for distributed training
- Automatic synchronization between multiple devices
You can use TorchMetrics with any PyTorch model or with PyTorch Lightning to enjoy additional features such as:
- Module metrics are automatically placed on the correct device.
- Native support for logging metrics in Lightning to reduce even more boilerplate.
Using TorchMetrics
Module metrics
Module-based metrics hold internal metric states (similar to the parameters of a PyTorch module) that automate accumulation and synchronization across devices. They provide:
- Automatic accumulation over multiple batches
- Automatic synchronization between multiple devices
- Metric arithmetic
This can be run on a CPU, a single GPU, or multiple GPUs.
For the single GPU/CPU case:
import torch
# import our library
import torchmetrics
# initialize metric
metric = torchmetrics.classification.Accuracy(task="multiclass", num_classes=5)
# move the metric to the device where you want computations to take place
device = "cuda" if torch.cuda.is_available() else "cpu"
metric.to(device)
n_batches = 10
for i in range(n_batches):
    # simulate a classification problem
    preds = torch.randn(10, 5).softmax(dim=-1).to(device)
    target = torch.randint(5, (10,)).to(device)

    # metric on current batch
    acc = metric(preds, target)
    print(f"Accuracy on batch {i}: {acc}")
# metric on all batches using custom accumulation
acc = metric.compute()
print(f"Accuracy on all data: {acc}")
Module metric usage remains the same when using multiple GPUs or multiple nodes.
<details>
<summary>Example using DDP</summary>

<!--phmdoctest-mark.skip-->
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from torch import nn
from torch.nn.parallel import DistributedDataParallel as DDP
import torchmetrics


def metric_ddp(rank, world_size):
    os.environ["MASTER_ADDR"] = "localhost"
    os.environ["MASTER_PORT"] = "12355"

    # create default process group
    dist.init_process_group("gloo", rank=rank, world_size=world_size)

    # initialize metric
    metric = torchmetrics.classification.Accuracy(task="multiclass", num_classes=5)

    # define a model and append your metric to it
    # this allows metric states to be placed on correct accelerators when
    # .to(device) is called on the model
    model = nn.Linear(10, 10)
    model.metric = metric
    model = model.to(rank)

    # initialize DDP
    model = DDP(model, device_ids=[rank])

    n_epochs = 5
    # this shows iteration over multiple training epochs
    for n in range(n_epochs):
        # this will be replaced by a DataLoader with a DistributedSampler
        n_batches = 10
        for i in range(n_batches):
            # simulate a classification problem
            preds = torch.randn(10, 5).softmax(dim=-1)
            target = torch.randint(5, (10,))

            # metric on current batch
            acc = metric(preds, target)
            if rank == 0:  # print only for rank 0
                print(f"Accuracy on batch {i}: {acc}")

        # metric on all batches and all accelerators using custom accumulation
        # accuracy is the same across both accelerators
        acc = metric.compute()
        print(f"Accuracy on all data: {acc}, accelerator rank: {rank}")

        # reset the internal state so that the metric is ready for new data
        metric.reset()

    # cleanup
    dist.destroy_process_group()


if __name__ == "__main__":
    world_size = 2  # number of gpus to parallelize over
    mp.spawn(metric_ddp, args=(world_size,), nprocs=world_size, join=True)
</details>
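Conceptually, synchronization combines each process's metric state before the final value is computed, which is why every rank reports the same accuracy. The sketch below imitates a sum-style reduction in plain Python. It does not use `torch.distributed` and is not the library's implementation; `AccuracyState` and `sum_reduce` are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class AccuracyState:  # hypothetical stand-in for a metric's internal state
    correct: int = 0
    total: int = 0

    def update(self, preds, target):
        # count matching labels and the number of samples seen
        self.correct += sum(p == t for p, t in zip(preds, target))
        self.total += len(target)


def sum_reduce(states):
    """Mimic a sum reduction: add each state field across all ranks."""
    return AccuracyState(
        correct=sum(s.correct for s in states),
        total=sum(s.total for s in states),
    )


# each simulated rank sees a different shard of the data
rank0, rank1 = AccuracyState(), AccuracyState()
rank0.update(preds=[0, 1, 2, 2], target=[0, 1, 1, 2])  # 3 of 4 correct
rank1.update(preds=[1, 1], target=[0, 1])  # 1 of 2 correct

# after the reduction, every rank would hold the same combined state
synced = sum_reduce([rank0, rank1])
global_acc = synced.correct / synced.total  # 4 / 6

print(f"global accuracy: {global_acc:.4f}")
```

Reducing the raw counts rather than the per-rank accuracies keeps the result correct when ranks process different numbers of samples.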
Implementing your own Module metric
Implementing your own metric is as easy as subclassing a torch.nn.Module. Simply subclass torchmetrics.Metric and implement the update and compute methods:
import torch
from torchmetrics import Metric
class MyAccuracy(Metric):
    def __init__(self):
        # remember to call super
        super().__init__()
        # call `self.add_state` for every internal state that is needed for the metrics computations
        # dist_reduce_fx indicates the function that should be used to reduce
        # state from multiple processes
        self.add_state("correct", default=torch.tensor(0), dist_reduce_fx="sum")
        self.add_state("total", default=torch.tensor(0), dist_reduce_fx="sum")

    def update(self, preds: torch.Tensor, target: torch.Tensor) -> None:
        # extract predicted class index for computing accuracy
        preds = preds.argmax(dim=-1)
        assert preds.shape == target.shape
        # update metric states
        self.correct += torch.sum(preds == target)
        self.total += target.numel()

    def compute(self) -> torch.Tensor:
        # compute final result
        return self.correct.float() / se
