# DINOv2
PyTorch code and models for the DINOv2 self-supervised learning method.
:new: [2025-12-18] Added support for loading XRay-DINO backbone following Advancing human-centric AI for robust X-ray analysis through holistic self-supervised learning, more details are here
:new: [2025-12-16] Added Channel-Adaptive DINO code following Scaling Channel-Adaptive Self-Supervised Learning, more details are here
:new: [2025-12-16] Added Cell-DINO code following Cell-DINO: Self-Supervised Image-based Embeddings for Cell Fluorescent Microscopy (to appear in PLOS Computational Biology), more details are here
[2025-08-14] Please check out the more recent DINOv3 effort continuing this line of work.
[2025-06-11] Added dino.txt inference code, following DINOv2 Meets Text: A Unified Framework for Image- and Pixel-Level Vision-Language Alignment.
[2023-10-26] Added DINOv2 backbones with registers, following Vision Transformers Need Registers.
## DINOv2: Learning Robust Visual Features without Supervision

Maxime Oquab, Timothée Darcet, Théo Moutakanni, Huy V. Vo, Marc Szafraniec, Vasil Khalidov, Patrick Labatut, Armand Joulin, Piotr Bojanowski

[Paper #1] [Paper #2] [Blog] [Demo] [BibTeX]
PyTorch implementation and pretrained models for DINOv2. For details, see the papers: DINOv2: Learning Robust Visual Features without Supervision and Vision Transformers Need Registers.
DINOv2 models produce high-performance visual features that can be directly employed with classifiers as simple as linear layers on a variety of computer vision tasks; these visual features are robust and perform well across domains without any fine-tuning. The models were pretrained on a dataset of 142 M images without using any labels or annotations.
https://github.com/facebookresearch/dinov2/assets/60359573/f168823e-7922-415a-b429-578badf5c356
<div align="center"> Visualization of the first three principal components of the patch features of all frames, mapped to RGB values. </div>

## Pretrained models
<table style="margin: auto"> <thead> <tr> <th>model</th> <th># of<br />params</th> <th>with<br />registers</th> <th>ImageNet<br />k-NN</th> <th>ImageNet<br />linear</th> <th>download</th> </tr> </thead> <tbody> <tr> <td>ViT-S/14 distilled</td> <td align="right">21 M</td> <td align="center">:x:</td> <td align="right">79.0%</td> <td align="right">81.1%</td> <td><a href="https://dl.fbaipublicfiles.com/dinov2/dinov2_vits14/dinov2_vits14_pretrain.pth">backbone only</a></td> </tr> <tr> <td>ViT-S/14 distilled</td> <td align="right">21 M</td> <td align="center">:white_check_mark:</td> <td align="right">79.1%</td> <td align="right">80.9%</td> <td><a href="https://dl.fbaipublicfiles.com/dinov2/dinov2_vits14/dinov2_vits14_reg4_pretrain.pth">backbone only</a></td> </tr> <tr> <td>ViT-B/14 distilled</td> <td align="right">86 M</td> <td align="center">:x:</td> <td align="right">82.1%</td> <td align="right">84.5%</td> <td><a href="https://dl.fbaipublicfiles.com/dinov2/dinov2_vitb14/dinov2_vitb14_pretrain.pth">backbone only</a></td> </tr> <tr> <td>ViT-B/14 distilled</td> <td align="right">86 M</td> <td align="center">:white_check_mark:</td> <td align="right">82.0%</td> <td align="right">84.6%</td> <td><a href="https://dl.fbaipublicfiles.com/dinov2/dinov2_vitb14/dinov2_vitb14_reg4_pretrain.pth">backbone only</a></td> </tr> <tr> <td>ViT-L/14 distilled</td> <td align="right">300 M</td> <td align="center">:x:</td> <td align="right">83.5%</td> <td align="right">86.3%</td> <td><a href="https://dl.fbaipublicfiles.com/dinov2/dinov2_vitl14/dinov2_vitl14_pretrain.pth">backbone only</a></td> </tr> <tr> <td>ViT-L/14 distilled</td> <td align="right">300 M</td> <td align="center">:white_check_mark:</td> <td align="right">83.8%</td> <td align="right">86.7%</td> <td><a href="https://dl.fbaipublicfiles.com/dinov2/dinov2_vitl14/dinov2_vitl14_reg4_pretrain.pth">backbone only</a></td> </tr> <tr> <td>ViT-g/14</td> <td align="right">1,100 M</td> <td align="center">:x:</td> <td align="right">83.5%</td> 
<td align="right">86.5%</td> <td><a href="https://dl.fbaipublicfiles.com/dinov2/dinov2_vitg14/dinov2_vitg14_pretrain.pth">backbone only</a></td> </tr> <tr> <td>ViT-g/14</td> <td align="right">1,100 M</td> <td align="center">:white_check_mark:</td> <td align="right">83.7%</td> <td align="right">87.1%</td> <td><a href="https://dl.fbaipublicfiles.com/dinov2/dinov2_vitg14/dinov2_vitg14_reg4_pretrain.pth">backbone only</a></td> </tr> </tbody> </table>

## Pretrained backbones (via PyTorch Hub)
Please follow the instructions here to install PyTorch (the only required dependency for loading the model). Installing PyTorch with CUDA support is strongly recommended.
A corresponding model card is included in the repository.
```python
import torch

# DINOv2
dinov2_vits14 = torch.hub.load('facebookresearch/dinov2', 'dinov2_vits14')
dinov2_vitb14 = torch.hub.load('facebookresearch/dinov2', 'dinov2_vitb14')
dinov2_vitl14 = torch.hub.load('facebookresearch/dinov2', 'dinov2_vitl14')
dinov2_vitg14 = torch.hub.load('facebookresearch/dinov2', 'dinov2_vitg14')

# DINOv2 with registers
dinov2_vits14_reg = torch.hub.load('facebookresearch/dinov2', 'dinov2_vits14_reg')
dinov2_vitb14_reg = torch.hub.load('facebookresearch/dinov2', 'dinov2_vitb14_reg')
dinov2_vitl14_reg = torch.hub.load('facebookresearch/dinov2', 'dinov2_vitl14_reg')
dinov2_vitg14_reg = torch.hub.load('facebookresearch/dinov2', 'dinov2_vitg14_reg')
```
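As a quick sanity check, a loaded backbone can be run on a dummy input. A minimal sketch, assuming the ViT-S/14 backbone (embedding dimension 384) and an input whose side lengths are multiples of the patch size 14:

```python
import torch

# Load the small backbone via PyTorch Hub (downloads weights on first call).
model = torch.hub.load('facebookresearch/dinov2', 'dinov2_vits14')
model.eval()

# Stand-in for a normalized RGB image; 224 = 16 patches of size 14 per side.
img = torch.randn(1, 3, 224, 224)

with torch.inference_mode():
    cls_feats = model(img)  # global (CLS-token) features, shape (1, 384)
    patch_tokens = model.forward_features(img)["x_norm_patchtokens"]
    # 16 x 16 = 256 patch tokens, shape (1, 256, 384)
```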
## Pretrained backbone: XRay-DINO
You can request access to the model here:
https://ai.meta.com/resources/models-and-libraries/raydino-downloads/
After filling out the form, you will receive an email with a temporary link. You can either download the checkpoint with wget and point to its path on your local filesystem, or pass the URL from the email directly in the following code:
```python
import torch

REPO_DIR = <PATH/TO/A/LOCAL/DIRECTORY/WHERE/THE/DINOV2/REPO/WAS/CLONED>
xray_dino_vitl16 = torch.hub.load(REPO_DIR, 'xray_dino_vitl16', source='local', weights=<CHECKPOINT/URL/OR/PATH>)
```
### License

Model weights are released under the FAIR Noncommercial Research License. See LICENSE_XRAY_DINO_MODEL for additional details.
## Pretrained heads - Image classification
<table style="margin: auto"> <thead> <tr> <th rowspan="2">backbone</th> <th rowspan="2">with<br />registers</th> <th>download</th> </tr> <tr> <th>ImageNet</th> </tr> </thead> <tbody> <tr> <td>ViT-S/14 distilled</td> <td align="center">:x:</td> <td> linear head (<a href="https://dl.fbaipublicfiles.com/dinov2/dinov2_vits14/dinov2_vits14_linear_head.pth">1 layer</a>, <a href="https://dl.fbaipublicfiles.com/dinov2/dinov2_vits14/dinov2_vits14_linear4_head.pth">4 layers</a>) </td> </tr> <tr> <td>ViT-S/14 distilled</td> <td align="center">:white_check_mark:</td> <td> linear head (<a href="https://dl.fbaipublicfiles.com/dinov2/dinov2_vits14/dinov2_vits14_reg4_linear_head.pth">1 layer</a>, <a href="https://dl.fbaipublicfiles.com/dinov2/dinov2_vits14/dinov2_vits14_reg4_linear4_head.pth">4 layers</a>) </td> </tr> <tr> <td>ViT-B/14 distilled</td> <td align="center">:x:</td> <td> linear head (<a href="https://dl.fbaipublicfiles.com/dinov2/dinov2_vitb14/dinov2_vitb14_linear_head.pth">1 layer</a>, <a href="https://dl.fbaipublicfiles.com/dinov2/dinov2_vitb14/dinov2_vitb14_linear4_head.pth">4 layers</a>) </td> </tr> <tr> <td>ViT-B/14 distilled</td> <td align="center">:white_check_mark:</td> <td> linear head (<a href="https://dl.fbaipublicfiles.com/dinov2/dinov2_vitb14/dinov2_vitb14_reg4_linear_head.pth">1 layer</a>, <a href="https://dl.fbaipublicfiles.com/dinov2/dinov2_vitb14/dinov2_vitb14_reg4_linear4_head.pth">4 layers</a>) </td> </tr> <tr> <td>ViT-L/14 distilled</td> <td align="center">:x:</td> <td> linear head (<a href="https://dl.fbaipublicfiles.com/dinov2/dinov2_vitl14/dinov2_vitl14_linear_head.pth">1 layer</a>, <a href="https://dl.fbaipublicfiles.com/dinov2/dinov2_vitl14/dinov2_vitl14_linear4_head.pth">4 layers</a>) </td> </tr> <tr> <td>ViT-L/14 distilled</td> <td align="center">:white_check_mark:</td> <td> linear head (<a href="https://dl.fbaipublicfiles.com/dinov2/dinov2_vitl14/dinov2_vitl14_reg4_linear_head.pth">1 layer</a>, <a href="https://dl.fbaipublicfiles.com/dinov2/dinov2_vitl14/dinov2_vitl14_reg4_linear4_head.pth">4 layers</a>) </td> </tr> <tr> <td>ViT-g/14</td> <td align="center">:x:</td> <td> linear head (<a href="https://dl.fbaipublicfiles.com/dinov2/dinov2_vitg14/dinov2_vitg14_linear_head.pth">1 layer</a>, <a href="https://dl.fbaipublicfiles.com/dinov2/dinov2_vitg14/dinov2_vitg14_linear4_head.pth">4 layers</a>) </td> </tr> <tr> <td>ViT-g/14</td> <td align="center">:white_check_mark:</td> <td> linear head (<a href="https://dl.fbaipublicfiles.com/dinov2/dinov2_vitg14/dinov2_vitg14_reg4_linear_head.pth">1 layer</a>, <a href="https://dl.fbaipublicfiles.com/dinov2/dinov2_vitg14/dinov2_vitg14_reg4_linear4_head.pth">4 layers</a>) </td> </tr> </tbody> </table>
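Rather than loading a head checkpoint manually, the repository's PyTorch Hub interface may also expose combined backbone-plus-classifier entrypoints. A hedged sketch, assuming a `dinov2_vits14_lc` entrypoint that bundles the backbone with its ImageNet linear head (check the repo's `hubconf.py` for the exact names available):

```python
import torch

# Backbone + pretrained ImageNet linear classifier in one hub call
# (assumed *_lc entrypoint name; verify against hubconf.py).
classifier = torch.hub.load('facebookresearch/dinov2', 'dinov2_vits14_lc')
classifier.eval()

img = torch.randn(1, 3, 224, 224)  # stand-in for a normalized RGB image
with torch.inference_mode():
    logits = classifier(img)       # ImageNet logits over 1000 classes
    pred = logits.argmax(dim=-1)   # predicted class index
```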
