Annif
Annif is a multi-algorithm automated subject indexing tool for libraries, archives and museums.
Annif is an automated subject indexing toolkit. It was originally created as a statistical automated indexing tool that used metadata from the Finna.fi discovery interface as a training corpus.
Annif provides CLI commands for administration, and a REST API and web UI for end-users.
Finto AI is a service based on Annif; see a 🤗 Hugging Face Hub collection of the models that Finto AI uses.
This repository contains a rewritten production version of Annif based on the prototype.
Basic install
Annif is developed and tested on Linux. To run Annif on Windows or macOS, the recommended way is to use Docker (see below) or a Linux virtual machine.
You will need Python 3.10-3.13 to install Annif.
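Before creating the virtual environment, it can help to confirm that the default interpreter falls within this range. The following is a minimal sketch, not part of Annif itself: the helper name `python_supported` is hypothetical, and it assumes the interpreter is available on the PATH as `python3`.

```shell
# Succeeds if the given "major.minor" version is within the range
# stated above (3.10-3.13). Hypothetical helper, not an Annif command.
python_supported() {
    major=${1%%.*}   # text before the first dot
    minor=${1#*.}    # text after the first dot
    [ "$major" -eq 3 ] && [ "$minor" -ge 10 ] && [ "$minor" -le 13 ]
}

if ! command -v python3 >/dev/null 2>&1; then
    echo "python3 was not found on PATH"
else
    # Ask the interpreter itself for its major.minor version.
    version=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')
    if python_supported "$version"; then
        echo "Python $version is within the supported range"
    else
        echo "Python $version is outside the supported range 3.10-3.13"
    fi
fi
```

If the check reports an unsupported version, a different interpreter can be used to create the virtual environment instead of the default `python3`.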
The recommended way is to install Annif from PyPI into a virtual environment.
python3 -m venv annif-venv
source annif-venv/bin/activate
pip install annif
Start up the application:
annif
See Getting Started for basic usage instructions, and Optional features and dependencies for instructions on installing optional components such as the fastText and Omikuji backends and the Voikko and spaCy analyzers.
Shell completions
Annif supports tab completion in the bash, zsh and fish shells for commands, options, and project ID, vocabulary ID and path parameters. Completion is not enabled automatically when Annif is installed; for instructions on enabling it, run
annif completion --help
or see this wiki page.
Docker install
You can use Annif as a pre-built Docker container image from the quay.io/natlibfi/annif repository. Please see the wiki documentation for details.
Demo install in Codespaces
Annif can be tried out in GitHub Codespaces. Open the page for configuring a new codespace via the badge below and click the green "Create codespace" button; a terminal session will then start in your browser. The environment has Annif installed and the contents of the Annif-tutorial repository available.
Development install
A development version of Annif can be installed by cloning the GitHub repository. uv is used to manage the dependencies and the virtual environment of the development version.
See CONTRIBUTING.md for information on unit tests, code style, development flow and other details that are useful when participating in Annif development.
Installation and setup
Clone the repository.
Switch into the repository directory.
Install pipx and uv if you don't have them. First pipx:
python3 -m pip install --user pipx
python3 -m pipx ensurepath
Open a new shell, and then install uv:
pipx install uv
uv can also be installed without pipx; see the uv documentation.
Create a virtual environment and install dependencies:
uv sync
Development dependencies are included by default. Use the --extra option to install dependencies for selected optional features (--extra extra1 --extra extra2 for multiple extras), or --all-extras to install all of them. By default, the virtual environment is created in the .venv directory under the project directory.
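As a rough illustration of how repeated --extra flags combine, the sketch below assembles the uv sync command line for a list of extras and prints it instead of running it. The helper name `sync_command` is hypothetical, and the extra names `fasttext` and `omikuji` are assumptions based on the optional backends mentioned earlier; check pyproject.toml for the names that are actually defined.

```shell
# Print (without running) the "uv sync" command for the given extras.
# Hypothetical helper for illustration only.
sync_command() {
    cmd="uv sync"
    for extra in "$@"; do
        # Each extra gets its own --extra flag.
        cmd="$cmd --extra $extra"
    done
    echo "$cmd"
}

sync_command                      # prints: uv sync
sync_command fasttext omikuji     # prints: uv sync --extra fasttext --extra omikuji
```

Once the printed command looks right, it can be run directly in the repository directory, or replaced with uv sync --all-extras to install every optional feature.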
You can run Annif in one of two ways:
1. One-off using uv run
uv run annif
2. Activating the virtual environment
Enter the virtual environment:
source .venv/bin/activate
Start up the application:
annif
Getting help
Many resources are available:
- Usage documentation in the wiki
- Annif tutorial for learning to use Annif
- annif-users discussion forum; please use this as a channel for questions instead of personal e-mails to developers
- Internal API documentation on ReadTheDocs
- annif.org project web site
Publications / How to cite
See below for some articles about Annif in peer-reviewed Open Access journals. The software itself is also archived on Zenodo and has a citable DOI.
Citing the software itself
See "Cite this repository" in the details of the repository.
Annif articles
<ul>
<li>
Suominen, O.; Inkinen, J.; Lehtinen, M. 2025.
Annif at the GermEval-2025 LLMs4Subjects Task: Traditional XMTC Augmented by Efficient LLMs, pre-print.
https://arxiv.org/abs/2508.15877
<details>
<summary>See BibTex</summary>
@misc{suominen2025annifgermeval2025,
title={Annif at the GermEval-2025 LLMs4Subjects Task: Traditional XMTC Augmented by Efficient LLMs},
author={Osma Suominen and Juho Inkinen and Mona Lehtinen},
year={2025},
eprint={2508.15877},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2508.15877},
}
</details>
</li>
<li>
Suominen, O.; Inkinen, J.; Lehtinen, M. 2025.
Annif at SemEval-2025 Task 5: Traditional XMTC augmented by LLMs.
In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), pp. 2424–2431, Vienna, Austria. Association for Computational Linguistics.
https://aclanthology.org/2025.semeval-1.315/
https://arxiv.org/abs/2504.19675
<details>
<summary>See BibTex</summary>
@inproceedings{suominen2025annifsemeval2025task5,
title = "Annif at {S}em{E}val-2025 Task 5: Traditional {XMTC} augmented by {LLM}s",
author = "Suominen, Osma and Inkinen, Juho and Lehtinen, Mona",
editor = "Rosenthal, Sara and Ros{\'a}, Aiala and Ghosh, Debanjan and Zampieri, Marcos",
booktitle = "Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)",
month = jul,
year = "2025",
address = "Vienna, Austria",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.semeval-1.315/",
pages = "2424--2431",
ISBN = "979-8-89176-273-2",
# ArXiv
# year={2025},
# eprint={2504.19675},
# archivePrefix={arXiv},
# primaryClass={cs.CL},
# url={https://arxiv.org/abs/2504.19675},
}
</details>
</li>
<li>
Inkinen, J.; Lehtinen, M.; Suominen, O., 2025.
Annif Users Survey: Understanding Usage and Challenges.
URL:
https://urn.fi/URN:ISBN:978-952-84-1301-1
<details>
<summary>See BibTex</summary>
@misc{inkinen2025,
title={Annif Users Survey: Understanding Usage and Challenges},
author={Inkinen, Juho and Lehtinen, Mona and Suominen, Osma},
series={The National Library of Finland. Reports and Studies},
issn={2242-8119},
isbn={978-952-84-1301-1},
year={2025},
url={https://urn.fi/URN:ISBN:978-952-84-1301-1},
}
</details>
</li>
<li>
Golub, K.; Suominen, O.; Mohammed, A.; Aagaard, H.; Osterman, O., 2024.
Automated Dewey Decimal Classification of Swedish library metadata using Annif software.
Journal of Documentation, 80(5), pp. 1057-1079. URL:
https://doi.org/10.1108/JD-01-2022-0026
<details>
<summary>See BibTex</summary>
@article{golub2024annif,
title={Automated Dewey Decimal Classification of Swedish library metadata using Annif software},
author={Golub, Koraljka and Suominen, Osma and Mohammed, Ahmed Taiye and Aagaard,

