FARM
🏡 Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.
.. image:: https://github.com/deepset-ai/FARM/blob/master/docs/img/farm_logo_text_right_wide.png?raw=true
    :width: 659
    :height: 109
    :align: left
    :alt: FARM LOGO

(**F**\ ramework for **A**\ dapting **R**\ epresentation **M**\ odels)

.. image:: https://img.shields.io/badge/docs-latest-success.svg
    :target: https://farm.deepset.ai/
    :alt: Docs

.. image:: https://dev.azure.com/deepset/FARM/_apis/build/status/deepset-ai.FARM?branchName=master
    :target: https://dev.azure.com/deepset/FARM/_build
    :alt: Build

.. image:: https://img.shields.io/github/release/deepset-ai/farm
    :target: https://github.com/deepset-ai/FARM/releases
    :alt: Release

.. image:: https://img.shields.io/github/license/deepset-ai/farm
    :target: https://github.com/deepset-ai/FARM/blob/master/LICENSE
    :alt: License

.. image:: https://img.shields.io/github/last-commit/deepset-ai/farm
    :target: https://github.com/deepset-ai/FARM/commits/master
    :alt: Last Commit

.. image:: https://img.shields.io/badge/code%20style-black-000000.svg?style=flat-square
    :target: https://github.com/ambv/black
    :alt: Code Style: black

.. image:: https://pepy.tech/badge/farm
    :target: https://pepy.tech/project/farm
    :alt: Downloads

.. image:: https://img.shields.io/badge/Jobs-We're%20hiring-blue
    :target: https://apply.workable.com/deepset/
    :alt: Jobs

.. image:: https://img.shields.io/twitter/follow/deepset_ai?style=social
    :target: https://twitter.com/intent/follow?screen_name=deepset_ai
    :alt: Twitter

..
IMPORTANT: We migrated the core modeling parts of FARM into `Haystack <https://github.com/deepset-ai/haystack/>`_. All active development is happening there and this repo is not actively maintained anymore! Go over there to ask questions & create issues!
..
What is it?
############
FARM makes Transfer Learning with BERT & Co simple, fast and enterprise-ready.
It's built upon `transformers <https://github.com/huggingface/pytorch-transformers>`_ and provides additional features to simplify the life of developers:
Parallelized preprocessing, highly modular design, multi-task learning, experiment tracking, easy debugging and close integration with AWS SageMaker.
With FARM you can build fast proof-of-concepts for tasks like text classification, NER or question answering and transfer them easily into production.

- `What is it? <https://github.com/deepset-ai/FARM#what-is-it>`_
- `Core Features <https://github.com/deepset-ai/FARM#core-features>`_
- `Resources <https://github.com/deepset-ai/FARM#resources>`_
- `Installation <https://github.com/deepset-ai/FARM#installation>`_
- `Basic Usage <https://github.com/deepset-ai/FARM#basic-usage>`_
- `Advanced Usage <https://github.com/deepset-ai/FARM#advanced-usage>`_
- `Core Concepts <https://github.com/deepset-ai/FARM#core-concepts>`_
- `FAQ <https://github.com/deepset-ai/FARM#faq>`_
- `Upcoming features <https://github.com/deepset-ai/FARM#upcoming-features>`_


Core features
##############
- Easy fine-tuning of language models to your task and domain language
- Speed: AMP optimizers (~35% faster) and parallel preprocessing (16 CPU cores => ~16x faster)
- Modular design of language models and prediction heads
- Switch between heads or combine them for multitask learning
- Full Compatibility with HuggingFace Transformers' models and model hub
- Smooth upgrading to newer language models
- Integration of custom datasets via Processor class
- Powerful experiment tracking & execution
- Checkpointing & Caching to resume training and reduce costs with spot instances
- Simple deployment and visualization to showcase your model
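
The parallel preprocessing mentioned in the list above can be pictured with a small sketch. This is not FARM's implementation: ``split_into_chunks``, ``featurize_chunk`` and ``preprocess`` are hypothetical names, and the "featurization" is a toy stand-in for real tokenization. It only illustrates the general idea of cutting the raw texts into chunks and handing each chunk to a separate worker process.

```python
from multiprocessing import Pool
from typing import Dict, List


def split_into_chunks(items: List[str], chunk_size: int) -> List[List[str]]:
    """Cut the input into consecutive chunks of at most `chunk_size` items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be >= 1")
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def featurize_chunk(chunk: List[str]) -> List[Dict[str, object]]:
    """Toy stand-in (hypothetical) for tokenization and feature extraction."""
    return [{"tokens": text.lower().split(), "n_chars": len(text)} for text in chunk]


def preprocess(texts: List[str], num_workers: int = 1, chunk_size: int = 2) -> List[Dict[str, object]]:
    """Featurize all texts, optionally spreading the chunks over worker processes."""
    chunks = split_into_chunks(texts, chunk_size)
    if num_workers <= 1:
        # Serial fallback: same result, no process overhead.
        results = [featurize_chunk(chunk) for chunk in chunks]
    else:
        # Pool.map keeps the chunk order, so the output order matches the input.
        with Pool(processes=num_workers) as pool:
            results = pool.map(featurize_chunk, chunks)
    # Flatten the per-chunk lists back into one list of samples.
    return [features for chunk_result in results for features in chunk_result]
```

On platforms that start workers with ``spawn``, calls with ``num_workers > 1`` should sit behind an ``if __name__ == "__main__":`` guard. How close the speed-up gets to the number of cores depends on how expensive the per-sample work is relative to the cost of moving data between processes.
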

+------------------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
| Task                         | BERT              | RoBERTa*          | XLNet             | ALBERT            | DistilBERT        | XLMRoBERTa        | ELECTRA           | MiniLM            |
+==============================+===================+===================+===================+===================+===================+===================+===================+===================+
| Text classification          | x                 | x                 | x                 | x                 | x                 | x                 | x                 | x                 |
+------------------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
| NER                          | x                 | x                 | x                 | x                 | x                 | x                 | x                 | x                 |
+------------------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
| Question Answering           | x                 | x                 | x                 | x                 | x                 | x                 | x                 | x                 |
+------------------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
| Language Model Fine-tuning   | x                 |                   |                   |                   |                   |                   |                   |                   |
+------------------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
| Text Regression              | x                 | x                 | x                 | x                 | x                 | x                 | x                 | x                 |
+------------------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
| Multilabel Text classif.     | x                 | x                 | x                 | x                 | x                 | x                 | x                 | x                 |
+------------------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
| Extracting embeddings        | x                 | x                 | x                 | x                 | x                 | x                 | x                 | x                 |
+------------------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
| LM from scratch              | x                 |                   |                   |                   |                   |                   |                   |                   |
+------------------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
| Text Pair Classification     | x                 | x                 | x                 | x                 | x                 | x                 | x                 | x                 |
+------------------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
| Passage Ranking              | x                 | x                 | x                 | x                 | x                 | x                 | x                 | x                 |
+------------------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
| Document retrieval (DPR)     | x                 | x                 |                   | x                 | x                 | x                 | x                 | x                 |
+------------------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+

\* including CamemBERT and UmBERTo
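
The task/model matrix above follows from the modular split between a language model and the prediction heads placed on top of it. The sketch below illustrates that idea in plain Python. It is purely conceptual: ``ToyEncoder``, ``ToyClassificationHead``, ``ToyRegressionHead`` and ``ToyMultiTaskModel`` are hypothetical names and not FARM classes, and the "encoder" is a hand-written feature function rather than a neural network.

```python
from typing import Dict, List, Sequence


class ToyEncoder:
    """Stand-in for a pretrained language model: text -> fixed-size vector."""

    def encode(self, text: str) -> List[float]:
        words = text.split()
        n_words = len(words)
        avg_len = sum(len(w) for w in words) / n_words if n_words else 0.0
        # The leading 1.0 acts as a bias feature for the heads.
        return [1.0, float(n_words), avg_len]


class ToyClassificationHead:
    """Returns the label whose weight vector gives the highest score."""

    def __init__(self, weights: Dict[str, Sequence[float]]):
        self.weights = weights

    def predict(self, vector: Sequence[float]) -> str:
        scores = {
            label: sum(w * v for w, v in zip(label_weights, vector))
            for label, label_weights in self.weights.items()
        }
        return max(scores, key=scores.get)


class ToyRegressionHead:
    """Returns a single number as a weighted sum of the encoder output."""

    def __init__(self, weights: Sequence[float], bias: float = 0.0):
        self.weights = weights
        self.bias = bias

    def predict(self, vector: Sequence[float]) -> float:
        return sum(w * v for w, v in zip(self.weights, vector)) + self.bias


class ToyMultiTaskModel:
    """One shared encoder, any number of named heads."""

    def __init__(self, encoder: ToyEncoder, heads: Dict[str, object]):
        self.encoder = encoder
        self.heads = heads

    def predict(self, text: str) -> Dict[str, object]:
        vector = self.encoder.encode(text)  # encode once, reuse for every head
        return {name: head.predict(vector) for name, head in self.heads.items()}
```

Because the heads only see the encoder output, swapping the encoder for a newer one or adding a second head does not require touching the other parts, which is the property the feature list describes as switching or combining heads for multitask learning.
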

Resources
##########

**Docs**

`Online documentation <https://farm.deepset.ai>`_

**Tutorials**

- Tutorial 1 (Overview of building blocks): `Jupyter notebook 1 <https://github.com/deepset-ai/FARM/blob/master/tutorials/1_farm_building_blocks.ipynb>`_ or `Colab 1 <https://colab.research.google.com/drive/130_7dgVC3VdLBPhiEkGULHmqSlflhmVM>`_
- Tutorial 2 (How to use custom datasets): `Jupyter notebook 2 <https://github.com/deepset-ai/FARM/blob/master/tutorials/2_Build_a_processor_for_your_own_dataset.ipynb>`_ or `Colab 2 <https://colab.research.google.com/drive/1Ce_wWu-fsy_g16jaGioe8M5mAFdLN1Yx>`_
- Tutorial 3 (How to train and showcase your own QA model): `Colab 3 <https://colab.research.google.com/drive/1tqOJyMw3L5I3eXHLO846eq1fA7O9U2s8>`_
- Example scripts for each task: `FARM/examples/ <https://github.com/deepset-ai/FARM/tree/master/examples>`_

**More**

- `Intro to Transfer Learning (Blog) <https://medium.com/voice-tech-podcast/https-medium-com-deepset-ai-transfer-learning-entering-a-new-era-in-nlp-db523d9e667b>`_
- `Intro to Transfer Learning & FARM (Video) <https://www.youtube.com/watch?v=hoDgtvE-u9E&feature=youtu.be>`_
- `Question Answering Systems Explained (Blog) <https://medium.com/deepset-ai/modern-question-answering-systems-explained-4d0913744097>`_
- `GermanBERT (Blog) <https://deepset.ai/german-bert>`_
- `XLM-Roberta: The alternative for non-english NLP (Blog) <https://towardsdatascience.com/xlm-roberta-the-multilingual-alternative-for
