TEXTOIR
TEXTOIR is the first open-source toolkit for text open intent recognition. (ACL 2021)
TEXT Open Intent Recognition (TEXTOIR)
TEXTOIR is the first high-quality Text Open Intent Recognition platform. This repo contains a convenient toolkit with extensible interfaces that integrates a series of state-of-the-art algorithms for two tasks (open intent detection and open intent discovery). We also release the pipeline framework and the visualized platform in the repo TEXTOIR-DEMO.
Introduction
TEXTOIR aims to provide a convenient toolkit for researchers to reproduce related text open classification and clustering methods. It covers two tasks: open intent detection and open intent discovery. Open intent detection aims to identify the n known intent classes and to detect a single open intent class that covers everything else. Open intent discovery aims to leverage limited prior knowledge of known intents to group utterances into fine-grained clusters of both known and open intents. Related papers and code are collected in our previously released reading list.
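The detection task described above can be illustrated with a confidence threshold, in the spirit of the MSP (maximum softmax probability) baseline. The following is a minimal Python sketch, not TEXTOIR's implementation: the function names, the `<open>` label, the example intents, and the 0.5 threshold are all assumptions made for illustration.

```python
import math
from typing import List, Sequence

OPEN_INTENT = "<open>"  # hypothetical label for the single open class


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one example's logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def detect_intent(
    logits: Sequence[float],
    known_intents: Sequence[str],
    threshold: float = 0.5,  # assumed value; tune it on validation data
) -> str:
    """Return the most likely known intent, or OPEN_INTENT if confidence is low."""
    if len(logits) != len(known_intents):
        raise ValueError("expected one logit per known intent")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return known_intents[best] if probs[best] >= threshold else OPEN_INTENT


if __name__ == "__main__":
    intents = ["check_balance", "transfer_money", "card_lost"]  # made-up intents
    print(detect_intent([4.0, 0.5, 0.2], intents))  # one clearly dominant class
    print(detect_intent([1.0, 0.9, 1.1], intents))  # no class stands out
```

An utterance whose best known-class probability falls below the threshold is routed to the open class, so the classifier effectively has n + 1 outputs while being trained on n classes only.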
Open Intent Recognition:

Updates 🔥 🔥 🔥
| Date | Announcements |
|- |- |
| 12/2023 | 🎆 🎆 New paper and SOTA in Open Intent Discovery. Refer to the directory USNID for the codes. Read the paper -- A Clustering Framework for Unsupervised and Semi-supervised New Intent Discovery (Published in IEEE TKDE 2023). |
| 04/2023 | 🎆 🎆 New paper and SOTA in Open Intent Detection. Refer to the directory DA-ADB for the codes. Read the paper -- Learning Discriminative Representations and Decision Boundaries for Open Intent Detection (Published in IEEE/ACM TASLP 2023). |
| 09/2021 | 🎆 🎆 The first integrated and visualized platform for text Open Intent Recognition TEXTOIR has been released. Refer to the directory TEXTOIR-DEMO for the demo codes. Read our paper TEXTOIR: An Integrated and Visualized Platform for Text Open Intent Recognition (Published in ACL 2021). |
| 05/2021 | New paper and baselines DeepAligned in Open Intent Discovery have been released. Read our paper Discovering New Intents with Deep Aligned Clustering (Published in AAAI 2021). |
| 05/2021 | New paper and baselines ADB in Open Intent Detection have been released. Read our paper Deep Open Intent Classification with Adaptive Decision Boundary (Published in AAAI 2021). |
| 05/2020 | New paper and baselines CDAC+ in Open Intent Discovery have been released. Read our paper Discovering New Intents via Constrained Deep Adaptive Clustering with Cluster Refinement (Published in AAAI 2020). |
| 07/2019 | New paper and baselines DeepUNK in Open Intent Detection have been released. Read our paper Deep Unknown Intent Detection with Margin Loss (Published in ACL 2019). |
We strongly recommend using our TEXTOIR toolkit, which has standard and unified interfaces (especially for the data setting), to obtain fair and persuasive results on benchmark intent datasets!
Benchmark Datasets
| Datasets | Source |
| :---: | :---: |
| BANKING | Paper |
| OOS / CLINC150 | Paper |
| StackOverflow | Paper |
Integrated Models
Open Intent Detection
| Model Name | Source | Published |
| :---: | :---: | :---: |
| OpenMax* | Paper Code | CVPR 2016 |
| MSP | Paper Code | ICLR 2017 |
| DOC | Paper Code | EMNLP 2017 |
| DeepUnk | Paper Code | ACL 2019 |
| SEG | Paper Code | ACL 2020 |
| ADB | Paper Code | AAAI 2021 |
| (K+1)-way | Paper Code | ACL 2021 |
| MDF | Paper Code | ACL 2021 |
| ARPL* | Paper Code | IEEE TPAMI 2022 |
| KNNCL | Paper Code | ACL 2022 |
| DA-ADB | Paper Code | IEEE/ACM TASLP 2023 |
New Intent Discovery
| Setting | Model Name | Source | Published |
| :---: | :---: | :---: | :---: |
| Unsupervised | KM | Paper | BSMSP 1967 |
| Unsupervised | AG | Paper | PR 1978 |
| Unsupervised | SAE-KM | Paper | JMLR 2010 |
| Unsupervised | DEC | Paper Code | ICML 2016 |
| Unsupervised | DCN | Paper Code | ICML 2017 |
| Unsupervised | CC | Paper Code | AAAI 2021 |
| Unsupervised | SCCL | Paper Code | NAACL 2021 |
| Unsupervised | USNID | Paper Code | IEEE TKDE 2023 |
| Semi-supervised | KCL* | Paper Code | ICLR 2018 |
| Semi-supervised | MCL* | Paper Code | ICLR 2019 |
| Semi-supervised | DTC* | Paper Code | ICCV 2019 |
| Semi-supervised | CDAC+ | Paper Code | AAAI 2020 |
| Semi-supervised | DeepAligned | Paper Code | AAAI 2021 |
| Semi-supervised | GCD | Paper Code | CVPR 2022 |
| Semi-supervised | MTP-CLNN | Paper Code | ACL 2022 |
| Semi-supervised | USNID | Paper Code | IEEE TKDE 2023 |
(* denotes the CV model replaced with the BERT backbone)
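To give a feel for the simplest unsupervised entry in the table above, KM (k-means), here is a minimal pure-Python sketch of Lloyd's algorithm. It is an illustration under stated assumptions, not the toolkit's code: the function names are hypothetical, the inputs are toy 2-D points, and in a real run the points would be utterance embeddings (for example from a BERT backbone) and a library implementation would normally be used.

```python
import random
from typing import List, Sequence, Tuple

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(
    points: List[Point], k: int, iterations: int = 20, seed: int = 0
) -> Tuple[List[int], List[List[float]]]:
    """Lloyd's algorithm: returns (cluster id per point, centroids)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    rng = random.Random(seed)
    # Initialise centroids from k distinct input points.
    centroids = [list(p) for p in rng.sample(points, k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster went empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignments, centroids


if __name__ == "__main__":
    toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    labels, centers = kmeans(toy, k=2)
    print(labels)
    print(centers)
```

The cluster ids are arbitrary, which is why clustering results are usually compared with ground-truth intents only after aligning cluster ids to labels.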
Quick Start
- Use Anaconda to create a Python (version >= 3.6) environment
conda create --name textoir python=3.6
conda activate textoir
- Install PyTorch (CUDA 11.x; the command below pins cudatoolkit=11.0, so adjust the version to match your local driver)
conda install pytorch torchvision torchaudio cudatoolkit=11.0 -c pytorch -c conda-forge
- Clone the TEXTOIR repository, and choose the task (Take open intent detection as an example).
git clone git@github.com:thuiar/TEXTOIR.git
cd TEXTOIR
cd open_intent_detection
- Install related environmental dependencies
pip install -r requirements.txt
- Run examples (Take ADB as an example)
sh examples/run_ADB.sh
- Note that if you cannot download the pre-trained model directly from HuggingFace transformers, you need to download it yourself. We provide the pre-trained BERT model at the following link:
Baidu Cloud Drive with code: v8tk
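After working through the steps above, a quick sanity check of the environment can save a failed run. The snippet below is a small sketch using only the Python standard library; the function name `check_environment` and the package list (`torch`, `transformers`) are assumptions based on the steps above, not something the repository ships.

```python
import importlib.util
import sys
from typing import Dict, Sequence, Tuple


def check_environment(
    min_python: Tuple[int, int] = (3, 6),
    packages: Sequence[str] = ("torch", "transformers"),  # assumed package list
) -> Dict[str, bool]:
    """Report whether the interpreter and the listed packages look usable."""
    report = {"python_ok": sys.version_info >= min_python}
    for name in packages:
        # find_spec returns None when a top-level package is not installed.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for item, ok in check_environment().items():
        print(f"{item}: {'ok' if ok else 'MISSING'}")
```

The check only confirms that the packages can be found; it does not import them, so it will not detect a PyTorch build that mismatches the installed CUDA driver.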
Extensibility
This toolkit is extensible and makes it convenient to add new methods, datasets, configurations, backbones, dataloaders, and losses.
