<img src="images/logo.png" width="360"/> Welcome to use HugNLP. 🤗 Hugging for NLP! <div align="center">

[中文]

</div>

About HugNLP

HugNLP is a novel development and application library based on Hugging Face for improving the convenience and effectiveness of NLP researchers.

**News & Highlights

🆕 [23-10-25]: Our HugNLP paper has been awarded CIKM 2023 Best Demo Paper!
🆕 [23-08-06]: Our HugNLP paper has been accepted by CIKM 2023 (Demo Track)!
🆕 [23-05-05]: HugNLP is released at @HugAILab !
🆕 [23-04-06]: Develop a small ChatGPT-like assistance, naming HugChat! You can chat with HugNLP! [see doc]
🆕 [23-04-02]: Add GPT-style instruction-tuning. You can continual train a small-scale ChatGPT! [see doc]
🆕 [23-03-21]: Finish GPT-style in-context learning for sequence classification. [see doc]
🆕 [23-03-13]: Add code clone detection and defect task. You can train clone and defect for user-defined dataset. [see doc]
🆕 [23-03-03]: Add HugIE API and corresponding training script. You can use it to perform information extraction on Chinese data. [see doc]
🆕 [23-02-18]: The HugNLP is started.

Architecture

The framework overview is shown as follows:

Models

In HugNLP, we provide some popular transformer-based models as backbones, such as BERT, RoBERTa, GPT-2, etc. We also release our pre-built KP-PLM, a novel knowledge-enhanced pre-training paradigm to inject factual knowledge and can be easily used for arbitrary PLMs. Apart from basic PLMs, we also implement some task-specific models, involving sequence classification, matching, labeling, span extraction, multi-choice, and text generation. Notably, we develop standard fine-tuning (based on CLS Head and prompt-tuning models that enable PLM tuning on classification tasks. For few-shot learning settings, HugNLP provides a prototypical network in both few-shot text classification and named entity recognition (NER).

In addition, we also incorporate some plug-and-play utils in HugNLP.

Parameter Freezing. If we want to perform parameter-efficient learning, which aims to freeze some parameters in PLMs to improve the training efficiency, we can set the configure use_freezing and freeze the backbone. A use case is shown in Code.
Uncertainty Estimation aims to calculate the model certainty when in semi-supervised learning.
We also design Prediction Calibration, which can be used to further improve the accuracy by calibrating the distribution and alleviating the semantics bias problem.

Processors

Processors aim to load the dataset and process the task examples in a pipeline containing sentence tokenization, sampling, and tensor generation. Specifically, users can directly obtain the data through load_dataset, which can directly download it from the Internet or load it from the local disk. For different tasks, users should define a task-specific data collator, which aims to transform the original examples into model input tensor features.

Applications

It provides rich modules for users to build real-world applications and products by selecting among an array of settings from Models and Processors.

Core Capacities

We provide some core capacities to support the NLP downstream applications.

Knowledge-enhanced Pre-trained Language Model

Conventional pre-training methods lack factual knowledge. To deal with this issue, we present KP-PLM with a novel knowledge prompting paradigm for knowledge-enhanced pre-training.

Specifically, we construct a knowledge sub-graph for each input text by recognizing entities and aligning with the knowledge base and decompose this sub-graph into multiple relation paths, which can be directly transformed into language prompts.

Prompt-based Fine-tuning

Prompt-based fine-tuning aims to reuse the pre-training objective (e.g., Masked Language Modeling, Causal Language Modeling) and utilizes a well-designed template and verbalizer to make predictions, which has achieved great success in low-resource settings.

We integrate some novel approaches into HugNLP, such as PET, P-tuning, etc.

Instruction Tuning & In-Context Learning

Instruction-tuning and in-context learning enable few/zero-shot learning without parameter update, which aims to concatenate the task-aware instructions or example-based demonstrations to prompt GPT-style PLMs to generate reliable responses. So, all the NLP tasks can be unified into the same format and can substantially improve the models" generalization.

Inspired by this idea, we extend it into other two paradigms:

extractive-style paradigm: we unify various NLP tasks into span extraction, which is the same as extractive question answering.
inference-style paradigm: all the tasks can be viewed as natural language inference to match the relations between inputs and outputs.
generative-style paradigm: we unify all the tasks into generative format, and train the causal models based on instruction-tuning, in-context learning or chain-of-thought.

Self-training with Uncertainty Estimation

Self-training can address the labeled data scarcity issue by leveraging the large-scale unlabeled data in addition to labeled data, which is one of the mature paradigms in semi-supervised learning. However, the standard self-training may generate too much noise, inevitably degrading the model performance due to confirmation bias.

Thus, we present uncertainty-aware self-training. Specifically, we train a teacher model on few-shot labeled data, and then use Monte Carlo (MC) dropout technique in Bayesian neural network (BNN) to approximate the model certainty, and judiciously select the examples that have a higher model certainty of the teacher.

Parameter-Efficient Learning

To improve the training efficiency of HugNLP, we also implement parameter-efficient learning, which aims to freeze some parameters in the backbone so that we only tune a few parameters during model training. We develop some novel parameter-efficient learning approaches, such as Prefix-tuning, Adapter-tuning, BitFit and LoRA, etc.

Installation

$ git clone https://github.com/HugAILab/HugNLP.git
$ cd HugNLP
$ python3 setup.py install

At present, the project is still being developed and improved, and there may be some bugs in use, please understand. We also look forward to your being able to ask issues or committing some valuable pull requests.

Pre-built Applications Overview

We demonstrate all pre-built applications in HugNLP. You can choose one application to use HugNLP. You can also click the link to see the details document.

| Applications | Runing Tasks | Task Notes | PLM Models | Documents | | -------------------------------- | ----------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------- | ----------------------------------------------------------------------------- | | Default Application | run_seq_cls.sh | Goal: Standard Fine-tuning or Prompt-tuning for sequence classification on user-defined dataset.   Path: applications/default_applications | BERT, RoBERTa, DeBERTa | click | | | run_seq_labeling.sh | Goal: Standard Fine-tuning for sequence labeling on user-defined dataset.   Path: applications/default_applications | BERT, RoBERTa, ALBERT | | | Pre-training | run_pretrain_mlm.sh | Goal: Pre-training via Masked Language Modeling (MLM).   Path: applications/pretraining/ | BERT, RoBERTa | [click](./documents/pretraining

HugNLP

Install / Use

README