FIGA

[ICLR 2024] This is the official implementation for the paper: "Beyond imitation: Leveraging fine-grained quality signals for alignment"

This repository is the official implementation of the ICLR 2024 paper: Beyond Imitation: Leveraging Fine-grained Quality Signals for Alignment.

Quick Start

Because a modified version of transformers will be installed, we recommend creating a new conda environment:

conda create -n FIGA python=3.8
conda activate FIGA
conda install pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=12.1 -c pytorch -c nvidia

Next, clone the FIGA repository and install its dependencies:

git clone https://github.com/RUCAIBox/FIGA.git && cd FIGA
pip install -r requirements.txt

After this, replace the trainer_utils.py and modeling_llama.py files in the installed transformers library with the corresponding files from this repository. This step is required for fine-tuning with the FIGA method.
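The replacement step can be scripted. The following is a minimal sketch, not part of the repository: the helper name replace_transformers_files is hypothetical, and because the README does not say where the modified files sit inside the clone, the helper searches the clone for them by name (an assumption). The destination paths follow the standard layout of the transformers package. By default it keeps a .bak copy of each original file.

```python
import shutil
from pathlib import Path


def replace_transformers_files(repo_dir, transformers_dir, backup=True):
    """Copy the modified files from a FIGA clone over the installed transformers copies.

    Hypothetical helper for illustration; it is not shipped with the repository.
    """
    repo_dir = Path(repo_dir)
    transformers_dir = Path(transformers_dir)

    # Destinations follow the standard transformers package layout.
    targets = {
        "trainer_utils.py": transformers_dir / "trainer_utils.py",
        "modeling_llama.py": transformers_dir / "models" / "llama" / "modeling_llama.py",
    }

    replaced = []
    for name, dest in targets.items():
        # ASSUMPTION: the location inside the clone is not documented,
        # so look the file up by name and insist on a single match.
        matches = sorted(repo_dir.rglob(name))
        if len(matches) != 1:
            raise FileNotFoundError(
                f"expected exactly one {name} under {repo_dir}, found {len(matches)}"
            )
        if backup and dest.exists():
            # Keep the original next to the replacement, e.g. trainer_utils.py.bak.
            shutil.copy2(dest, dest.with_name(dest.name + ".bak"))
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(matches[0], dest)
        replaced.append(dest)
    return replaced


# Example usage, run from the directory that contains the FIGA clone:
#   import transformers
#   replace_transformers_files("FIGA", Path(transformers.__file__).parent)
```

Resolving the package directory through transformers.__file__ patches the copy in the active conda environment rather than some other installation.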

SPA Dataset

You can download the SPA dataset from: https://huggingface.co/datasets/RUCAIBox/SPA.

In our publicly available SPA dataset, the output field is the ground-truth response, the original_output field contains the results generated by the alpaca-7b model, and the revised_output field contains the results as modified by a more powerful model (i.e., ChatGPT-3.5). For a detailed description of how the SPA dataset was constructed, please refer to our paper.
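To get a feel for how the fields relate, the sketch below loads the dataset and lists the word-level spans that differ between original_output and revised_output. It is an illustration only and not the repository's implementation: the function names load_spa and revised_spans are hypothetical, the "train" split name is an assumption, and the comparison uses Python's difflib on whitespace-split words rather than whatever procedure the paper applies. It also assumes that revised_output is an edited version of original_output.

```python
import difflib


def load_spa(split="train"):
    """Load the SPA dataset from the Hugging Face Hub.

    Requires the `datasets` package and network access.
    ASSUMPTION: the split name "train" is a guess, not taken from the README.
    """
    from datasets import load_dataset

    return load_dataset("RUCAIBox/SPA", split=split)


def revised_spans(original_output, revised_output):
    """Return (tag, original_words, revised_words) for every span that differs.

    tag is one of "replace", "delete", or "insert", as reported by difflib.
    """
    a = original_output.split()
    b = revised_output.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((tag, " ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return changes


# Example usage (needs network access):
#   record = load_spa()[0]
#   for change in revised_spans(record["original_output"], record["revised_output"]):
#       print(change)
```

Splitting on whitespace keeps the example dependency-free; a tokenizer-level comparison would be closer to what a model sees during training.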

Instruction tuning

After setting up the environment, you can fine-tune the model with the FIGA method by running the script below:

bash bash/run_7b.sh > output.log 2>&1

Acknowledgment

Please cite the following paper if you find our code or data helpful.

@article{guo2023beyond,
  title={Beyond imitation: Leveraging fine-grained quality signals for alignment},
  author={Guo, Geyang and Zhao, Ranchi and Tang, Tianyi and Zhao, Wayne Xin and Wen, Ji-Rong},
  journal={arXiv preprint arXiv:2311.04072},
  year={2023}
}
