FIGA
[ICLR 2024] This is the official implementation for the paper: "Beyond imitation: Leveraging fine-grained quality signals for alignment"
This repository is the official implementation of the ICLR 2024 paper: Beyond Imitation: Leveraging Fine-grained Quality Signals for Alignment.
Quick Start
Because a modified version of transformers will be installed, we recommend creating a new conda environment:
conda create -n FIGA python=3.8
conda activate FIGA
conda install pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=12.1 -c pytorch -c nvidia
Next, clone the FIGA repository and install its dependencies:
git clone https://github.com/RUCAIBox/FIGA.git && cd FIGA
pip install -r requirements.txt
After this, replace the trainer_utils.py and modeling_llama.py files in the installed transformers library with the corresponding files from this repository. Fine-tuning with the FIGA method requires these modified files.
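The replacement step can be scripted. The following is a minimal sketch, not part of the official repository. It assumes the two modified files sit at the top level of the FIGA clone (adjust `repo_dir` if they live elsewhere) and that the installed package uses the usual transformers layout, with trainer_utils.py at the package root and modeling_llama.py under models/llama/. The helper name `patch_transformers` is hypothetical. Each original file is backed up with an `.orig` suffix before it is overwritten.

```python
import shutil
from pathlib import Path
from typing import Dict, List

# File name in the FIGA clone -> relative path inside the installed
# transformers package (assumption: standard transformers layout).
TARGETS: Dict[str, str] = {
    "trainer_utils.py": "trainer_utils.py",
    "modeling_llama.py": "models/llama/modeling_llama.py",
}


def patch_transformers(repo_dir: Path, package_dir: Path) -> List[Path]:
    """Copy the FIGA versions of the two files over the installed ones.

    Hypothetical helper: a backup named <file>.orig is written next to
    each file the first time it is replaced.
    """
    patched = []
    for name, rel in TARGETS.items():
        src = repo_dir / name  # assumption: file is at the top of the clone
        dst = package_dir / rel
        if not src.is_file():
            raise FileNotFoundError(f"replacement file not found: {src}")
        if not dst.is_file():
            raise FileNotFoundError(f"installed file not found: {dst}")
        backup = dst.with_name(dst.name + ".orig")
        if not backup.exists():
            shutil.copy2(dst, backup)  # keep the original for rollback
        shutil.copy2(src, dst)
        patched.append(dst)
    return patched


# Example usage, run from the root of the FIGA clone inside the FIGA
# conda environment:
#
#   import transformers
#   patch_transformers(Path("."), Path(transformers.__file__).parent)
```

To undo the change, copy each `.orig` file back over its patched counterpart.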
SPA Dataset
You can download the SPA dataset from: https://huggingface.co/datasets/RUCAIBox/SPA.
In our publicly available SPA dataset, the output field is the ground truth response, the original_output field contains the response generated by the alpaca-7b model, and the revised_output field contains that response after revision by a more powerful model (i.e. ChatGPT-3.5). For a detailed description of how the SPA dataset was constructed, please refer to our paper.
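As an illustration of the three fields described above, here is a minimal sketch that loads the dataset with the Hugging Face datasets library and relabels the fields of one record by their role. The helper names `load_spa` and `describe_record` are hypothetical, and the split name "train" is an assumption; check the dataset page for the splits that are actually available.

```python
from typing import Dict, Mapping

# Field names as described in this README.
SPA_FIELDS = ("output", "original_output", "revised_output")


def describe_record(record: Mapping[str, str]) -> Dict[str, str]:
    """Return the three response fields of one SPA record, keyed by role."""
    missing = [f for f in SPA_FIELDS if f not in record]
    if missing:
        raise KeyError(f"record is missing fields: {missing}")
    return {
        "ground_truth": record["output"],        # reference response
        "alpaca_7b": record["original_output"],  # initial model response
        "revised": record["revised_output"],     # response after revision
    }


def load_spa(split: str = "train"):
    """Download SPA from the Hugging Face Hub (needs network access).

    The default split name is an assumption, not taken from the README.
    """
    from datasets import load_dataset  # pip install datasets

    return load_dataset("RUCAIBox/SPA", split=split)


# Example usage:
#
#   spa = load_spa()
#   print(describe_record(spa[0]))
```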
Instruction tuning
After setting up the environment, you can fine-tune a model with the FIGA method by running the following command:
bash bash/run_7b.sh > output.log 2>&1
Acknowledgment
Please cite the following paper if you find our code or data helpful.
@article{guo2023beyond,
title={Beyond imitation: Leveraging fine-grained quality signals for alignment},
author={Guo, Geyang and Zhao, Ranchi and Tang, Tianyi and Zhao, Wayne Xin and Wen, Ji-Rong},
journal={arXiv preprint arXiv:2311.04072},
year={2023}
}