GUS

A Generative User Simulator with GPT-based Architecture and Goal State Tracking for Reinforced Multi-Domain Dialog Systems. EMNLP 2022 SereTOD Workshop.

Generate Convert Improve

Install / Use

/learn @thu-spmi/GUS

About this skill

Quality Score

0/100

README

GUS

This is the official code and data for paper "A Generative User Simulator with GPT-based Architecture and Goal State Tracking for Reinforced Multi-Domain Dialog Systems"

Requirements

After you create an environment with python 3.6, the following commands are recommended to install required packages.

pip install torch==1.5
pip install transformers==3.5
pip install spacy==3.1
python -m spacy download en_core_web_sm
pip install sklearn
pip install tensorboard

Besides, you need to install the standard evaluation repository for corpus-based evaluation, in which we change the references in mwzeval/utils.py/load_references() to 'damd', since we adopt the same delexicalization as DAMD.

ConvLab-2 is also needed to interact with the agenda based user simulator (ABUS) on MultiWOZ.

Data Preparation

Based on the data preprocessing of DAMD, we add goal state span ('gpan') and user act span ('usr_act') for each turn, which is scripted in prepare_data.py. For convenience, we provide preprocessed training data, database files and random generated testing goals in the format of .zip file. Execute following commands to unzip them

unzip db.zip -d ./db/
cd analysis
unzip goals.zip -d ./
cd ../data/multi-woz-2.1-processed
unzip data.zip -d ./

Supervised Training

To pretrain the dialog system (DS), run

bash pretrain_ds.sh $GPU

To pretrain the user simulator (US), run

bash pretrain_us.sh $GPU

RL Training

To implement RL experiments, run

bash run_RL.sh $GPU

If necessary, you can change the settings in run_RL.sh, which is described in config.py. For instance,

Set interact_with_ABUS=True in run_RL.sh to train the DS with ABUS.
Set simple_reward=True in run_RL.sh to take Success as reward.
Set full_goal=True in run_RL.sh to replace the goal state with the full goal span in US.

Evaluation

Corpus-base Evaluation

To evaluate DSs/USs based on the annotations in corpus, run

bash test.sh $GPU $DS_PATH
bash test_us.sh $GPU $US_PATH

Interaction with GUS

To evaluate the interaction quality of DSs and GUS, run

bash eval_RL.sh $GPU $DS_PATH $US_PATH

Interaction with ABUS

To test with ABUS, run

python test_with_ABUS.py --device $DEVICE --path $DS_PATH

Note that you need to save the interaction results and evaluate them with

python analyze_result.py --result_path $RESULT_PATH

Related Skills

node-connect

349.9k

Diagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps

frontend-design

109.8k

Create distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.

openai-whisper-api

349.9k

Transcribe audio via OpenAI Audio Transcriptions API (Whisper).

qqbot-media

349.9k

QQBot 富媒体收发能力。使用 <qqmedia> 标签，系统根据文件扩展名自动识别类型（图片/语音/视频/文件）。