SeeGULL

SeeGULL is a broad-coverage stereotype dataset in English containing stereotypes about identity groups spanning 178 countries across 8 different geo-political regions across 6 continents, as well as state-level identities within the US and India.

SeeGULL: A Stereotype Benchmark with Broad Geo-Cultural Coverage Leveraging Generative Models

This repository contains data resources for the paper "SeeGULL: A Stereotype Benchmark with Broad Geo-Cultural Coverage Leveraging Generative Models". This dataset contains stereotype examples that may be offensive.

Overview

Stereotype benchmark datasets are crucial to detect and mitigate social stereotypes about groups of people in NLP models. However, existing datasets are limited in size and coverage, and are largely restricted to stereotypes prevalent in Western society. This is especially problematic as language technologies gain hold across the globe. To address this gap, we present SeeGULL, a broad-coverage stereotype dataset, built by utilizing the generative capabilities of large language models such as PaLM and GPT-3, and leveraging a globally diverse rater pool to validate the prevalence of those stereotypes in society. SeeGULL is in English, and contains stereotypes about identity groups spanning 178 countries across 8 different geo-political regions across 6 continents, as well as state-level identities within the US and India. We also include fine-grained offensiveness scores for different stereotypes and demonstrate their global disparities. Furthermore, we include comparative annotations about the same groups by annotators living in the region vs. those that are based in North America, and demonstrate that within-region stereotypes about groups differ from those prevalent in North America.

Dataset Description

The repo contains the data card for the SeeGULL dataset, following the format proposed by Pushkarna et al. The data card includes details of the dataset such as intended usage, field names and meanings, and annotator recruitment and payments. The dataset folder contains the following 3 files:

stereotypes_global.csv: Nationality-based stereotypes
stereotypes_indian_states.csv: Stereotypes about Indian States
stereotypes_us_states.csv: Stereotypes about US States

For nationality-based stereotypes (stereotypes_global.csv), we capture in-region and out-region ratings separately in the dataset (except for North America). We had 2 groups of annotators:

Stereotype Ratings within Region: We recruited annotators from 16 countries across 8 cultural regions to annotate stereotypes about regional identities from corresponding regions (e.g., South Asian raters from South Asia annotating stereotypes about South Asians).
Stereotype Ratings from North America: We recruited a separate set of annotators residing in the US but identifying with the other seven regional identities to study out-region annotations (e.g., South Asian raters from the US annotating stereotypes about South Asians).

Note: For North American stereotypes, we capture only in-region ratings; these are replicated in the dataset as the out-region ratings.
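The in-region vs. out-region split described above can be compared row by row. The following is a minimal sketch, not the authors' analysis code: the column names (`identity`, `attribute`, `region`, `in_region_votes`, `out_region_votes`) and the sample rows are placeholders invented for illustration, so check the data card for the actual field names before adapting it.

```python
import csv
import io

# Placeholder rows only. The column names are assumptions for this sketch,
# not the real schema of stereotypes_global.csv.
SAMPLE = """identity,attribute,region,in_region_votes,out_region_votes
identity_a,attribute_x,Region 1,3,1
identity_b,attribute_y,Region 2,0,2
identity_c,attribute_z,North America,2,2
"""

def rating_gaps(csv_text, skip_region="North America"):
    """Return (identity, attribute, in_region - out_region) for each row.

    Rows from skip_region are dropped: their out-region values are copies of
    the in-region values, so the gap would be zero by construction.
    """
    gaps = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["region"] == skip_region:
            continue
        gap = int(row["in_region_votes"]) - int(row["out_region_votes"])
        gaps.append((row["identity"], row["attribute"], gap))
    return gaps

gaps = rating_gaps(SAMPLE)
print(gaps)
```

Excluding the North American rows matters for any aggregate comparison, since including the replicated values would pull the average gap toward zero.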

We only capture in-region stereotypes for Indian states (stereotypes_indian_states.csv) and the US states (stereotypes_us_states.csv).

All three files contain the mean offensiveness scores. The higher the score, the more offensive the stereotype.
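As a rough illustration of using the mean offensiveness scores, the sketch below sorts rows so the most offensive entries come first. The column name `mean_offensive_score`, the other field names, and the sample values are hypothetical placeholders; the data card lists the real field names and score range.

```python
import csv
import io

# Placeholder rows with arbitrary scores. Column names are assumptions.
SAMPLE = """identity,attribute,mean_offensive_score
identity_a,attribute_x,0.5
identity_b,attribute_y,2.25
identity_c,attribute_z,1.0
"""

def most_offensive(csv_text, top_n=2, score_column="mean_offensive_score"):
    """Return the top_n rows ordered by descending mean offensiveness."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Higher score means more offensive, so sort in descending order.
    rows.sort(key=lambda r: float(r[score_column]), reverse=True)
    return [
        (r["identity"], r["attribute"], float(r[score_column]))
        for r in rows[:top_n]
    ]

top = most_offensive(SAMPLE)
print(top)
```

To run this against one of the dataset files, pass the file's contents as `csv_text` and set `score_column` to the field name given in the data card.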

Changes in SeeGULL V2

The stereotypes_global.csv dataset has been updated. We made the following changes:

Removed noisy attributes
Converted all attributes to lowercase
Lemmatized attributes, mapping each to a single root word/phrase
Removed stop words and mapped attributes to a single word/phrase
Removed punctuation ('-', '_') from attributes
Corrected typos in some identities
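The cleanup steps above could be approximated along the following lines. This is a sketch under stated assumptions rather than the pipeline used for V2: the stop-word list is a small hypothetical one, punctuation is replaced with a space (the source does not say whether it was deleted or replaced), and lemmatization is omitted because it needs an external NLP library.

```python
import re

# Hypothetical stop-word list for illustration; the list used for V2 is not
# given in the source.
STOP_WORDS = {"a", "an", "the", "very", "of"}

def normalize_attribute(attribute):
    """Lowercase, replace '-' and '_' with spaces, and drop stop words."""
    text = attribute.lower()
    # Assumption: punctuation becomes a space so that joined words separate.
    text = re.sub(r"[-_]", " ", text)
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return " ".join(tokens)

print(normalize_attribute("Very Hard-Working"))
print(normalize_attribute("the_best"))
```

A lemmatization step, if needed, would slot in after tokenization and before the tokens are rejoined.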

Citation

@inproceedings{jha-etal-2023-seegull,
    title = "{S}ee{GULL}: A Stereotype Benchmark with Broad Geo-Cultural Coverage Leveraging Generative Models",
    author = "Jha, Akshita  and
      Mostafazadeh Davani, Aida  and
      Reddy, Chandan K  and
      Dave, Shachi  and
      Prabhakaran, Vinodkumar  and
      Dev, Sunipa",
    booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = jul,
    year = "2023",
    address = "Toronto, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.acl-long.548",
    pages = "9851--9870",
}
