Bbnli

Bias Benchmark for Natural Language Inference. Code repo for the Findings of NAACL 2022 paper "On Measuring Social Biases in Prompt-Based Multi-Task Learning".

On Measuring Social Biases in Prompt-Based Multi-Task Learning

Code repository for the following paper, in which we test whether the zero-shot generalization capacity of T0 models to novel language tasks results in amplified social biases.

On Measuring Social Biases in Prompt-Based Multi-Task Learning. Afra Feyza Akyürek, Sejin Paik, Muhammed Yusuf Kocyigit, Seda Akbiyik, Şerife Leman Runyun, Derry Tanti Wijaya. Findings of NAACL 2022.

Installation & Running

After cloning the repository and changing into its directory, run sh setup.sh to create a conda environment and install the required packages. You can then run python call_bbnli_inference_bias.py for the BBNLI experiments and python bbq_inference_bias.py for the BBQ experiments. For BBNLI, the script first converts the set of premises and hypotheses into a csv file, cross-pairing every premise with every hypothesis. It then samples generations from the model using the Hugging Face Inference API. For this stage to work, you need a file named hf_key in the repository that contains only your API key.
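The cross-pairing step described above can be sketched as follows. This is a minimal illustration, not the repository's code: the function names (cross_pair, write_pairs_csv), the column names, and the example sentences are all assumptions made for the sketch.

```python
import csv
import io
from itertools import product


def cross_pair(premises, hypotheses):
    """Pair every premise with every hypothesis (Cartesian product)."""
    # Column names "premise" and "hypothesis" are assumed for illustration.
    return [
        {"premise": p, "hypothesis": h}
        for p, h in product(premises, hypotheses)
    ]


def write_pairs_csv(rows, handle):
    """Write the paired rows to an open text handle as csv."""
    writer = csv.DictWriter(handle, fieldnames=["premise", "hypothesis"])
    writer.writeheader()
    writer.writerows(rows)


# Placeholder inputs; the real premises and hypotheses live under data/bbnli.
premises = ["Premise A.", "Premise B."]
hypotheses = ["Hypothesis 1.", "Hypothesis 2.", "Hypothesis 3."]

rows = cross_pair(premises, hypotheses)

# An in-memory buffer stands in for the csv file on disk.
buffer = io.StringIO()
write_pairs_csv(rows, buffer)
```

With 2 premises and 3 hypotheses, the sketch yields 6 rows, one per premise-hypothesis combination.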

There are a few flags at the beginning of each script. When overwrite_csv=False, the script skips creating the csv files and goes directly to generating samples. When skip_inference=True, the script does not generate completions (it assumes you already have them in the respective csv files) and only computes bias scores.
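How the two flags gate the stages can be sketched as below. This is an assumed control flow inferred from the description above, not the scripts' actual structure: run_pipeline and the three stage callables are hypothetical names, and the sketch assumes bias scores are always computed as the final step.

```python
def run_pipeline(overwrite_csv, skip_inference, build_csv, run_inference, compute_bias):
    """Run the stages selected by the two flags and return their names in order."""
    executed = []

    # overwrite_csv=False skips (re)creating the csv files.
    if overwrite_csv:
        build_csv()
        executed.append("build_csv")

    # skip_inference=True assumes completions already exist in the csv files.
    if not skip_inference:
        run_inference()
        executed.append("run_inference")

    # Assumption: bias scores are computed in every configuration.
    compute_bias()
    executed.append("compute_bias")
    return executed


def noop():
    """Placeholder stage used only for this demonstration."""
    return None


full_run = run_pipeline(True, False, noop, noop, noop)
scores_only = run_pipeline(False, True, noop, noop, noop)
```

Here full_run exercises all three stages, while scores_only corresponds to overwrite_csv=False with skip_inference=True, where only the bias computation runs.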

BBNLI Dataset

We release a natural language inference benchmark for measuring biases covering gender, race and religion domains. You may find the dataset under data/bbnli.

Contacts

Feel free to reach out with questions or to open an issue.

Afra Feyza Akyürek (akyurek@bu.edu)

Cite

If you are using code or data from this repository, please cite the following work:

@inproceedings{akyurek2022promptbased,
  title={On Measuring Social Biases in Prompt-Based Multi-Task Learning},
  author={Aky{\"u}rek, Afra Feyza and Paik, Sejin and Kocyigit, Muhammed Yusuf and Akbiyik, Seda and Runyun, Serife Leman and Wijaya, Derry},
  booktitle={Findings of NAACL},
  year={2022}
}
