.. -*- mode: rst -*-
|License|_ |GithubActions|_ |ReadTheDocs|_ |Downloads|_ |Pypy|_ |CondaVersion|_

.. |License| image:: https://img.shields.io/github/license/dccuchile/wefe
.. _License: https://github.com/dccuchile/wefe/blob/master/LICENSE

.. |ReadTheDocs| image:: https://readthedocs.org/projects/wefe/badge/?version=latest
.. _ReadTheDocs: https://wefe.readthedocs.io/en/latest/?badge=latest

.. |GithubActions| image:: https://github.com/dccuchile/wefe/actions/workflows/ci.yaml/badge.svg?branch=master
.. _GithubActions: https://github.com/dccuchile/wefe/actions

.. |Downloads| image:: https://pepy.tech/badge/wefe
.. _Downloads: https://pepy.tech/project/wefe

.. |Pypy| image:: https://badge.fury.io/py/wefe.svg
.. _Pypy: https://pypi.org/project/wefe/

.. |CondaVersion| image:: https://anaconda.org/pbadilla/wefe/badges/version.svg
.. _CondaVersion: https://anaconda.org/pbadilla/wefe
WEFE: The Word Embedding Fairness Evaluation Framework
======================================================

.. image:: ./docs/logos/WEFE_2.png
  :width: 300
  :alt: WEFE Logo
  :align: center
Word Embedding Fairness Evaluation (WEFE) is an open-source library for measuring and mitigating bias in word embedding models. It generalizes many existing fairness metrics into a unified framework and provides a standard interface for:
- Encapsulating existing fairness metrics from previous work and designing new ones.
- Encapsulating the test words used by fairness metrics into standard objects called queries.
- Computing a fairness metric on a given pre-trained word embedding model using user-given queries.
WEFE also standardizes the process of mitigating bias through an interface similar
to the scikit-learn ``fit``-``transform``.
This standardization separates the mitigation process into two stages:

- The logic of calculating the transformation to be performed on the model (``fit``).
- The execution of the mitigation transformation on the model (``transform``).
The official documentation can be found at `this link <https://wefe.readthedocs.io/>`_.
Installation
------------

WEFE requires Python 3.10 or higher. It can be installed in several ways:
Install with pip (recommended)::
pip install wefe
Install with conda::
conda install -c pbadilla wefe
Install development version::
pip install git+https://github.com/dccuchile/wefe.git
Install with development dependencies::
pip install "wefe[dev]"
Install with PyTorch support::
pip install "wefe[pytorch]"
Requirements
------------
WEFE automatically installs the following dependencies:
- gensim (>=3.8.3)
- numpy (<=1.26.4)
- pandas (>=2.0.0)
- plotly (>=6.0.0)
- requests (>=2.22.0)
- scikit-learn (>=1.5.0)
- scipy (<1.13)
- semantic_version (>=2.8.0)
- tqdm (>=4.0.0)
Contributing
------------
To contribute to WEFE development:
- Clone the repository::

    git clone https://github.com/dccuchile/wefe
    cd wefe

- Install in development mode with all dependencies::

    pip install -e ".[dev]"

- Run tests to ensure everything works::

    pytest tests

- Make your changes and run tests again.
- Follow our coding standards:

  - Use ``ruff`` for code formatting: ``ruff format .``
  - Check code quality: ``ruff check .``
  - Run type checking: ``mypy wefe``
For detailed contributing guidelines, visit the `Contributing <https://wefe.readthedocs.io/en/latest/user_guide/contribute.html>`_ section in the documentation.
Development Requirements
------------------------
To install WEFE with all development dependencies for testing, documentation building, and code quality tools::
pip install "wefe[dev]"
This installs additional packages including:
- pytest and pytest-cov for testing
- sphinx and related packages for documentation
- ruff for code formatting and linting
- mypy for type checking
- ipython for interactive development
Testing
-------

All unit tests are in the ``tests/`` folder. WEFE uses pytest as the testing framework.
To run all tests::
pytest tests
To run tests with coverage reporting::
pytest tests --cov=wefe --cov-report=html
To run a specific test file::
pytest tests/test_datasets.py
Coverage reports will be generated in the ``htmlcov/`` directory.
Build the documentation
-----------------------

The documentation is built using Sphinx and can be found in the ``docs/`` folder.
To build the documentation::
cd docs
make html
Or using the development environment::
pip install "wefe[dev]"
cd docs
make html
The built documentation will be available at ``docs/_build/html/index.html``.
Changelog
---------

Version 1.0.1
^^^^^^^^^^^^^

Patch Release - Documentation Updates

- Updated the citation information in the documentation with the new JMLR 2025 publication.
Version 1.0.0
^^^^^^^^^^^^^

Major Release - Breaking Changes

- Python 3.10+ Required: Dropped support for Python 3.6-3.9.
- Modern Packaging: Migrated from ``setup.py`` to ``pyproject.toml``.
- Updated Dependencies: All packages updated for the modern Python ecosystem.
New Features:

- Robust dataset fetching with a retry mechanism and exponential backoff.
- HTTP 429 (rate limiting) and timeout error handling.
- Optional dependencies: ``pip install "wefe[dev]"`` and ``pip install "wefe[pytorch]"``.
- Dynamic version loading from ``wefe.__version__``.
Core Improvements:

- WordEmbeddingModel: Enhanced type safety, better gensim compatibility, improved error handling.
- BaseMetric: Refactored input validation, standardized ``run_query`` methods across all metrics.
- Testing: Converted to pytest patterns with monkeypatch, comprehensive test coverage.
- Code Quality: Migration from flake8 to Ruff, enhanced documentation with detailed docstrings.
Development Workflow:
- GitHub Actions upgraded with Python 3.10-3.13 matrix testing
- Pre-commit hooks enhanced with JSON/TOML validation and security checks
- Modernized Sphinx documentation configuration
- Updated benchmark documentation and metrics comparison tables
Version 0.4.1
^^^^^^^^^^^^^

- Fixed a bug where the last pair of target words in RIPA was not included.
- Added a benchmark to the documentation comparing WEFE with other bias measurement and mitigation libraries.
- Added a page to the documentation listing library changes since the original paper release.
Version 0.4.0
^^^^^^^^^^^^^

- Implemented 3 new bias mitigation (debias) methods: Double Hard Debias, Half Sibling Regression and Repulsion Attraction Neutralization.
- Restructured the library documentation. It is now divided into a user guide and a theoretical framework: the user guide contains no theoretical information, which can instead be found in the conceptual guides.
- Improved the API documentation and the metric and debias examples. Added multilingual examples contributed by the community.
- The user guides are now notebooks, so they are fully executable.
- Improved library testing mechanisms for metrics and debias methods.
- Fixed a wrong ``repr`` of ``Query``. Now the sets are shown in the correct order.
- Implemented ``repr`` for ``WordEmbeddingModel``.
- Moved testing CI from CircleCI to GitHub Actions.
- Changed the license to MIT.
Version 0.3.2
^^^^^^^^^^^^^

- Fixed an RNSB bug where the classification labels were interchanged, which could produce erroneous results when the attribute sets are of different sizes.
- Fixed the RNSB replication notebook.
- Updated the WEFE case study scores.
- Improved documentation examples for WEAT, RNSB and RIPA.
- Added a ``holdout`` parameter to RNSB, which indicates whether a holdout set is used when training the classifier.
- Improved the printing of the RNSB evaluation.
Version 0.3.1
^^^^^^^^^^^^^

- Updated the WEFE original case study.
- Hotfix: several bug fixes needed to execute the WEFE original case study.
- Set the ``fetch_eds`` ``top_n_race_occupations`` argument to 10.
- Preprocessing: ``get_embeddings_from_set`` now returns a list with the lost preprocessed words instead of the original ones.
Version 0.3.0
^^^^^^^^^^^^^

- Implemented Bolukbasi et al. 2016 Hard Debias.
- Implemented Manzini et al. 2019 Multiclass Hard Debias.
- Implemented a fetch function to retrieve the gn-glove female-male word sets.
- Moved the logic that transforms words, sets and queries into embeddings to its own module: ``preprocessing``.
- Replaced the ``preprocessor_args`` and ``secondary_preprocessor_args`` metric preprocessing parameters with a list of preprocessors (``preprocessors``), together with a ``strategy`` parameter indicating whether to consider all the transformed words (``'all'``) or only the first one found (``'first'``).
- Renamed the ``WordEmbeddingModel`` attributes ``model`` and ``model_name`` to ``wv`` and ``name`` respectively.
- Renamed the ``run_query`` ``word_embedding`` argument to ``model`` in every metric.
Version 0.2.2
^^^^^^^^^^^^^

- Added the RIPA metric (thanks @stolenpyjak for your contribution!).
- Fixed a ``Literal`` typing bug to make WEFE compatible with Python 3.7.
Version 0.2.1
^^^^^^^^^^^^^

- Compatibility fixes.
Version 0.2.0
^^^^^^^^^^^^^

- Renamed the optional ``run_query`` parameter ``warn_filtered_words`` to ``warn_not_found_words``.
- Added a ``word_preprocessor_args`` parameter to ``run_query`` that allows specifying transformations prior to searching for words in the word embeddings.
- Added a ``secondary_preprocessor_args`` parameter to ``run_query`` which allows specifying a second preprocessor transformation applied to words before searching for them in the word embeddings. It is not necessary to specify the first preprocessor to use this one.
- Implemented the ``__getitem__`` function in ``WordEmbeddingModel``. This method allows obtaining an embedding for a word from the model stored in the instance using indexers.
- Removed underscore from c
