MissingDataOT

A PyTorch implementation of missing data imputation using optimal transport.


Missing Data Imputation using Optimal Transport

Overview

This repository complements the paper Missing Data Imputation using Optimal Transport (Muzellec B., Josse J., Boyer C., Cuturi M., ICML 2020):

  • experiment.py reproduces the imputation benchmark reported in the paper;
  • imputers.py contains the classes implementing Algorithms 1 and 3 of the paper;
  • data_loaders.py contains data loading utilities for the UCI ML repository datasets on which the experiments are run;
  • utils.py contains general-purpose helper functions, in particular the implementations of the MAR and MNAR missing data mechanisms;
  • softimpute.py contains the implementation of the SoftImpute baseline.

An example notebook is also available: UCI_demo.ipynb.
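
To make the idea of a MAR (missing at random) mechanism more concrete, here is a minimal NumPy sketch. It is not the code in utils.py: the function names (mar_mask, fit_intercept), the parameters (p_miss, frac_obs) and the logistic model are assumptions chosen for illustration. A random subset of columns is kept fully observed, and each remaining column is masked with a probability that depends only on those observed columns.

```python
import numpy as np


def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))


def fit_intercept(score, target, lo=-30.0, hi=30.0, iters=60):
    """Bisection on the intercept b so that mean(sigmoid(score + b)) ~= target."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if sigmoid(score + mid).mean() < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def mar_mask(X, p_miss=0.3, frac_obs=0.5, seed=0):
    """Hypothetical MAR mask generator (illustration only).

    Returns a boolean array of the same shape as X, True where a value
    is to be treated as missing.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    d_obs = max(1, int(frac_obs * d))

    # Randomly choose which columns stay fully observed.
    cols = rng.permutation(d)
    obs_cols, mis_cols = cols[:d_obs], cols[d_obs:]

    # Standardize the observed columns so the logistic scores are well scaled.
    Z = X[:, obs_cols]
    Z = (Z - Z.mean(axis=0)) / (Z.std(axis=0) + 1e-8)

    mask = np.zeros((n, d), dtype=bool)
    for j in mis_cols:
        # Missingness in column j depends only on the observed columns.
        w = rng.normal(size=d_obs)
        score = Z @ w
        b = fit_intercept(score, p_miss)
        prob = sigmoid(score + b)
        mask[:, j] = rng.random(n) < prob
    return mask
```

A typical use would be to draw a mask, then set the masked entries of a float array to NaN before handing it to an imputer, for example `X_miss = np.where(mar_mask(X), np.nan, X)`.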

References

Muzellec B., Josse J., Boyer C., Cuturi M.: Missing Data Imputation using Optimal Transport. International Conference on Machine Learning (ICML), 2020.

@inproceedings{muzellec2020missing,
  title={Missing Data Imputation using Optimal Transport},
  author={Muzellec, Boris and Josse, Julie and Boyer, Claire and Cuturi, Marco},
  booktitle={International Conference on Machine Learning},
  pages={7130--7140},
  year={2020},
  organization={PMLR}
}

Dependencies

To use the data loading utilities in data_loaders.py, wget is also required.
