MinimaxFilter
A learning approach to privacy preservation against inference attacks
Minimax Filter can preserve the privacy of images, audio, or biometric data by making it difficult for an adversary to infer sensitive or identifying information from the data after filtering.

Abstract
Preserving privacy of continuous and/or high-dimensional data such as images, videos, and audio can be challenging with syntactic anonymization methods such as k-anonymity, which are designed for discrete attributes. Differential privacy, which provides a different and more formal type of privacy, has shown more success in sanitizing continuous data. However, both syntactic and differential privacy are susceptible to inference attacks, i.e., an adversary can accurately guess sensitive attributes from insensitive attributes. This paper proposes a learning approach to finding a minimax filter of raw features which retains information for target tasks but removes information from which an adversary can infer sensitive attributes. Privacy and utility of filtered data are measured by expected risks, and an optimal tradeoff of the two goals is found by a variant of minimax optimization. Generalization performance of the empirical solution is analyzed, and a new and simple optimization algorithm is presented. In addition to introducing the minimax filter, the paper proposes a noisy minimax filter that combines the minimax filter with a differentially private noisy mechanism, and compares resilience to inference attacks and differential privacy both quantitatively and qualitatively. Experiments with several real-world tasks, including facial expression recognition, speech emotion recognition, and activity recognition from motion, show that the minimax filter can simultaneously achieve similar or better target task accuracy and lower inference accuracy, often significantly lower, than previous methods.
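The tradeoff described in the abstract can be written compactly. The notation below is assumed here for illustration and may differ from the paper's: u denotes the filter parameters, v the adversary's classifier, w the target-task classifier, f_priv and f_util the expected risks of the adversary and of the target task on filtered data, and ρ > 0 a tradeoff constant (presumably the `rho` printed by the test scripts).

```latex
\[
  \min_{u} \Big[ \max_{v} \, -f_{\mathrm{priv}}(u, v) \;+\; \rho \, \min_{w} f_{\mathrm{util}}(u, w) \Big]
\]
```

Read this way, the filter is chosen so that even the best adversary incurs a high risk while the best target-task classifier incurs a low one. A larger ρ weights utility more heavily relative to privacy.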
Getting Started
1. Download all files in src/ and test/
Make sure you can access the scripts in src/, for example by downloading the files from both src/ and test/ into the same folder. Descriptions of the scripts are in src/readme.md. The Genki dataset test/genki.mat was originally downloaded from http://mplab.ucsd.edu.
2. Run test/test_NN_genki.py
The task is to learn a filter of face images from the Genki dataset that allows accurate classification of 'smile' vs 'non-smile' but prevents accurate classification of 'male' vs 'female'.
The script finds a minimax filter by alternating optimization. The filter is a two-layer sigmoid neural net and the classifiers are softmax classifiers.
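The idea of alternating optimization can be illustrated with a minimal NumPy sketch. This is not the code in src/: a linear filter stands in for the two-layer network, binary logistic classifiers stand in for the softmax classifiers, the data are synthetic, and the unit-norm projection of the filter columns is an assumption added to keep the step sizes stable. All function names and hyperparameters are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_and_grads(Z, y, w):
    """Mean logistic loss for labels y in {0, 1}; gradients w.r.t. w and Z."""
    p = sigmoid(Z @ w)
    eps = 1e-12
    loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    err = (p - y) / len(y)                      # d(loss) / d(logits)
    return loss, Z.T @ err, np.outer(err, w)

def fit_classifier(Z, y, steps=200, lr=0.5):
    """Plain gradient descent on the logistic loss, starting from zero."""
    w = np.zeros(Z.shape[1])
    for _ in range(steps):
        _, grad_w, _ = loss_and_grads(Z, y, w)
        w -= lr * grad_w
    return w

def accuracy(Z, y, w):
    return float(np.mean((Z @ w > 0) == (y == 1)))

def unit_columns(U):
    # Assumption of this sketch: keep each filter column at unit norm.
    return U / np.linalg.norm(U, axis=0, keepdims=True)

def minimax_linear_filter(X, y_util, y_priv, d=1, rho=10.0, iters=100, lr=0.02, seed=0):
    """Alternate between refitting both classifiers and updating the filter U."""
    rng = np.random.default_rng(seed)
    U = unit_columns(rng.normal(size=(X.shape[1], d)))
    for _ in range(iters):
        Z = X @ U
        w = fit_classifier(Z, y_util)           # target-task classifier minimizes its loss
        v = fit_classifier(Z, y_priv)           # adversary minimizes its loss
        _, _, gZ_util = loss_and_grads(Z, y_util, w)
        _, _, gZ_priv = loss_and_grads(Z, y_priv, v)
        # Filter step: descend on rho * (utility loss) - (adversary loss).
        U = unit_columns(U - lr * (X.T @ (rho * gZ_util - gZ_priv)))
    return U

def run_demo(n=1000, seed=0):
    """Synthetic data: feature 0 carries the target label, feature 1 the private one."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 5))
    y_util = (X[:, 0] > 0).astype(float)
    y_priv = (X[:, 1] > 0).astype(float)
    U = minimax_linear_filter(X, y_util, y_priv)
    Z = X @ U
    util_acc = accuracy(Z, y_util, fit_classifier(Z, y_util))
    priv_acc = accuracy(Z, y_priv, fit_classifier(Z, y_priv))
    return U, util_acc, priv_acc

if __name__ == "__main__":
    U, util_acc, priv_acc = run_demo()
    print(f"target-task accuracy={util_acc:.3f}, adversary accuracy={priv_acc:.3f}")
```

On this toy problem the filter should drift toward the feature that carries the target label and away from the one that carries the private label, so the adversary's accuracy on filtered data ends up well below the target-task accuracy.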
test_NN_genki.py takes a few minutes to run on a desktop. After 50 iterations, the filter reaches ~88% accuracy in facial expression classification and ~66% accuracy in gender classification:
minimax-NN: rho=10.000000, d=10, trial=0, rate1=0.88, rate2=0.66
Results will be saved to a file named 'test_NN_genki.npz'.
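The array names stored in the results file are not documented here, so a safe first step is to list whatever the file contains. The helper below is a hypothetical convenience, not part of the repository. It uses only `numpy.load` and the `files` attribute of the returned archive. If the file holds pickled Python objects, `numpy.load` will refuse to read them unless `allow_pickle=True` is passed.

```python
import os
import numpy as np

def summarize_npz(path):
    """Return {array_name: (shape, dtype)} for every array stored in an .npz file."""
    with np.load(path) as data:
        return {name: (data[name].shape, str(data[name].dtype)) for name in data.files}

if __name__ == "__main__":
    path = "test_NN_genki.npz"   # file written by test/test_NN_genki.py
    if os.path.exists(path):
        for name, (shape, dtype) in summarize_npz(path).items():
            print(f"{name}: shape={shape}, dtype={dtype}")
    else:
        print(f"{path} not found; run test/test_NN_genki.py first")
```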
3. Run test/test_all_genki.py
The task is the same as before (accurate facial expression classification and inaccurate gender classification).
The script trains and compares several private and non-private algorithms for the same task, including a linear minimax filter.
This script also runs for a few minutes on a desktop. You should see results similar to the following.
rand: d=10, trial=0, rate1=0.705000, rate2=0.705000
pca: d=10, trial=0, rate1=0.840000, rate2=0.665000
pls: d=10, trial=0, rate1=0.850000, rate2=0.685000
alt: rho=10.000000, d=10, trial=0, rate1=0.825000, rate2=0.520000
Here 'alt' is the linear minimax filter, 'rate1' is the accuracy of expression classification, and 'rate2' is the accuracy of gender classification.
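To compare methods programmatically, the printed lines can be parsed into dictionaries. The parser below is a hypothetical helper that assumes the `name: key=value, key=value, ...` layout of the sample output above; it is not part of the repository.

```python
import re

LINE_RE = re.compile(r"^(?P<method>[\w-]+):\s*(?P<fields>.+)$")

def parse_number(text):
    """Parse an int when possible, otherwise a float."""
    try:
        return int(text)
    except ValueError:
        return float(text)

def parse_result_line(line):
    """Turn 'alt: rho=10.0, d=10, ...' into {'method': 'alt', 'rho': 10.0, 'd': 10, ...}."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized result line: {line!r}")
    result = {"method": match.group("method")}
    for item in match.group("fields").split(","):
        key, _, value = item.strip().partition("=")
        result[key] = parse_number(value)
    return result

# Sample output lines copied from this README.
SAMPLE = """\
rand: d=10, trial=0, rate1=0.705000, rate2=0.705000
pca: d=10, trial=0, rate1=0.840000, rate2=0.665000
pls: d=10, trial=0, rate1=0.850000, rate2=0.685000
alt: rho=10.000000, d=10, trial=0, rate1=0.825000, rate2=0.520000
"""

if __name__ == "__main__":
    rows = [parse_result_line(line) for line in SAMPLE.splitlines()]
    # Rank by the gap between target-task accuracy and adversary accuracy.
    for row in sorted(rows, key=lambda r: r["rate1"] - r["rate2"], reverse=True):
        print(f"{row['method']:>5}: rate1={row['rate1']:.3f}, rate2={row['rate2']:.3f}")
```

Sorting by `rate1 - rate2` is only one possible summary. It rewards methods that keep expression accuracy high while pushing gender accuracy toward chance.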
Reference
- J. Hamm, "Preserving privacy of continuous high-dimensional data with minimax filters," AISTATS, 2015
- J. Hamm, "Minimax Filter: Learning to Preserve Privacy from Inference Attacks," arXiv:1610.03577, 2016
License
Released under the Apache License 2.0. See the LICENSE.txt file for further details.
