Mesa
[NeurIPS’20] ⚖️ Build powerful ensemble class-imbalanced learning models via a meta-knowledge-powered resampler. | Design a meta-knowledge-driven sampler to solve the class-imbalance problem
MESA is a meta-learning-based ensemble learning framework for solving class-imbalanced learning problems. It is a task-agnostic general-purpose solution that is able to boost most of the existing machine learning models' performance on imbalanced data.
Cite Us
If you find this repository helpful in your work or research, we would greatly appreciate citations to the following paper:
@inproceedings{liu2020mesa,
  title={MESA: Boost Ensemble Imbalanced Learning with MEta-SAmpler},
  author={Liu, Zhining and Wei, Pengfei and Jiang, Jing and Cao, Wei and Bian, Jiang and Chang, Yi},
  booktitle={Conference on Neural Information Processing Systems},
  year={2020},
}
Table of Contents
- Cite Us
- Table of Contents
- Background
- Requirements
- Usage
- Visualization and Results
- Miscellaneous
- References
Background
About MESA
We introduce a novel ensemble imbalanced learning (EIL) framework named MESA. It adaptively resamples the training set over iterations to obtain multiple classifiers and forms a cascade ensemble model. Rather than following random heuristics, MESA directly learns a parameterized sampling strategy (i.e., a meta-sampler) from data to optimize the final metric. It consists of three parts: meta-sampling and ensemble training, which build the ensemble classifier, and meta-training, which optimizes the meta-sampler.
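To make the three-part pipeline concrete, here is a minimal runnable sketch of the iterative ensemble-training loop. It assumes binary labels (1 = minority class), and random strictly balanced under-sampling stands in for the learned meta-sampler; all names are illustrative, not the repository's API.

```python
import numpy as np

def balanced_indices(y, rng):
    """Strictly balanced under-sampling: keep all minority instances and
    an equally sized random subset of the majority class."""
    minority = np.where(y == 1)[0]
    majority = np.where(y == 0)[0]
    picked = rng.choice(majority, size=len(minority), replace=False)
    return np.concatenate([minority, picked])

def train_ensemble(X, y, fit_base_learner, n_estimators=10, seed=0):
    """Each iteration re-samples the training set (here: random balanced
    under-sampling standing in for the meta-sampler, which in MESA is
    conditioned on the current error state) and fits one base learner;
    the members together form the cascade ensemble."""
    rng = np.random.default_rng(seed)
    ensemble = []
    for _ in range(n_estimators):
        idx = balanced_indices(y, rng)
        ensemble.append(fit_base_learner(X[idx], y[idx]))
    return ensemble
```

In the real framework the sampling step is driven by the trained meta-sampler rather than a uniform random choice; the loop structure is the same.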
The figure below gives an overview of the MESA framework.

Pros and Cons of MESA
Here are some personal thoughts on the advantages and disadvantages of MESA. More discussions are welcome!
Pros:
- 🍎 Wide compatibility.
  MESA decouples model training from meta-training, making it compatible with most existing machine learning models.
- 🍎 High data efficiency.
  MESA performs strictly balanced under-sampling to train each base learner in the ensemble. This makes it more data-efficient than other methods, especially on highly skewed datasets.
- 🍎 Good performance.
  The sampling strategy is optimized for final generalization performance, so we expect it to yield a better ensemble model.
- 🍎 Transferability.
  Only task-agnostic meta-information is used during meta-training, so a meta-sampler can be applied directly to unseen new tasks, greatly reducing the computational cost brought about by meta-training.
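As an illustration of how the cascade ensemble turns its members into one prediction, the following hypothetical helper (not the repository's API) soft-votes by averaging the positive-class probability across base learners trained on different balanced subsets:

```python
import numpy as np

def ensemble_predict_proba(ensemble, X):
    """Average the positive-class probability over all base learners.
    Assumes each member exposes a scikit-learn-style predict_proba
    returning an (n_samples, 2) array."""
    probs = [clf.predict_proba(X)[:, 1] for clf in ensemble]
    return np.mean(probs, axis=0)
```

This mirrors the usage example later in this README, where `mesa.predict_proba(X_test)[:, 1]` yields the positive-class scores for evaluation.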
Cons:
- 🍏 Meta-training cost.
  Meta-training repeats the ensemble training process many times, which can be costly in practice. Shrinking the dataset used in meta-training reduces the computational cost at the price of a minor performance loss.
- 🍏 Requires a separate validation set for training.
  The meta-state is formed by computing the error distribution on both the training and validation sets.
- 🍏 Possibly unstable performance on small datasets.
  Small datasets may yield inaccurate or unstable error-distribution statistics, which interferes with the meta-training process.
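The meta-state mentioned in the cons above can be pictured as follows. This is an illustrative sketch (the function name and binning scheme are assumptions, not the repository's exact implementation) of forming a state vector from the error distributions on the training and validation sets; with few samples per bin, these histograms become noisy, which is the instability noted above.

```python
import numpy as np

def meta_state(errors_train, errors_valid, n_bins=10):
    """Concatenate histograms of per-instance prediction errors (in [0, 1])
    computed on the training and validation sets."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    h_tr, _ = np.histogram(errors_train, bins=bins)
    h_va, _ = np.histogram(errors_valid, bins=bins)
    # normalize to frequencies so the state does not depend on dataset size
    return np.concatenate([h_tr / max(len(errors_train), 1),
                           h_va / max(len(errors_valid), 1)])
```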
Requirements
Main dependencies:
- Python (>=3.5)
- PyTorch (=1.0.0)
- Gym (>=0.17.3)
- pandas (>=0.23.4)
- numpy (>=1.11)
- scikit-learn (>=0.20.1)
- imbalanced-learn (=0.5.0, optional, for baseline methods)
To install requirements, run:
pip install -r requirements.txt
NOTE: this implementation requires an old version of PyTorch (v1.0.0). You may want to create a new conda environment to run our code. A step-by-step guide follows (using torch-cpu as an example):
conda create --name mesa python=3.7.11
conda activate mesa
conda install pytorch-cpu==1.0.0 torchvision-cpu==0.2.1 cpuonly -c pytorch
pip install -r requirements.txt
These commands should get you ready to run MESA. If you have any further questions, please feel free to open an issue or drop me an email.
Usage
A typical usage example:
from sklearn.tree import DecisionTreeClassifier
# Mesa, Rater, load_dataset, and parser are provided by this repository

# load dataset & prepare environment
args = parser.parse_args()
rater = Rater(args.metric)
X_train, y_train, X_valid, y_valid, X_test, y_test = load_dataset(args.dataset)
base_estimator = DecisionTreeClassifier()

# meta-training
mesa = Mesa(
    args=args,
    base_estimator=base_estimator,
    n_estimators=10)
mesa.meta_fit(X_train, y_train, X_valid, y_valid, X_test, y_test)
# ensemble training
mesa.fit(X_train, y_train, X_valid, y_valid)
# evaluate
y_pred_test = mesa.predict_proba(X_test)[:, 1]
score = rater.score(y_test, y_pred_test)
Running main.py
Here is an example:
python main.py --dataset Mammo --meta_verbose 10 --update_steps 1000
You can get help with arguments by running:
python main.py --help
optional arguments:
# Soft Actor-critic Arguments
-h, --help show this help message and exit
--env-name ENV_NAME
--policy POLICY Policy Type: Gaussian | Deterministic (default:
Gaussian)
--eval EVAL Evaluates a policy every 10 episode (default:
True)
--gamma G discount factor for reward (default: 0.99)
--tau G target smoothing coefficient (τ) (default: 0.01)
--lr G learning rate (default: 0.001)
--lr_decay_steps N step_size of StepLR learning rate decay scheduler
(default: 10)
--lr_decay_gamma N gamma of StepLR learning rate decay scheduler
(default: 0.99)
--alpha G Temperature parameter α determines the relative
importance of the entropy term against the reward
(default: 0.1)
--automatic_entropy_tuning G
Automatically adjust α (default: False)
--seed N random seed (default: None)
--batch_size N batch size (default: 64)
--hidden_size N hidden size (default: 50)
--updates_per_step N model updates per simulator step (default: 1)
--update_steps N maximum number of steps (default: 1000)
--start_steps N Steps sampling random actions (default: 500)
--target_update_i