Chameleon
Parametric models, and particularly neural networks, require weight initialization as a starting point for gradient-based optimization. In most current practice, this is accomplished with some form of random initialization. Instead, recent work shows that a specific initial parameter set can be learned from a population of tasks, i.e., dataset and target variable pairs for supervised learning. Using this learned initial parameter set leads to faster convergence on new tasks (model-agnostic meta-learning). Currently, methods for learning model initializations are limited to populations of tasks sharing the same schema, i.e., the same number, order, type, and semantics of predictor and target variables. In this paper, we address the problem of meta-learning parameter initialization across tasks with different schemas, i.e., where the number of predictors varies across tasks while the tasks still share some variables. We propose Chameleon, a model that learns to align different predictor schemas to a common representation, using permutations and masks of the predictors of the training tasks at hand. In experiments on real-life datasets, we show that Chameleon can successfully learn parameter initializations across tasks with different schemas, providing an average accuracy lift of 26% over random initialization and 5% over a state-of-the-art method for fixed-schema model initializations. To the best of our knowledge, this is the first work on learning model initializations across tasks with different schemas.
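A minimal sketch of the schema-alignment idea follows. This is illustrative only and not the authors' implementation: tasks with fewer predictors are zero-padded (masked) up to a common width, and a permutation-style matrix, which Chameleon learns with a network but which is a fixed placeholder here, reorders the columns into a shared layout.

import numpy as np

rng = np.random.default_rng(0)
K = 8  # size of the common schema (cf. --max_feats below)

def align(task_features, alignment):
    """Map a task with f <= K predictors onto the common K-column layout:
    absent predictors are masked out (zero-padded), then a (soft)
    permutation matrix reorders the columns."""
    n, f = task_features.shape
    padded = np.zeros((n, K))
    padded[:, :f] = task_features  # mask: missing predictors stay zero
    return padded @ alignment      # realign columns to the shared layout

# Two tasks with different schemas: 5 vs. 7 predictors.
task_a = rng.normal(size=(32, 5))
task_b = rng.normal(size=(32, 7))

# In Chameleon the alignment matrix is produced by a learned network;
# the identity here is only a stand-in.
identity = np.eye(K)
print(align(task_a, identity).shape)  # (32, 8)
print(align(task_b, identity).shape)  # (32, 8)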
Chameleon V2
Created by: Lukas Brinkmeyer and Rafael Rego Drumond
CREDITS
This code is built on top of Reptile's original implementation from:
Alex Nichol, Joshua Achiam, John Schulman
Website: https://openai.com/blog/reptile/
Git: https://github.com/openai/supervised-reptile
Paper: https://arxiv.org/abs/1803.02999
BIBTEX:
@article{nichol2018first,
  title={On first-order meta-learning algorithms},
  author={Nichol, Alex and Achiam, Joshua and Schulman, John},
  journal={CoRR, abs/1803.02999},
  volume={2},
  year={2018}
}
If you use our code, please reference the paper above and our paper:
Chameleon: Learning Model Initializations Across Tasks With Different Schemas
Lukas Brinkmeyer and Rafael Rego Drumond
BIBTEX:
@misc{brinkmeyer2019chameleon,
  title={Chameleon: Learning Model Initializations Across Tasks With Different Schemas},
  author={Lukas Brinkmeyer and Rafael Rego Drumond and Randolf Scholz and Josif Grabocka and Lars Schmidt-Thieme},
  year={2019},
  eprint={1909.13576},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}
DOWNLOADING DATASETS:
Most of the datasets used are available on OpenML.
Run the script openmldataset/openml_download.py to download the datasets used in the paper.
Each dataset folder will then contain features.npy and labels.npy. If you want combined experiments, copy the files from another dataset's folder as features_test.npy and labels_test.npy; these will be used as the meta-test set. A loading sketch follows below.
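A minimal loading sketch, assuming the directory layout described above; the codrna folder (the --dataset default) is used only as an example.

import os
import numpy as np

data_dir = "./Data/selected/codrna"  # cf. --data_dir and --dataset defaults
features = np.load(os.path.join(data_dir, "features.npy"))
labels = np.load(os.path.join(data_dir, "labels.npy"))
print(features.shape, labels.shape)

# For combined experiments, the meta-test task is read from the same folder:
#   features_test.npy / labels_test.npy (copied over from another dataset).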
RECOMMENDED PACKAGES:
You can check our recommended packages in the file recommended.txt
ARGUMENTS:
Run run.py to start the code; the following arguments are available.
The resulting learning curves will be saved in the results folder.
--seed SEED random seed (default: 0)
--checkpoint CHECKPOINT
checkpoint directory (default: model_checkpoint)
--save_path SAVE_PATH
directory where results are saved (default: exp_1581008235)
--num_jobs NUM_JOBS Number of jobs to run in parallel (default: 5)
--inner_batch INNER_BATCH
inner batch size (default: 30)
--inner_iters INNER_ITERS
inner iterations (default: 10)
--learning_rate LEARNING_RATE
Adam step size (default: 0.0001)
--meta_step META_STEP
meta-training step size (default: 0.01)
--meta_batch META_BATCH
meta-training batch size (default: 1)
--meta_iters META_ITERS
meta-training iterations (default: 15001)
--min_feats MIN_FEATS
Min number of features (default: 4)
--max_feats MAX_FEATS
Max number of features (default: 8)
--freeze FREEZE whether a permuting network is added (default: False)
--conv_layers CONV_LAYERS
Number and size of conv layers (default: [8,16,14])
--base_layers BASE_LAYERS
Number and size of base layers (default: [64,64])
--perm_epochs PERM_EPOCHS
training epochs for permuter (default: 501)
--perm_lr PERM_LR permuter learning rate (default: 0.0001)
--num_test_features NUM_TEST_FEATURES
Number of features held out for testing (default: 0)
--test_feat_ratio TEST_FEAT_RATIO
Ratio of feature split for train test (default: 0.0)
--name NAME name add-on (default: Model_config-1581008235)
--dataset DATASET data set to evaluate on (default: codrna)
--data_dir DATA_DIR Path to datasets (default: ./Data/selected)
--config CONFIG json config file (default: None)
For a quick test run, you can pass reduced settings, e.g.:
python run.py --inner_iters 5 --meta_iters 5 --perm_epochs 5
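Alternatively, the --config option accepts a JSON file instead of command-line flags. The accepted keys are not documented here; the sketch below assumes they mirror the flag names above, so verify against the parser in run.py before relying on it.

import json

# Assumption: config keys mirror the command-line flag names above.
config = {
    "dataset": "codrna",
    "inner_iters": 5,
    "meta_iters": 5,
    "perm_epochs": 5,
}
with open("quick_test.json", "w") as f:
    json.dump(config, f, indent=2)

# Then run: python run.py --config quick_test.json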
