TopoFilter
NeurIPS 2020, "A Topological Filter for Learning with Label Noise".
A Topological Filter for Learning with Label Noise (NeurIPS 2020, Paper)

Requirements
- PyTorch 0.4.1 (have not tested on other versions)
- Python 3.6 (for the purpose of compiling C++ code. Other 3.x versions should also work.)
- scipy 1.1.0 (this is due to the computation of distribution mode)
- termcolor, etc (which can be easily installed with pip)
Usage
- Compile the C++ code for computing the connected components. In folder ref, run ./compile_pers_lib.sh (by default it requires Python 3.6. If you are using other Python versions, modify the command inside compile_pers_lib.sh).
- Run train.py with the commands like below:
python train.py --every 5 --start_clean 30 --k_cc 4 --k_outlier 32 --seed 77 --type uniform --noise 0.4 --patience 65 --gpus 0 --dataset cifar10 --zeta 0.5
- For the point cloud dataset, run the command with the pc argument:
python train.py --gpus 2 --every 5 --start_clean 10 --k_outlier 30 --k_cc 100 --noise 0.8 --type uniform --patience 60 --seed 77 --dataset pc --net pc --milestone 35 --zeta 2
Here the major parameters are:
- every: the frequency of data collection.
- start_clean: when to start data collection.
- k_cc: the parameter for computing the KNN graph when finding the largest connected component.
- k_outlier: the parameter for computing the KNN graph when applying zeta filtering.
- seed: the random seed.
- type: the noise type. Options include uniform and asym.
- noise: the noise level.
- patience: this is a trick to save training time. If we observe no obvious improvement of validation accuracy for a consecutive number of N epochs, we stop the training.
- gpus: run on which GPU.
- dataset: which dataset to use. Options include cifar10, cifar100 and pc. For the pc dataset, it can be downloaded from https://github.com/charlesq34/pointnet
- zeta: the parameter for zeta filtering. Note that, when setting zeta to be > 1.0, we will use majority voting to remove the outliers. This sometimes achieves better performance.
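The patience rule described above can be sketched in a few lines. This is only an illustration of the general early-stopping idea, not the code in train.py: the function name stop_epoch and the min_delta threshold (a stand-in for "no obvious improvement") are assumptions made for this example.

```python
def stop_epoch(val_accuracies, patience, min_delta=0.0):
    """Return (epoch, best_accuracy) at which a patience rule would stop.

    Hypothetical helper: `min_delta` is an assumed threshold for what
    counts as an improvement; the repository may define it differently.
    """
    best = float("-inf")
    stale = 0  # consecutive epochs without improvement
    for epoch, acc in enumerate(val_accuracies):
        if acc > best + min_delta:
            best = acc
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch, best
    # Patience never ran out: training ran through the last epoch.
    return len(val_accuracies) - 1, best


# Toy validation curve, invented for the example.
history = [0.50, 0.60, 0.61, 0.60, 0.61, 0.59, 0.62]
print(stop_epoch(history, patience=3))  # (5, 0.61)
```

With patience=3 the run stops at epoch 5, before the later improvement at epoch 6 is seen, which is the trade-off a small patience value makes for shorter training.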
Practical tips: For extremely noisy scenarios (noise level >= 0.8), we observe that setting a larger k_cc works better.
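To make the roles of k_cc, k_outlier and zeta more concrete, here is a minimal, dependency-free sketch of the pipeline the parameter descriptions suggest: build a KNN graph, keep the largest connected component of each class, then drop points whose neighbours disagree with their label. It is one plausible reading under stated assumptions, not the repository's implementation: the function names, the brute-force neighbour search on raw coordinates, the symmetrised graph, and the exact agreement rule (fraction of same-label neighbours >= zeta, or majority vote when zeta > 1.0) are all assumptions.

```python
from collections import Counter, deque


def knn_indices(points, k):
    """Indices of the k nearest neighbours of every point (brute force)."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(p, q)), j)
            for j, q in enumerate(points) if j != i
        )
        result.append([j for _, j in dists[:k]])
    return result


def largest_component(nodes, neighbours):
    """Largest connected component of the KNN graph induced on `nodes`."""
    nodes = set(nodes)
    adj = {i: set() for i in nodes}
    for i in nodes:
        for j in neighbours[i]:
            if j in nodes:  # assumption: edges are treated as undirected
                adj[i].add(j)
                adj[j].add(i)
    best, seen = set(), set()
    for start in sorted(nodes):
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        while queue:  # breadth-first search
            u = queue.popleft()
            for v in adj[u] - comp:
                comp.add(v)
                queue.append(v)
        seen |= comp
        if len(comp) > len(best):
            best = comp
    return best


def topo_filter_sketch(points, labels, k_cc, k_outlier, zeta):
    """Return sorted indices of points kept as 'clean' (illustrative only)."""
    nn_cc = knn_indices(points, k_cc)
    kept = set()
    for c in set(labels):
        members = [i for i, y in enumerate(labels) if y == c]
        kept |= largest_component(members, nn_cc)
    nn_out = knn_indices(points, k_outlier)
    clean = []
    for i in sorted(kept):
        nbr_labels = [labels[j] for j in nn_out[i]]
        if zeta > 1.0:  # assumed majority-voting rule
            ok = Counter(nbr_labels).most_common(1)[0][0] == labels[i]
        else:  # assumed fraction-of-agreeing-neighbours rule
            ok = sum(y == labels[i] for y in nbr_labels) / len(nbr_labels) >= zeta
        if ok:
            clean.append(i)
    return clean


# Toy data: two well-separated clusters plus one mislabeled point (index 8).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (0.5, 0.5)]
labels = [0, 0, 0, 0, 1, 1, 1, 1, 1]
print(topo_filter_sketch(points, labels, k_cc=2, k_outlier=3, zeta=0.5))
# [0, 1, 2, 3, 4, 5, 6, 7]
```

In this toy set the mislabeled point sits inside the class-0 cluster, so it is disconnected from the class-1 component and is dropped at the connected-component step. In the actual method the graph would be built on learned features rather than raw coordinates.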
Our code will be further improved to make it cleaner and easier to use.
Reference:
@inproceedings{wu2020topological,
title={A Topological Filter for Learning with Label Noise},
author={Wu, Pengxiang and Zheng, Songzhu and Goswami, Mayank and Metaxas, Dimitris and Chen, Chao},
booktitle={Advances in Neural Information Processing Systems},
year={2020}
}
