# OneNIP
Official PyTorch implementation of *Learning to Detect Multi-class Anomalies with Just One Normal Image Prompt*, accepted at ECCV 2024.

OneNIP consists of three components: Unsupervised Reconstruction, Unsupervised Restoration, and a Supervised Refiner. Unsupervised Reconstruction and Unsupervised Restoration share the same encoder-decoder architecture and weights. The Supervised Refiner is implemented with two transposed-convolution blocks, each followed by a 1×1 convolution layer.
- Unsupervised Reconstruction reconstructs normal tokens;
- Unsupervised Restoration restores pseudo anomaly tokens to the corresponding normal tokens;
- Supervised Refiner refines reconstruction/restoration errors to achieve more accurate anomaly segmentation.
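Based on the description above, a minimal PyTorch sketch of such a refiner is shown below. The channel sizes, upsampling factors, and class name are assumptions for illustration, not the repository's exact configuration:

```python
import torch
import torch.nn as nn

class Refiner(nn.Module):
    """Hypothetical sketch: two transposed-conv blocks, each followed by a 1x1 conv,
    mapping reconstruction/restoration errors to an upsampled segmentation map."""

    def __init__(self, in_ch=256, mid_ch=128, out_ch=2):
        super().__init__()
        self.block1 = nn.Sequential(
            nn.ConvTranspose2d(in_ch, mid_ch, kernel_size=2, stride=2),  # 2x upsample
            nn.Conv2d(mid_ch, mid_ch, kernel_size=1),                    # 1x1 conv
            nn.ReLU(inplace=True),
        )
        self.block2 = nn.Sequential(
            nn.ConvTranspose2d(mid_ch, mid_ch, kernel_size=2, stride=2),  # 2x upsample
            nn.Conv2d(mid_ch, out_ch, kernel_size=1),                     # 1x1 conv
        )

    def forward(self, err):
        # err: (B, C, H, W) feature-space error map -> (B, out_ch, 4H, 4W) logits
        return self.block2(self.block1(err))
```

Each transposed convolution with `kernel_size=2, stride=2` doubles the spatial resolution, so the refiner upsamples the error map by 4× overall.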
## 1. Comparisons of OneNIP and UniAD

## 2. Results and checkpoints-v2

All pre-trained model weights are stored in Google Drive.

| Dataset | Input Resolution | I-AUROC | P-AUROC | P-AUAP | checkpoints-v2 | Test Log |
| ------ | ------ | ------ | ------ | ------ | ------ | ------ |
| Medical | 224 $\times$ 224 | 84.1 | 97.3 | 46.1 | model weight | testlog |
| MVTec | 224 $\times$ 224 | 98.0 | 97.9 | 63.9 | model weight | testlog |
| MVTec | 256 $\times$ 256 | 97.9 | 97.9 | 64.8 | model weight | testlog |
| MVTec | 320 $\times$ 320 | 98.4 | 98.0 | 66.7 | model weight | testlog |
| VisA | 224 $\times$ 224 | 92.8 | 98.7 | 42.5 | model weight | testlog |
| VisA | 256 $\times$ 256 | 93.4 | 98.9 | 44.9 | model weight | testlog |
| VisA | 320 $\times$ 320 | 94.8 | 98.9 | 46.1 | model weight | testlog |
| BTAD | 224 $\times$ 224 | 93.2 | 97.4 | 56.3 | model weight | testlog |
| BTAD | 256 $\times$ 256 | 95.2 | 97.6 | 57.7 | model weight | testlog |
| BTAD | 320 $\times$ 320 | 96.0 | 97.8 | 58.6 | model weight | testlog |
| MVTec+VisA+BTAD | 224 $\times$ 224 | 94.6 | 98.0 | 53.5 | model weight | testlog |
| MVTec+VisA+BTAD | 256 $\times$ 256 | 94.9 | 98.0 | 53.1 | model weight | testlog |
| MVTec+VisA+BTAD | 320 $\times$ 320 | 95.6 | 97.9 | 54.1 | model weight | testlog |
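For reference, the three metrics in the table (image-level AUROC, pixel-level AUROC, and pixel-level average precision) can be computed with scikit-learn. The helper names below are illustrative, not part of this repository:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

def image_auroc(image_labels, image_scores):
    """I-AUROC: one anomaly score per image vs. binary image labels."""
    return roc_auc_score(image_labels, image_scores)

def pixel_auroc(gt_masks, anomaly_maps):
    """P-AUROC: AUROC over all pixels of all test images."""
    return roc_auc_score(np.ravel(gt_masks), np.ravel(anomaly_maps))

def pixel_auap(gt_masks, anomaly_maps):
    """P-AUAP: area under the pixel-level precision-recall curve."""
    return average_precision_score(np.ravel(gt_masks), np.ravel(anomaly_maps))
```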
## 3. Evaluation and Training
### 3.1 Prepare data
Download the MVTec, BTAD, VisA, and DTD datasets, then unzip and move them to `./data`. The data directory should look as follows:
```
├── data
│   ├── btad
│   │   ├── 01
│   │   ├── 02
│   │   ├── 03
│   │   ├── test.json
│   │   ├── train.json
│   ├── dtd
│   │   ├── images
│   │   ├── imdb
│   │   ├── labels
│   ├── mvtec
│   │   ├── bottle
│   │   ├── cable
│   │   ├── ...
│   │   └── zipper
│   │   ├── test.json
│   │   ├── train.json
│   ├── mvtec+btad+visa
│   │   ├── 01
│   │   ├── bottle
│   │   ├── ...
│   │   └── zipper
│   │   ├── test.json
│   │   ├── train.json
│   ├── visa
│   │   ├── candle
│   │   ├── capsules
│   │   ├── ...
│   │   ├── pipe_fryum
│   │   ├── test.json
│   │   ├── train.json
```
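A quick sanity check that the expected dataset folders are in place could look like this (the `missing_datasets` helper is illustrative, not part of the repository; the folder names come from the tree above):

```python
from pathlib import Path

# Top-level dataset folders expected under ./data, per the layout above.
DATASETS = ["btad", "dtd", "mvtec", "mvtec+btad+visa", "visa"]

def missing_datasets(root="./data"):
    """Return the expected dataset folders that are absent under `root`."""
    root = Path(root)
    return [d for d in DATASETS if not (root / d).is_dir()]
```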
### 3.2 Evaluation with pre-trained checkpoints-v2

Download the pre-trained checkpoints-v2 to `./checkpoints-v2`, then run:

```shell
cd ./exps
bash eval_onenip.sh 8 0,1,2,3,4,5,6,7
```
### 3.3 Training OneNIP

```shell
cd ./exps
bash train_onenip.sh 8 0,1,2,3,4,5,6,7
```
## Citing

If you find this code useful in your research, please consider citing:
```bibtex
@inproceedings{gao2024onenip,
  title     = {Learning to Detect Multi-class Anomalies with Just One Normal Image Prompt},
  author    = {Gao, Bin-Bin},
  booktitle = {18th European Conference on Computer Vision (ECCV 2024)},
  pages     = {-},
  year      = {2024}
}
```
## Acknowledgement

OneNIP is built on UniAD. Thanks to the authors of UniAD for open-sourcing their implementation!
## Star History