MKT
Official implementation of "Open-Vocabulary Multi-Label Classification via Multi-Modal Knowledge Transfer".
Open-Vocabulary Multi-Label Classification via Multi-Modal Knowledge Transfer (AAAI 2023 Oral)

This is the official repository of our paper Open-Vocabulary Multi-Label Classification via Multi-modal Knowledge Transfer.
Setup
pip install -r requirements.txt
Preparation
- Download pretrained VLP (ViT-B/16) model from OpenAI CLIP.
- Download images of NUS-WIDE dataset from NUS-WIDE.
- Download other files from here.
The dataset directory is organized as follows.
NUS-WIDE
├── features
├── Flickr
├── Concepts81.txt
├── Concepts925.txt
├── img_names.pkl
├── label_emb.pt
└── test_img_names.pkl
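Before launching training, it can help to confirm that the dataset directory matches the layout above. The following is a minimal sketch, not part of the MKT codebase: the helper name `missing_entries` and the constant `EXPECTED_ENTRIES` are hypothetical, and the entry names are copied from the tree above. It only checks that each entry exists, not what it contains.

```python
from pathlib import Path

# Entry names copied from the directory tree above. This sketch only checks
# that each entry exists; it does not check file contents or whether an
# entry is a file or a directory.
EXPECTED_ENTRIES = [
    "features",
    "Flickr",
    "Concepts81.txt",
    "Concepts925.txt",
    "img_names.pkl",
    "label_emb.pt",
    "test_img_names.pkl",
]


def missing_entries(data_path):
    """Return the expected entries that are absent from data_path (hypothetical helper)."""
    root = Path(data_path)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]
```

Calling `missing_entries("path_to_dataset")` returns an empty list when every entry is present, so the result can be used as a quick guard before passing the same path to `--data-path`.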
Training MKT on NUS-WIDE
python3 train_nus_first_stage.py \
--data-path path_to_dataset \
--clip-path path_to_clip_model
The checkpoint of the first training stage is here.
python3 -m torch.distributed.launch --nproc_per_node=8 train_nus_second_stage.py \
--data-path path_to_dataset \
--clip-path path_to_clip_model \
--ckpt-path path_to_first_stage_ckpt
The checkpoint of the second training stage is here.
Testing MKT on NUS-WIDE
python3 train_nus_second_stage.py --eval \
--data-path path_to_dataset \
--clip-path path_to_clip_model \
--ckpt-path path_to_first_stage_ckpt \
--eval-ckpt path_to_second_stage_ckpt
Inference on A Single Image
python3 inference.py \
--data-path path_to_dataset \
--clip-path path_to_clip_model \
--img-ckpt path_to_first_stage_ckpt \
--txt-ckpt path_to_second_stage_ckpt \
--image-path figures/test.jpg
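To run inference over several images from a script, the command above can be assembled programmatically. This is a rough sketch under stated assumptions: `build_inference_command` is a hypothetical helper that is not part of the repository, and it uses only the flags shown in the command above. It builds the argument list and does not run anything itself.

```python
import shlex


def build_inference_command(data_path, clip_path, img_ckpt, txt_ckpt, image_path):
    """Assemble the argv list for inference.py (hypothetical helper).

    Only the flags from the README command are used; nothing is executed here.
    """
    return [
        "python3", "inference.py",
        "--data-path", str(data_path),
        "--clip-path", str(clip_path),
        "--img-ckpt", str(img_ckpt),
        "--txt-ckpt", str(txt_ckpt),
        "--image-path", str(image_path),
    ]


# Placeholder values taken from the README command.
command = build_inference_command(
    "path_to_dataset",
    "path_to_clip_model",
    "path_to_first_stage_ckpt",
    "path_to_second_stage_ckpt",
    "figures/test.jpg",
)
# shlex.join quotes each argument so the printed line can be pasted into a shell.
print(shlex.join(command))
```

Passing the list to `subprocess.run(command, check=True)` from the repository root would then launch the script. Keeping the arguments as a list avoids shell-quoting problems when a path contains spaces.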
Acknowledgement
We would like to thank BiAM and timm for the codebase.
License
MKT is MIT-licensed. The license applies to the pre-trained models as well.
Citation
Please consider citing MKT in your publications if it helps your research.
@article{he2022open,
title={Open-Vocabulary Multi-Label Classification via Multi-modal Knowledge Transfer},
author={He, Sunan and Guo, Taian and Dai, Tao and Qiao, Ruizhi and Ren, Bo and Xia, Shu-Tao},
journal={arXiv preprint arXiv:2207.01887},
year={2022}
}
