ICAN
[BMVC 2018] iCAN: Instance-Centric Attention Network for Human-Object Interaction Detection
This repository is no longer actively maintained. Please refer to our ECCV 2020 work DRG for a stronger HOI detection framework in PyTorch.
iCAN: Instance-Centric Attention Network for Human-Object Interaction Detection
Official TensorFlow implementation for iCAN: Instance-Centric Attention Network for Human-Object Interaction Detection.
See the project page for more details. Please contact Chen Gao (chengao@vt.edu) if you have any questions.
<img src='misc/HOI.gif'>
Prerequisites
This codebase was developed and tested with Python 2.7, TensorFlow 1.1.0 or 1.2.0, CUDA 8.0, and Ubuntu 16.04.
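Because the tested versions are old, it can help to confirm the environment before running anything. The following is a minimal sketch, not part of the iCAN codebase: the helper name `check_environment` is hypothetical, and it only compares against the versions listed above.

```python
# Versions the README reports as tested.
TESTED_PYTHON = (2, 7)
TESTED_TENSORFLOW = ("1.1.0", "1.2.0")


def check_environment(python_version, tf_version):
    """Return a list of warnings for versions that differ from the tested ones.

    python_version: a tuple such as sys.version_info
    tf_version: a string such as tf.__version__
    """
    warnings = []
    if tuple(python_version[:2]) != TESTED_PYTHON:
        warnings.append("Python %d.%d was not tested" % tuple(python_version[:2]))
    if tf_version not in TESTED_TENSORFLOW:
        warnings.append("TensorFlow %s was not tested" % tf_version)
    return warnings
```

In a real session this could be called as `check_environment(sys.version_info, tf.__version__)` after importing `sys` and `tensorflow as tf`; an empty list means the environment matches the tested setup.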
Installation
- Clone the repository.
    git clone https://github.com/vt-vl-lab/iCAN.git
- Download the V-COCO and HICO-DET datasets. Set up V-COCO and the COCO API. Set up the HICO-DET evaluation code.
    chmod +x ./misc/download_dataset.sh
    ./misc/download_dataset.sh

    # Assume you cloned the repository to `iCAN_DIR'.
    # If you have downloaded V-COCO or HICO-DET dataset somewhere else, you can create a symlink
    # ln -s /path/to/your/v-coco/folder Data/
    # ln -s /path/to/your/hico-det/folder Data/
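The symlink option mentioned in the comments can also be scripted. This is a minimal sketch of the same `ln -s` step in Python; the helper name `link_dataset` is hypothetical and not part of the repository, and it assumes you run it from the repository root so that `Data` resolves correctly.

```python
import os


def link_dataset(src_dir, data_dir="Data"):
    """Symlink an already-downloaded dataset folder into data_dir.

    Hypothetical helper, equivalent to `ln -s /path/to/folder Data/`.
    Returns the path of the link inside data_dir.
    """
    src_dir = os.path.abspath(src_dir)
    if not os.path.isdir(src_dir):
        raise ValueError("dataset folder not found: %s" % src_dir)
    if not os.path.isdir(data_dir):
        os.makedirs(data_dir)
    # The link keeps the folder's own name, as `ln -s src Data/` does.
    link_path = os.path.join(data_dir, os.path.basename(src_dir))
    if not os.path.lexists(link_path):
        os.symlink(src_dir, link_path)
    return link_path
```

Calling the helper a second time with the same arguments leaves the existing link in place.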
Evaluate V-COCO and HICO-DET detection results
- Download detection results
    chmod +x ./misc/download_detection_results.sh
    ./misc/download_detection_results.sh
- Evaluate V-COCO detection results using iCAN
    python tools/Diagnose_VCOCO.py eval Results/300000_iCAN_ResNet50_VCOCO.pkl
- Evaluate V-COCO detection results using iCAN (Early fusion)
    python tools/Diagnose_VCOCO.py eval Results/300000_iCAN_ResNet50_VCOCO_Early.pkl
- Evaluate HICO-DET detection results using iCAN
    cd Data/ho-rcnn
    matlab -r "Generate_detection; quit"
    cd ../../
  Here we evaluate our best detection results under Results/HICO_DET/1800000_iCAN_ResNet50_HICO. If you want to evaluate a different detection result, please specify the filename in Data/ho-rcnn/Generate_detection.m accordingly.
Error diagnosis on V-COCO
- Diagnose V-COCO detection results using iCAN
    python tools/Diagnose_VCOCO.py diagnose Results/300000_iCAN_ResNet50_VCOCO.pkl
- Diagnose V-COCO detection results using iCAN (Early fusion)
    python tools/Diagnose_VCOCO.py diagnose Results/300000_iCAN_ResNet50_VCOCO_Early.pkl
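Before evaluating or diagnosing a result file, it can be useful to peek at what it contains. The sketch below assumes the `.pkl` files are standard Python pickles, as the extension suggests, and makes no assumption about their internal layout. The helper name `summarize_pickle` is hypothetical. It targets Python 3, where `encoding="latin1"` is needed to read pickles written by Python 2.

```python
import pickle


def summarize_pickle(path):
    """Load a pickle and return (top-level type name, length or None)."""
    with open(path, "rb") as f:
        # latin1 lets Python 3 read pickles that were written by Python 2.
        obj = pickle.load(f, encoding="latin1")
    size = len(obj) if hasattr(obj, "__len__") else None
    return type(obj).__name__, size
```

Unpickling can execute arbitrary code, so this should only be run on files from a source you trust.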
Training
- Download COCO pre-trained weights and training data
    chmod +x ./misc/download_training_data.sh
    ./misc/download_training_data.sh
- Train an iCAN on V-COCO
    python tools/Train_ResNet_VCOCO.py --model iCAN_ResNet50_VCOCO --num_iteration 300000
- Train an iCAN (Early fusion) on V-COCO
    python tools/Train_ResNet_VCOCO.py --model iCAN_ResNet50_VCOCO_Early --num_iteration 300000
- Train an iCAN on HICO-DET
    python tools/Train_ResNet_HICO.py --num_iteration 1800000
Testing
- Test an iCAN on V-COCO
    python tools/Test_ResNet_VCOCO.py --model iCAN_ResNet50_VCOCO --num_iteration 300000
- Test an iCAN (Early fusion) on V-COCO
    python tools/Test_ResNet_VCOCO.py --model iCAN_ResNet50_VCOCO_Early --num_iteration 300000
- Test an iCAN on HICO-DET
    python tools/Test_ResNet_HICO.py --num_iteration 1800000
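The V-COCO training and testing commands differ only in the script name and the `--model` value, and a test run is given the same `--num_iteration` as the matching training run. A minimal sketch that assembles these commands is shown below; the helper name `build_vcoco_command` is hypothetical, while the script paths, model names, and flags are copied from the commands above.

```python
# Model names used in the V-COCO commands of this README.
VCOCO_MODELS = ("iCAN_ResNet50_VCOCO", "iCAN_ResNet50_VCOCO_Early")


def build_vcoco_command(stage, model, num_iteration=300000):
    """Return the argument list for training or testing a V-COCO model.

    stage: "Train" or "Test", which selects tools/<stage>_ResNet_VCOCO.py
    """
    if stage not in ("Train", "Test"):
        raise ValueError("stage must be 'Train' or 'Test'")
    if model not in VCOCO_MODELS:
        raise ValueError("unknown model: %s" % model)
    script = "tools/%s_ResNet_VCOCO.py" % stage
    return ["python", script,
            "--model", model,
            "--num_iteration", str(num_iteration)]
```

The returned list can be passed to `subprocess.check_call` from the repository root, for example to train and then test both variants in a loop.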
Visualizing V-COCO detections
Check tools/Visualization.ipynb to see how to visualize the detection results.
Demo/Test on your own images
- For the best performance, we use Detectron as our object detector. For the purposes of a simple demo, this section uses tf-faster-rcnn instead.
- Clone and set up the tf-faster-rcnn repository.
    cd $iCAN_DIR
    chmod +x ./misc/setup_demo.sh
    ./misc/setup_demo.sh
- Put your own images in the demo/ folder.
- Detect all objects
    # images are saved in $iCAN_DIR/demo/
    python ../tf-faster-rcnn/tools/Object_Detector.py --img_dir demo/ --img_format png --Demo_RCNN demo/Object_Detection.pkl
- Detect all HOIs
    python tools/Demo.py --img_dir demo/ --Demo_RCNN demo/Object_Detection.pkl --HOI_Detection demo/HOI_Detection.pkl
- Check tools/Demo.ipynb to visualize the detection results.
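The two detection steps are linked by one file: the object detector writes `Object_Detection.pkl`, and `tools/Demo.py` reads it through `--Demo_RCNN`. A minimal sketch that chains the two steps is shown below. The function names `demo_commands` and `run_demo` are hypothetical; the script paths and flags are taken from the commands above, and the code assumes it is run from `$iCAN_DIR`.

```python
import os
import subprocess


def demo_commands(img_dir="demo/", img_format="png"):
    """Return the two demo commands in the order they must run."""
    det_pkl = os.path.join(img_dir, "Object_Detection.pkl")
    hoi_pkl = os.path.join(img_dir, "HOI_Detection.pkl")
    detect_objects = [
        "python", "../tf-faster-rcnn/tools/Object_Detector.py",
        "--img_dir", img_dir, "--img_format", img_format,
        "--Demo_RCNN", det_pkl,
    ]
    # The HOI step consumes the file written by the object detector.
    detect_hois = [
        "python", "tools/Demo.py",
        "--img_dir", img_dir,
        "--Demo_RCNN", det_pkl, "--HOI_Detection", hoi_pkl,
    ]
    return [detect_objects, detect_hois]


def run_demo(img_dir="demo/", img_format="png", runner=subprocess.check_call):
    """Run both steps; check_call raises on the first non-zero exit code."""
    for cmd in demo_commands(img_dir, img_format):
        runner(cmd)
```

The `runner` parameter exists so the command sequence can be inspected or logged without launching anything, for example by passing `print`.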
Citation
If you find this code useful for your research, please consider citing the following paper:
@inproceedings{gao2018ican,
author = {Gao, Chen and Zou, Yuliang and Huang, Jia-Bin},
title = {iCAN: Instance-Centric Attention Network for Human-Object Interaction Detection},
booktitle = {British Machine Vision Conference},
year = {2018}
}
Acknowledgement
The code is built upon tf-faster-rcnn. We thank Jinwoo Choi for the code review.
