FSAF

FSAF (Feature Selective Anchor-Free Module for Single-Shot Object Detection) implementation in Keras and TensorFlow


This is an implementation of FSAF in Keras and TensorFlow. The project is based on fizyr/keras-retinanet and the fsaf branch of zccstig/mmdetection. Thanks to their authors for the hard work.

As the authors write, the FSAF module can be plugged smoothly into any single-shot detector with an FPN-like structure. I have also tried it on YOLOv3: the anchor-free YOLOv3 (with FSAF) performs comparably to its anchor-based counterpart, but you no longer need to pre-compute the anchor sizes. It is also much better and faster than the RetinaNet-based version.
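To make the "feature selective" part concrete: in the FSAF paper, each ground-truth instance is assigned during training to the pyramid level whose anchor-free branch currently gives it the lowest loss. The sketch below shows only that selection step in plain Python. The function name select_pyramid_level and the loss values are invented for illustration and are not taken from this repository.

```python
def select_pyramid_level(losses_per_level):
    """Return the index of the pyramid level with the smallest loss.

    losses_per_level holds one combined (classification + regression) loss
    per FPN level for a single ground-truth instance. The values used below
    are hypothetical; real ones come from the anchor-free branches.
    """
    if not losses_per_level:
        raise ValueError("need at least one pyramid level")
    return min(range(len(losses_per_level)), key=losses_per_level.__getitem__)


# Illustrative losses for two instances on five levels (for example P3..P7).
instance_losses = [
    [1.9, 0.7, 1.2, 2.5, 3.1],  # smallest loss at index 1
    [2.8, 2.1, 1.6, 0.9, 1.4],  # smallest loss at index 3
]
selected = [select_pyramid_level(losses) for losses in instance_losses]
print(selected)
```

Because the level is chosen from the losses rather than from the box size, no anchor sizes or hand-written size-to-level rules are needed for this branch.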

Updates

[03/05/2020] The author of the paper has released a new paper, SAPD, which is based on FSAF. I have implemented it at xuannianz/SAPD.

Test

I trained on Pascal VOC2012 trainval.txt + Pascal VOC2007 train.txt, and validated on Pascal VOC2007 val.txt. There are 14041 images for training and 2510 images for validation.
The best evaluation results (score_threshold=0.05) on VOC2007 test are:

| backbone | mAP<sub>50</sub> |
| ---- | ---- |
| resnet50 | 0.7248 |
| resnet101 | 0.7652 |
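For readers unfamiliar with the two numbers in this evaluation setup, here is a minimal sketch under stated assumptions: detections scoring at or below score_threshold are discarded, and for mAP<sub>50</sub> a detection counts as a match when its IoU with a ground-truth box is at least 0.5. Boxes are assumed to be (x1, y1, x2, y2) tuples. The helper names and sample boxes are hypothetical, and this is not the code in utils/eval.py.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_by_score(detections, score_threshold=0.05):
    """Keep (box, score) pairs whose score is above the threshold."""
    return [(box, score) for box, score in detections if score > score_threshold]


# Made-up example: one ground-truth box and three detections.
ground_truth = (10, 10, 50, 50)
detections = [
    ((12, 12, 48, 52), 0.90),  # overlaps the ground truth well
    ((30, 30, 90, 90), 0.40),  # confident but poorly localized
    ((11, 9, 49, 51), 0.01),   # dropped by the score threshold
]
kept = filter_by_score(detections)
matches = [iou(box, ground_truth) >= 0.5 for box, _ in kept]
print(len(kept), matches)
```

A low threshold such as 0.05 keeps most detections, so the precision-recall curve behind the average precision covers the high-recall region as well.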

Pretrained models are here.
Baidu Netdisk (extraction code: rbrr)
Google Drive
Run python3 inference.py to test on your own image; set the image path and model path inside the script.

Train

build dataset (Pascal VOC; for other dataset types, please refer to fizyr/keras-retinanet)

Download VOC2007 and VOC2012, then copy all image files from VOC2007 to VOC2012.
Append VOC2007 train.txt to VOC2012 trainval.txt.
Overwrite VOC2012 val.txt with VOC2007 val.txt.
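The three steps above can be scripted. The sketch below assumes the usual VOC devkit layout (JPEGImages/ for images, ImageSets/Main/ for the split files), which the steps do not spell out. The function name is a placeholder. It performs only what the steps list, so check it against your own directory layout before running it.

```python
import shutil
from pathlib import Path


def merge_voc2007_into_voc2012(voc2007, voc2012):
    """Apply the three dataset-building steps to two VOC directories.

    Assumes each directory contains JPEGImages/ and ImageSets/Main/
    (an assumption about the layout; adjust the paths if yours differs).
    """
    voc2007, voc2012 = Path(voc2007), Path(voc2012)
    sets07 = voc2007 / "ImageSets" / "Main"
    sets12 = voc2012 / "ImageSets" / "Main"

    # Step 1: copy all image files from VOC2007 to VOC2012.
    for image in (voc2007 / "JPEGImages").iterdir():
        if image.is_file():
            shutil.copy2(image, voc2012 / "JPEGImages" / image.name)

    # Step 2: append VOC2007 train.txt to VOC2012 trainval.txt.
    train07 = (sets07 / "train.txt").read_text().split()
    trainval12 = (sets12 / "trainval.txt").read_text().split()
    (sets12 / "trainval.txt").write_text("\n".join(trainval12 + train07) + "\n")

    # Step 3: overwrite VOC2012 val.txt with VOC2007 val.txt.
    shutil.copyfile(sets07 / "val.txt", sets12 / "val.txt")
```

For example, calling merge_voc2007_into_voc2012("datasets/VOC2007", "datasets/VOC2012") would prepare the datasets/VOC2012 directory that is passed to train.py; the VOC2007 path here is only an example.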

train

Run python3 train.py --backbone resnet50 --gpu 0 --random-transform pascal datasets/VOC2012 to start training.

Evaluate

Run python3 utils/eval.py to evaluate a model; set the model path inside the script.
