LDConv


LDConv: Linear deformable convolution for improving convolutional neural networks (Image and Vision Computing)

This repository is a PyTorch implementation of our paper: LDConv: Linear deformable convolution for improving convolutional neural networks.

If you are interested in our other work, see https://github.com/Liuchen1997/RFAConv.

The interpolation and resampling code is based on https://github.com/dontLoveBugs/Deformable_ConvNet_pytorch.

The code is now open source. Thank you for your support.

LDConv provides kernels of different sizes for efficient extraction of features.
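To illustrate how a kernel with an arbitrary number of sampling points could be laid out before any learned offsets are applied, here is a minimal sketch. It assumes a near-square layout: `round(sqrt(N))` columns, as many full rows as fit, and one partial row for the remainder. The function name `initial_sampling_grid` is hypothetical, and the layout is an assumption for illustration, not a statement of the repository's exact code.

```python
import math
from typing import List, Tuple


def initial_sampling_grid(num_param: int) -> List[Tuple[int, int]]:
    """Return (row, col) positions for num_param sampling points.

    Assumed layout (illustrative only): fill complete rows of
    round(sqrt(num_param)) columns, then put the leftover points
    at the start of one extra row.
    """
    if num_param < 1:
        raise ValueError("num_param must be a positive integer")

    cols = round(math.sqrt(num_param))          # width of the regular part
    full_rows, leftover = divmod(num_param, cols)

    # Regular block: full_rows x cols points.
    coords = [(r, c) for r in range(full_rows) for c in range(cols)]
    # Irregular tail: the remaining points go into one more row.
    coords += [(full_rows, c) for c in range(leftover)]
    return coords


if __name__ == "__main__":
    for n in (3, 5, 9, 13):
        print(n, initial_sampling_grid(n))
```

With this layout, a size of 9 gives the familiar 3 x 3 grid, while sizes such as 5 or 13 give a regular block plus a short extra row, so any positive integer size is valid.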

Kernels-samples

Object detection based on COCO2017 and YOLOv5

| Models | LDConv | AP50 | AP75 | AP | APS | APM | APL | GFLOPS | Params (M) |
|--------|--------|------|------|----|-----|-----|-----|--------|------------|
| YOLOv5n (Baseline) | - | 45.6 | 28.9 | 27.5 | 13.5 | 31.5 | 35.9 | 4.5 | 1.87 |
| YOLOv5n | 3 | 47.8 | 31 | 29.8 | 14.5 | 33.2 | 41 | 3.8 | 1.51 |
| YOLOv5n | 5 | 48.8 | 32.6 | 31 | 14.6 | 34.1 | 43.2 | 4.1 | 1.65 |
| YOLOv5n | 9 | 50.5 | 33.9 | 32.3 | 14.9 | 36.1 | 44.1 | 4.8 | 1.94 |
| YOLOv5n | 13 | 51.2 | 34.5 | 33 | 15.7 | 36.3 | 45.6 | 5.5 | 2.23 |
| YOLOv5s (Baseline) | - | 57 | 39.9 | 37.1 | 20.9 | 42.4 | 47.8 | 16.4 | 7.23 |
| YOLOv5s | 4 | 58.2 | 41.9 | 39.2 | 21.4 | 43.2 | 53.4 | 14.1 | 6.01 |
| YOLOv5s | 6 | 59.2 | 42.6 | 39.9 | 21.5 | 44.2 | 54.7 | 15.3 | 6.55 |
| YOLOv5s | 7 | 59.4 | 43.2 | 40.4 | 21.5 | 44.6 | 55.1 | 15.9 | 6.82 |
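The YOLOv5n rows above can be used for a quick check of how the parameter count changes with the LDConv size. The sketch below copies the four (size, Params) pairs from the table and computes the parameter increase per added sampling point between consecutive rows. The helper name `slopes` is hypothetical, and the check only describes these four reported values.

```python
# (LDConv size, Params in millions) copied from the YOLOv5n rows of the table.
rows = [(3, 1.51), (5, 1.65), (9, 1.94), (13, 2.23)]


def slopes(points):
    """Parameter increase per extra sampling point between consecutive rows."""
    return [
        (p1 - p0) / (n1 - n0)
        for (n0, p0), (n1, p1) in zip(points, points[1:])
    ]


per_point = slopes(rows)

if __name__ == "__main__":
    for (n0, _), (n1, _), s in zip(rows, rows[1:], per_point):
        print(f"size {n0} -> {n1}: {s:.4f} M params per added point")
```

For these rows the increase stays close to 0.07 M parameters per added sampling point, which is consistent with the parameter count growing roughly linearly in the kernel size.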

Object detection based on VOC 7+12 and YOLOv7

| Models | LDConv | Precision | Recall | mAP50 | mAP | GFLOPS | Params (M) |
|--------|--------|-----------|--------|-------|-----|--------|------------|
| YOLOv7-tiny (Baseline) | - | 77.3 | 69.8 | 76.4 | 50.2 | 13.2 | 6.06 |
| YOLOv7-tiny | 3 | 80.1 | 68.4 | 76.1 | 50.3 | 12.1 | 5.56 |
| YOLOv7-tiny | 4 | 78.2 | 70.3 | 76.2 | 50.7 | 12.4 | 5.66 |
| YOLOv7-tiny | 5 | 77 | 71.1 | 76.5 | 50.8 | 12.6 | 5.75 |
| YOLOv7-tiny | 6 | 79.6 | 69.9 | 76.9 | 51 | 12.9 | 5.85 |
| YOLOv7-tiny | 8 | 78.6 | 70.1 | 76.7 | 51.2 | 13.4 | 6.04 |
| YOLOv7-tiny | 9 | 81 | 69.3 | 76.7 | 51.3 | 13.7 | 6.14 |

Object detection based on VisDrone-DET2021 and YOLOv5

| Models | LDConv | Precision | Recall | mAP50 | mAP | GFLOPS | Params (M) |
|--------|--------|-----------|--------|-------|-----|--------|------------|
| YOLOv5n (Baseline) | - | 38.5 | 28 | 26.4 | 13.4 | 4.2 | 1.77 |
| YOLOv5n | 3 | 37.9 | 27.4 | 25.9 | 13.2 | 3.5 | 1.41 |
| YOLOv5n | 5 | 40 | 28 | 26.9 | 13.7 | 3.8 | 1.56 |
| YOLOv5n | 6 | 38.1 | 28.1 | 26.8 | 13.6 | 4 | 1.63 |
| YOLOv5n | 7 | 39.8 | 28.2 | 27.5 | 14.2 | 4.2 | 1.7 |
| YOLOv5n | 9 | 39.7 | 28.9 | 27.7 | 14.3 | 4.5 | 1.84 |
| YOLOv5n | 11 | 40.4 | 28.8 | 27.7 | 14.2 | 4.8 | 1.99 |
| YOLOv5n | 14 | 40 | 28.8 | 27.9 | 14.3 | 5.3 | 2.2 |

Comparison experiments

| Models | AP50 | AP75 | AP | APS | APM | APL | GFLOPS | Params (M) |
|--------|------|------|----|-----|-----|-----|--------|------------|
| YOLOv5s | 54.8 | 37.5 | 35 | 19.2 | 40 | 45.2 | 16.4 | 7.23 |
| YOLOv5s (DSConv=5) | 43.2 | 23.5 | 23.9 | 13.0 | 27.6 | 30.5 | 14.8 | 6.45 |
| YOLOv5s (LDConv=5) | 56.6 | 40.7 | 38 | 20.8 | 41.8 | 52 | 14.8 | 6.54 |
| YOLOv5s (LDConv=9) | 57.8 | 41.4 | 38.7 | 20.8 | 42.8 | 52.3 | 17.1 | 7.37 |
| YOLOv5s (LDConv=9, padding) | 58.3 | 41.9 | 39.2 | 21.6 | 43.2 | 53.5 | 17.1 | 7.37 |
| YOLOv5s (Deformable Conv=3) | 58.5 | 41.8 | 39.1 | 20.8 | 43.4 | 53.6 | 17.1 | 7.37 |
| YOLOv5s (LDConv=11) | 58.5 | 42.1 | 39.3 | 21.9 | 43.3 | 53.8 | 18.3 | 7.91 |
| YOLOv5s (LDConv=11, padding) | 58.6 | 42.1 | 39.5 | 21.3 | 43.7 | 53.2 | 18.3 | 7.91 |

Comparison experiments

| Models | Precision | Recall | mAP50 | mAP | GFLOPS | Params (M) |
|--------|-----------|--------|-------|-----|--------|------------|
| YOLOv5n | 73.8 | 62.2 | 68.1 | 41.5 | 4.2 | 1.77 |
| YOLOv5n (DSConv=4) | 63 | 50.4 | 54.2 | 26.1 | 3.7 | 1.55 |
| YOLOv5n (LDConv=4) | 76.5 | 63.6 | 70.8 | 46.5 | 3.7 | 1.55 |
| YOLOv5n (DSConv=9) | 60.6 | 50.8 | 53.4 | 25.3 | 4.8 | 1.9 |
| YOLOv5n (LDConv=9) | 76.7 | 65.2 | 71.8 | 48.4 | 4.8 | 1.9 |

Exploring experiments

| Models | AP50 | AP75 | AP | APS | APM | APL | GFLOPS | Params (M) |
|--------|------|------|----|-----|-----|-----|--------|------------|
| YOLOv8n | 49.0 | 37.1 | 34.2 | 16.9 | 37.1 | 49.1 | 8.7 | 3.15 |
| YOLOv8n-5 (Sampled Shape 1) | 49.5 | 37.6 | 34.9 | 16.8 | 38.2 | 50.2 | 8.4 | 2.94 |
| YOLOv8n-5 (Sampled Shape 2) | 49.6 | 37.8 | 34.9 | 15.9 | 38.4 | 50.1 | 8.4 | 2.94 |
| YOLOv8n-5 (Sampled Shape 3) | 49.6 | 38.1 | 35 | 16.6 | 38.2 | 50.9 | 8.4 | 2.94 |
| YOLOv8n-6 (Sampled Shape 1) | 50.1 | 38.3 | 35.3 | 16.6 | 38.6 | 51.1 | 8.6 | 3.01 |
| YOLOv8n-6 (Sampled Shape 2) | 50.2 | 38.2 | 35.4 | 16.6 | 38.3 | 51.3 | 8.6 | 3.01 |

| Models | Initial Shape | Precision | Recall | mAP50 | mAP |
|--------|---------------|-----------|--------|-------|-----|
| YOLOv5n | a | 39.5 | 27.9 | 26.9 | 13.7 |
| YOLOv5n | b | 39.4 | 28.2 | 26.8 | 13.6 |
| YOLOv5n | c | 37.4 | 27.8 | 26.1 | 13.4 |
| YOLOv5n | d | 37.5 | 27 | 25.5 | 12.9 |
| YOLOv5n | e | 38.4 | 27.6 | 26.4 | 13.4 |

Citation

You may want to cite:

@inproceedings{dai2017deformable,
  title={Deformable convolutional networks},
  author={Dai, Jifeng and Qi, Haozhi and Xiong, Yuwen and Li, Yi and Zhang, Guodong and Hu, Han and Wei, Yichen},
  booktitle={Proceedings of the IEEE international conference on computer vision},
  pages={764--773},
  year={2017}
}

@article{zhang2024ldconv,
  title={LDConv: Linear deformable convolution for improving convolutional neural networks},
  author={Zhang, Xin and Song, Yingze and Song, Tingting and Yang, Degang and Ye, Yichen and Zhou, Jie and Zhang, Liming},
  journal={Image and Vision Computing},
  pages={105190},
  year={2024},
  publisher={Elsevier}
}
