LargeKernel3D

LargeKernel3D: Scaling up Kernels in 3D Sparse CNNs (CVPR 2023)


This is the implementation of LargeKernel3D (CVPR 2023). Large kernels are important but expensive in 3D CNNs. We propose spatial-wise partition convolution to enable large 3D kernels. The resulting networks achieve high performance on 3D semantic segmentation and object detection. For more details, please refer to:

LargeKernel3D: Scaling up Kernels in 3D Sparse CNNs [Paper] Yukang Chen, Jianhui Liu, Xiangyu Zhang, Xiaojuan Qi, Jiaya Jia
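
This README does not spell out how the spatial-wise partition works, so the following is only a rough sketch of the general idea of sharing weights among spatially adjacent kernel positions. It assumes a 7×7×7 kernel whose positions are grouped into 3×3×3 partitions; the split rule and the helper names `partition_index` and `expand_shared_weights` are hypothetical and are not taken from this repository. It uses plain Python lists rather than sparse tensors to keep the example self-contained.

```python
from itertools import product


def partition_index(kernel_size: int, groups: int) -> list[int]:
    """Map each offset along one axis (0..kernel_size-1) to a group id (0..groups-1).

    Assumed split rule: assign each offset by the position of its center,
    which yields a symmetric grouping such as [0, 0, 1, 1, 1, 2, 2] for 7 -> 3.
    """
    return [(2 * i + 1) * groups // (2 * kernel_size) for i in range(kernel_size)]


def expand_shared_weights(small, kernel_size: int):
    """Expand a g x g x g weight table into a k x k x k kernel.

    Every position of the large kernel reads the weight of the spatial
    partition it falls into, so adjacent positions share one parameter.
    """
    g = len(small)
    idx = partition_index(kernel_size, g)
    return [
        [[small[idx[x]][idx[y]][idx[z]] for z in range(kernel_size)]
         for y in range(kernel_size)]
        for x in range(kernel_size)
    ]


# 27 distinct scalars stand in for the learnable weights of one channel pair.
small = [[[9 * x + 3 * y + z for z in range(3)] for y in range(3)] for x in range(3)]
large = expand_shared_weights(small, kernel_size=7)

n_positions = 7 ** 3
n_unique = len({large[x][y][z] for x, y, z in product(range(7), repeat=3)})
print(n_positions, n_unique)  # prints "343 27"
```

Under these assumptions the kernel covers 343 positions while holding only 27 distinct weights, which illustrates why sharing weights spatially can keep the parameter count of a large kernel close to that of a small one.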

Experimental results

| nuScenes Object Detection | Set | mAP | NDS | Download |
|---------------------------|:----:|:----:|:----:|:----------------------:|
| LargeKernel3D | val | 63.3 | 69.1 | Pre-trained |
| LargeKernel3D | test | 65.4 | 70.6 | Pre-trained Submission |
| +test aug | test | 68.7 | 72.8 | Submission |
| LargeKernel3D-F | test | - | - | Pre-trained |
| +test aug | test | 71.1 | 74.2 | Submission |

| ScanNetv2 Semantic Segmentation | Set | mIoU | Download |
|---------------------------------|:----:|:----:|:-------------:|
| LargeKernel3D | val | 73.5 | [Pre-trained] |
| LargeKernel3D | test | 73.9 | [Submission] |

Citation

If you find this project useful in your research, please consider citing:

@inproceedings{chen2023largekernel3d,
  title={LargeKernel3D: Scaling up Kernels in 3D Sparse CNNs},
  author={Yukang Chen and Jianhui Liu and Xiangyu Zhang and Xiaojuan Qi and Jiaya Jia},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2023}
}

Acknowledgement

This work is built upon FocalsConv for object detection.
This work is built upon Stratified-Transformer for semantic segmentation.

Our Works in LiDAR-based 3D Computer Vision

VoxelNeXt (CVPR 2023) [Paper] [Code] Fully Sparse VoxelNet for 3D Object Detection and Tracking.
Focal Sparse Conv (CVPR 2022 Oral) [Paper] [Code] Dynamic sparse convolution for high performance.
Spatial Pruned Conv (NeurIPS 2022) [Paper] [Code] 50% FLOPs saving for efficient 3D object detection.
LargeKernel3D (CVPR 2023) [Paper] [Code] Large-kernel 3D sparse CNN backbone.
SphereFormer (CVPR 2023) [Paper] [Code] Spherical window 3D transformer backbone.
spconv-plus A library where we combine our works into spconv.
SparseTransformer A library that includes high-efficiency transformer implementations for sparse point cloud or voxel data.

License

This project is released under the Apache 2.0 license.
