Revisiting Image Pyramid Structure for High Resolution Salient Object Detection (InSPyReNet)

Official PyTorch implementation of Revisiting Image Pyramid Structure for High Resolution Salient Object Detection (InSPyReNet)

To appear in the 16th Asian Conference on Computer Vision (ACCV2022)

Taehun Kim, Kunhee Kim, Joonyeong Lee, Dongmin Cha, Jiho Lee, Daijin Kim

Abstract: Salient object detection (SOD) has been in the spotlight recently, yet has been studied less for high-resolution (HR) images. Unfortunately, HR images and their pixel-level annotations are certainly more labor-intensive and time-consuming compared to low-resolution (LR) images. Therefore, we propose an image pyramid-based SOD framework, Inverse Saliency Pyramid Reconstruction Network (InSPyReNet), for HR prediction without any of HR datasets. We design InSPyReNet to produce a strict image pyramid structure of saliency map, which enables to ensemble multiple results with pyramid-based image blending. For HR prediction, we design a pyramid blending method which synthesizes two different image pyramids from a pair of LR and HR scale from the same image to overcome effective receptive field (ERF) discrepancy. Our extensive evaluation on public LR and HR SOD benchmarks demonstrates that InSPyReNet surpasses the State-of-the-Art (SotA) methods on various SOD metrics and boundary accuracy.
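The pyramid idea in the abstract can be illustrated with a small, self-contained sketch. This is not the InSPyReNet implementation: it is a toy Laplacian-style pyramid on nested Python lists, assuming 2x2 average pooling for downsampling and nearest-neighbour upsampling, and all function names (`downsample`, `upsample`, `combine`, `build_pyramid`, `reconstruct`, `blend`) are hypothetical. The `blend` helper only mimics the general idea of taking the coarse base from one saliency map and the fine details from another; the actual blending in the paper may differ.

```python
"""Toy Laplacian-style pyramid on nested lists (illustrative only)."""


def downsample(img):
    """Halve the resolution by averaging each 2x2 block (even sizes assumed)."""
    return [
        [(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
         for x in range(0, len(img[0]), 2)]
        for y in range(0, len(img), 2)
    ]


def upsample(img):
    """Double the resolution by nearest-neighbour repetition."""
    return [[row[x // 2] for x in range(2 * len(row))]
            for row in img for _ in range(2)]


def combine(a, b, sign=1.0):
    """Element-wise a + sign * b for two maps of equal size."""
    return [[p + sign * q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def build_pyramid(saliency, levels):
    """Return (coarsest map, detail maps ordered from fine to coarse)."""
    details, current = [], saliency
    for _ in range(levels):
        smaller = downsample(current)
        # The detail map stores what is lost by going one level coarser.
        details.append(combine(current, upsample(smaller), sign=-1.0))
        current = smaller
    return current, details


def reconstruct(base, details):
    """Invert build_pyramid: upsample, then add details from coarse to fine."""
    current = base
    for detail in reversed(details):
        current = combine(upsample(current), detail)
    return current


def blend(coarse_source, fine_source, levels):
    """Toy blend: coarse base from one map, fine details from the other.

    Both maps are assumed to have the same size here, which is a
    simplification made for this sketch.
    """
    base, _ = build_pyramid(coarse_source, levels)
    _, details = build_pyramid(fine_source, levels)
    return reconstruct(base, details)
```

Because each detail map stores exactly what downsampling discards, `reconstruct(*build_pyramid(m, k))` recovers `m` up to floating-point error. Values produced by `blend` are not clamped to [0, 1] in this sketch.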

News
Demo
Applications
Easy Download
Getting Started
Model Zoo
Results
- Quantitative Results
- Qualitative Results
Citation
Acknowledgement
- Special Thanks to
References

News :newspaper:

[2022.10.04] TasksWithCode mentioned our work in their Blog and reproduced it on Colab. Thank you for your attention!

[2022.10.20] We trained our model on the Dichotomous Image Segmentation dataset (DIS5K) and achieved competitive results! The trained checkpoint and pre-computed segmentation masks are available in the Model Zoo. You can also check our qualitative and quantitative results in the Results section.

[2022.10.28] Multi-GPU training with the latest PyTorch is now available.

[2022.10.31] TasksWithCode provided an amazing web demo with HuggingFace. Visit the WebApp and try it with your own image!

[2022.11.09] :car: Lane segmentation for driving scenes, built on InSPyReNet, is available in the LaneSOD repository.

[2022.11.18] I am speaking at the 16th Asian Conference on Computer Vision (ACCV2022). Please check out my talk if you're attending the event! #ACCV2022 #Macau

[2022.11.23] Our work is now available as a PyPI package. Please visit transparent-background to download the tool and try it on your machine. It works as a command-line tool and as a Python API.

[2023.01.18] rsreetech shared a tutorial for our PyPI package transparent-background using Colab. :tv: [Youtube]

Demo :rocket:

Image Sample | Video Sample
:-:|:-:
<img src=./figures/demo_image.gif height=200px> | <img src=./figures/demo_video.gif height=200px>

Applications :video_game:

Here are some applications/extensions of our work.

Web Application <img src=https://huggingface.co/front/assets/huggingface_logo-noborder.svg height="20px" width="20px">

Server Down <s>TasksWithCode provided a WebApp on HuggingFace to generate your own results!</s>

gokaygokay provided a HuggingFace Space for our project (WebApp). It also works with the recent work on connecting HuggingFace Spaces to MCP servers 👍

Command-line Tool / Python API :pager:

Try using our model as a command-line tool or Python API. More details on usage are available at [transparent-background](https://

InSPyReNet

Install / Use

README