CoSOD3K (CVPR2020) <a name="headin"></a>

'Taking a Deeper Look at Co-Salient Object Detection'

Abstract

Co-salient object detection (CoSOD) is a newly emerging and rapidly growing branch of salient object detection (SOD) that aims to detect the co-occurring salient objects in multiple images. However, existing CoSOD datasets often have a serious data bias: they assume that each group of images contains salient objects of similar visual appearance. This bias leads to idealized settings, and models trained on existing datasets may be less effective in real-life situations, where the similarity is usually semantic or conceptual. To tackle this issue, we first collect a new high-quality dataset, named CoSOD3k, which contains 3,316 images divided into 160 groups with multiple levels of annotation, i.e., category, bounding box, object, and instance levels. CoSOD3k makes a significant leap in terms of diversity, difficulty, and scalability, benefiting related vision tasks. In addition, we comprehensively summarize 34 cutting-edge algorithms, benchmark 19 of them over four existing CoSOD datasets (MSRC, iCoSeg, Image Pair, and CoSal2015) and our CoSOD3k with a total of ∼61K images (the largest scale), and report a group-level performance analysis. Finally, we discuss the challenges and future work of CoSOD. We hope our study will give a strong boost to growth in the CoSOD community.
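
Because CoSOD models consume a whole group of images at once, a loader typically starts by bucketing images by group. The following is a minimal sketch of that step, assuming a hypothetical layout in which each group is a directory named after its category (`image/<group>/<file>.jpg`); the layout, the function name `group_by_category`, and the file names are illustrative assumptions, not the documented structure of the CoSOD3k release.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_category(image_paths):
    """Map each group name (the parent directory) to its sorted image paths."""
    groups = defaultdict(list)
    for path in image_paths:
        # Assumption: the parent directory name identifies the group.
        groups[PurePosixPath(path).parent.name].append(path)
    return {name: sorted(paths) for name, paths in groups.items()}


# Hypothetical file names, for illustration only.
paths = [
    "image/banana/0001.jpg",
    "image/banana/0002.jpg",
    "image/guitar/0001.jpg",
]
groups = group_by_category(paths)
mean_group_size = len(paths) / len(groups)
```

Applied to the figures quoted in the abstract, 3,316 images in 160 groups gives an average of roughly 20.7 images per group.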

CoSOD Dataset Comparison

<img src="figures/CoSOD3K.png" width="100%"/>

Figure 1: Different salient object detection (SOD) tasks. (a) Traditional SOD [78]. (b) Within-image co-salient object detection (CoSOD) [93], where common salient objects are detected from a single image. (c) Existing CoSOD, where salient objects are detected according to a pair [52] or a group [85] of images with similar appearances. (d) The proposed CoSOD in the wild, which requires a large amount of semantic context, making it more challenging than existing CoSOD.

<img src="figures/CoSOD3k2.png" width="100%"/>

Figure 2: The 160 objects from our CoSOD3k.

Statistics

<img src="figures/CoSOD3k-statistic.png" width="60%"/>

Table 1: Statistics for size and number of instances/objects in existing datasets. '-' indicates that the dataset contains only object-level annotations, so the number of instances is only one.

Downloads

SOTA Models

News

| Model | Pub. | Year | #Training | Training set | Main Component | SL. | Sp. | Po. | Ed. | Post. |
| :--------------: | :-----: | :--: | :------------------: | :-------------------------------: | :----------------------------------------: | :--: | :--: | :--: | :--: | :---: |
| WPLT | UIST | 2010 | | | Morphological, Translational Alignment | U | | | | |
| PCSDT | ICIP | 2010 | 120,000 | 8*8 image patch | Sparse feature, Filter Bank | W | | | | |
| IPCST | TIP | 2011 | | | Ncut, co-multilayer Graph | U | √ | | | |
| CBCST | TIP | 2013 | | | Contrast/Spatial/Corresponding Cue | U | | | | |
| MIT | TMM | 2013 | | | Feature/Images Pyramid, Multi-scale Voting | U | √ | | | GCut |
| CSHST | SPL | 2013 | | | Hierarchical Segmentation, Contour Map | U | | | √ | |
| ESMGT | SPL | 2014 | | | Efficient Manifold Ranking [84], OTSU | U | | | | |
| BRT | MM | 2014 | | | Common/Center Cue, Global Correspondence | U | √ | | | |
| SACST | TIP | 2014 | | | Self-adaptive Weight, Low Rank Matrix | U | √ | | | |
| DIM | TNNLS | 2015 | 1,000+9,963 | ASD+PV | SDAE model, Contrast/Object Prior | S | √ | | | |
| CODW | IJCV | 2016 | | ImageNet pre-train | SermaNet, RBM, IMC, IGS, IGC | W | √ | √ | | |
| SP-MIL | TPAMI | 2017 | (240+643)•10% | MSRC-V1+iCoseg | SPL [97], SVM, GIST [69], CNNs | W | √ | | | |
| GD | IJCAI | 2017 | 9,213 | MSCOCO | VGGNet16 [68], Group-wise Feature | S | | | | |
| MVSRCC | TIP | 2017 | | | LBP, SIFT [61], CH, Bipartite Graph | | √ | √ | | |
| UMLF | TCSVT | 2017 | (240+2015)*50% | MSRC-V1+CoSal2015 | SVM, OMR [86], metric learning | S | √ | | | |
| DML | BMVC | 2018 | 10,000+6,232+5,168 | M10K+THUR-15K [11]+DO | CAE, HSR, Multistage | S | | | | |
| DWSI | AAAI | 2018 | | | EdgeBox [106], Low-rank Matrix, CH | S | | √ | | |
| GONet | ECCV | 2018 | | ImageNet pre-train | ResNet-50 [28], Graphical Optimization | W | √ | | | CRF |
| COC | IJCAI | 2018 | | ImageNet pre-train | ResNet-50 [28], Co-attention Loss | W | | √ | | CRF |
| FASS | MM | 2018 | | ImageNet pre-train | DHS [56]/VGGNet, Graph Optimization | W | √ | | | |
| PJOT | TIP | 2018 | | | Energy Minimization, BoWs | U | √ | | | |
| SPIG | TIP | 2018 | 10,000+210+2,015+240 | M10K+IPCS+CoSal2015+MSRC-V | DeepLab, Graph Representation | S | √ | | | |
| QGF | TMM | 2018 | | ImageNet pre-train | Dense Correspondence, Quality Measure | S | √ | | | THR |
| EHL | NC | 2019 | 643 | iCoseg | GoogLeNet, FSM | S | √ | | | |
| IML | NC | 2019 | 3,624 | CoSal2015+PV+CR | VGGNet16 | S | √ | | | |
| DGFC | TIP | 2019 | >200,000 | MSCOCO [55] | VGGNet16, Group-wise Feature | S | √ | | | |
| RCANet | IJCAI | 2019 | >200,000 | MSCOCO+COS+iCoseg+CoSal2015+MSRC | VGGNet16, Recurrent Units | S | | | | THR |
| GS | AAAI | 2019 | 200,000 | COCO-SEG | VGGNet19, Co-category Classification | S | | | | |
