OpenESS

[CVPR 2024 Highlight] OpenESS: Event-Based Semantic Scene Understanding with Open Vocabularies


<p align="center"> <img src="docs/figs/logo.png" align="center" width="18%"> <h1 align="center"><strong>OpenESS: Event-Based Semantic Scene Understanding with Open Vocabularies</strong></h1> <p align="center"> <a href="https://ldkong.com/" target='_blank'>Lingdong Kong</a><sup>1,2</sup>&nbsp;&nbsp;&nbsp; <a href="https://github.com/youquanl" target='_blank'>Youquan Liu</a><sup>3</sup>&nbsp;&nbsp;&nbsp; <a href="https://ipal.cnrs.fr/lai-xing-ng/" target='_blank'>Lai Xing Ng</a><sup>4</sup>&nbsp;&nbsp;&nbsp; <a href="https://ipal.cnrs.fr/benoit-cottereau-personal-page/" target='_blank'>Benoit R. Cottereau</a><sup>5,6</sup>&nbsp;&nbsp;&nbsp; <a href="https://www.comp.nus.edu.sg/cs/people/ooiwt/" target='_blank'>Wei Tsang Ooi</a><sup>1</sup> </br> <sup>1</sup>National University of Singapore&nbsp;&nbsp;&nbsp; <sup>2</sup>CNRS@CREATE&nbsp;&nbsp;&nbsp; <sup>3</sup>Hochschule Bremerhaven&nbsp;&nbsp;&nbsp; <sup>4</sup>Institute for Infocomm Research, A*STAR&nbsp;&nbsp;&nbsp; <sup>5</sup>IPAL, CNRS IRL 2955, Singapore&nbsp;&nbsp;&nbsp; <sup>6</sup>CerCo, CNRS UMR 5549, Universite Toulouse III </p> </p>

About

OpenESS is an open-vocabulary event-based semantic segmentation (ESS) framework that synergizes information from image, text, and event-data domains to enable scalable ESS in an open-world, annotation-efficient manner.

| <img width="173" src="docs/figs/teaser_1.png"> | <img width="173" src="docs/figs/teaser_2.png"> | <img width="173" src="docs/figs/teaser_3.png"> | <img width="173" src="docs/figs/teaser_4.png"> |
| :-: | :-: | :-: | :-: |
| Input Event Stream | “Driveable” | “Car” | “Manmade” |
| <img width="173" src="docs/figs/teaser_5.png"> | <img width="173" src="docs/figs/teaser_6.png"> | <img width="173" src="docs/figs/teaser_7.png"> | <img width="173" src="docs/figs/teaser_8.png"> |
| Zero-Shot ESS | “Walkable” | “Barrier” | “Flat” |


:gear: Installation

Kindly refer to INSTALL.md for the installation details.

:hotsprings: Data Preparation

DSEC-Semantic

  • Step 1: Download the DSEC dataset from the official dataset page. Below, we summarize the links used for downloading each of the resources:

    | Training Data | Link | Size | Description |
    |:-:|:-:|:-:|:-|
    | Events | download | 125 GB | The raw event data in .h5 format |
    | Frames | download | 216 GB | The RGB frames in .png format |
    | Disparities | download | 12 GB | The disparities between left and right sensors |
    | Semantic Masks | download | 88.6 MB | The ground truth semantic segmentation labels |

    | Test Data | Link | Size | Description |
    |:-:|:-:|:-:|:-|
    | Events | download | 27 GB | The raw event data in .h5 format |
    | Frames | download | 43 GB | The RGB frames in .png format |
    | Semantic Masks | download | 28.9 MB | The ground truth semantic segmentation labels |
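The event files can be inspected with h5py. The layout assumed below (datasets under an events group: events/x, events/y, events/t, events/p) follows the official DSEC tooling, but verify it against your download; the snippet writes a tiny dummy file first so it is self-contained.

```python
import h5py
import numpy as np

# Write a minimal dummy file mirroring the assumed DSEC layout.
with h5py.File("events_demo.h5", "w") as f:
    f.create_dataset("events/x", data=np.array([3, 7], dtype=np.uint16))
    f.create_dataset("events/y", data=np.array([1, 5], dtype=np.uint16))
    f.create_dataset("events/t", data=np.array([100, 250], dtype=np.int64))
    f.create_dataset("events/p", data=np.array([0, 1], dtype=np.uint8))

# Read the event arrays back, as one would from a real events.h5.
with h5py.File("events_demo.h5", "r") as f:
    x, y = f["events/x"][:], f["events/y"][:]
    t, p = f["events/t"][:], f["events/p"][:]
print(len(x), int(t[-1]))  # number of events and last timestamp
```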

  • Step 2: Link the dataset to the path ./data. Your dataset folder should then match the following structure:

    ./data/DSEC
            ├── test
            │     ├── zurich_city_13_a
            │     │    ├── events
            │     │    │    └── left
            │     │    │          └── events.h5
            │     │    ├── images
            │     │    │    └── left
            │     │    │          │── 000000.png
            │     │    │          │── ...
            │     │    │          └── 000378.png
            │     │    ├── images_aligned
            │     │    │    └── left
            │     │    │          │── 000000.png
            │     │    │          │── ...
            │     │    │          └── 000378.png
            │     │    ├── reconstructions
            │     │    │    └── left
            │     │    │          │── 000000.png
            │     │    │          │── ...
            │     │    │          └── 000378.png
            │     │    └── semantic
            │     │         └── left
            │     │               │── 000000.png
            │     │               │── ...
            │     │               └── 000378.png
            │     ├── zurich_city_14_c
            │     └── zurich_city_15_a
            └── train
                  ├── zurich_city_00_a
                  ├── zurich_city_01_a
                  ├── zurich_city_02_a
                  ├── zurich_city_04_a
                  ├── zurich_city_05_a
                  ├── zurich_city_06_a
                  ├── zurich_city_07_a
                  └── zurich_city_08_a
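The link in Step 2 can be created with a symbolic link; /path/to/DSEC below is a placeholder for wherever you extracted the download, not a path from the repository.

```shell
# Placeholder location of the extracted DSEC download; adjust to your setup.
DSEC_ROOT=/path/to/DSEC

# Create ./data and point ./data/DSEC at the raw dataset.
mkdir -p ./data
ln -sfn "$DSEC_ROOT" ./data/DSEC
```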
    
  • Step 3: Prepare frame data aligned with the events. Please follow the same procedure as Sun et al. (ESS: Learning Event-based Semantic Segmentation from Still Images), and place the processed frame data into the folder named images_aligned.

    Additionally, we provide our processed DSEC-Semantic frame data at this Google Drive link (~4.95 GB).

  • Step 4: Prepare the zero-shot semantic labels for T2E: Text-to-Event Consistency Regularization. For more details, kindly refer to FC-CLIP.md.

    Additionally, we provide our generated DSEC-Semantic T2E labels at this Google Drive link (~47.5 MB).

  • Step 5: Prepare the event reconstruction data. Please follow the same procedure as Sun et al. (ESS: Learning Event-based Semantic Segmentation from Still Images), and place the processed reconstruction data into the folder named reconstructions.

    The pretrained E2VID model can be downloaded from this link and should be placed under the folder /e2vid/pretrained/.

    Additionally, we provide our processed DSEC-Semantic event reconstruction data at this Google Drive link (~2.41 GB).

  • Step 6: Generate the SAM semantic superpixels for DSEC-Semantic. First, download the pretrained SAM model from this link.

    Next, run the following scripts to generate the superpixels:

    # for training set
    python data_preparation/superpixel_generation_dsec_sam.py -r data/DSEC/train
    
    # for test set
    python data_preparation/superpixel_generation_dsec_sam.py -r data/DSEC/test
    

    The generated superpixels should be placed in the folder named sp_sam_rgb.

  • Step 7: Generate the SLIC semantic superpixels for DSEC-Semantic. Run the following script directly to generate the superpixels:

    python data_preparation/superpixel_segmenter_dsec_slic.py --worker $WORKER_NUM --num_segments $SEGMENTS_NUM
    

    The generated superpixels should be placed in the folder named sp_slic_rgb.
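The SLIC segmentation that the script above performs can be illustrated with scikit-image; this is only a sketch on a random stand-in frame, and the repository script may use different parameters, where n_segments plays the role of the --num_segments option.

```python
import numpy as np
from skimage.segmentation import slic

# Random stand-in for an aligned RGB frame (illustration only).
rgb = np.random.rand(64, 64, 3)

# SLIC returns an integer label map; each label is one superpixel.
labels = slic(rgb, n_segments=50, compactness=10, start_label=0)
num_superpixels = labels.max() + 1
print(labels.shape, num_superpixels)
```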

To summarize, for each sequence in DSEC-Semantic, you should prepare the following data before running the experiments:

./sequence_name
    ├── events
    ├── images
    ├── images_aligned
    ├── pl_fcclip_rgb
    ├── reconstructions
    ├── semantic
    ├── sp_sam_rgb
    └── sp_slic_rgb
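Completeness of a prepared sequence folder can be verified with a small path check; missing_subfolders below is a hypothetical helper, not part of the repository.

```python
from pathlib import Path

# The eight sub-folders each DSEC-Semantic sequence is expected to contain.
EXPECTED = ["events", "images", "images_aligned", "pl_fcclip_rgb",
            "reconstructions", "semantic", "sp_sam_rgb", "sp_slic_rgb"]

def missing_subfolders(sequence_dir):
    """Return the expected sub-folders absent from a sequence directory."""
    root = Path(sequence_dir)
    return [name for name in EXPECTED if not (root / name).is_dir()]

# Demo on a throwaway directory that only has two of the folders.
demo = Path("demo_sequence")
for name in ("events", "semantic"):
    (demo / name).mkdir(parents=True, exist_ok=True)
print(missing_subfolders(demo))
```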

DDD17-Seg

  • Step 1: Download the DDD17 dataset from the official dataset page and/or from the Ev-SegNet paper.

  • Step 2: Link the dataset to the path ./data. Your dataset folder should then match the following structure:

    ./data/DDD17
            ├── dir0
            │    ├── events.dat.t
            │    ├── events.dat.xyp
            │    ├── index
            │    │     │── index_10ms.npy
            │    │     │── index_50ms.npy
            │    │     └── index_250ms.npy
            │    ├── images
            │    │     │── img_00000002.png
            │    │     │── ...
            │    │     └── img_00011178.png
            │    ├── images_aligned
            │    │     │── img_00000002.png
            │    │     │── ...
            │    │     └── img_00011178.png
            │    ├── reconstructions
            │    │     │── img_00000002.png
            │    │     │── ...
            │    │     └── img_00011178.png
            │    └── segmentation_masks
            │          │── img_00000002.png
            │          │── ...
            │          └── img_00011178.png
            ├── dir1
            ├── dir3
            ├── dir4
            ├── dir6
            └── dir7
    
  • Step 3: Prepare frame data that aligns with the events.
