This folder includes:

CholecT45 dataset:
- data: 45 cholecystectomy videos
- triplet: triplet annotations on 45 videos
- instrument: tool annotations on 45 videos
- verb: action annotations on 45 videos
- target: target annotations on 45 videos
- dict: id-to-name mapping files
- a LICENSE file
- a README file

<details> <summary> Expand this to visualize the dataset directory structure. </summary>

  ──CholecT45
      ├───data
      │   ├───VID01
      │   │   ├───000000.png
      │   │   ├───000001.png
      │   │   ├───000002.png
      │   │   ├───
      │   │   └───N.png
      │   ├───VID02
      │   │   ├───000000.png
      │   │   ├───000001.png
      │   │   ├───000002.png
      │   │   ├───
      │   │   └───N.png
      │   ├───
      │   ├───
      │   ├───
      │   |
      │   └───VIDN
      │       ├───000000.png
      │       ├───000001.png
      │       ├───000002.png
      │       ├───
      │       └───N.png
      |
      ├───triplet
      │   ├───VID01.txt
      │   ├───VID02.txt
      │   ├───
      │   └───VIDNN.txt
      |
      ├───instrument
      │   ├───VID01.txt
      │   ├───VID02.txt
      │   ├───
      │   └───VIDNN.txt
      |
      ├───verb
      │   ├───VID01.txt
      │   ├───VID02.txt
      │   ├───
      │   └───VIDNN.txt
      |
      ├───target
      │   ├───VID01.txt
      │   ├───VID02.txt
      │   ├───
      │   └───VIDNN.txt
      |
      ├───dict
      │   ├───triplet.txt
      │   ├───instrument.txt
      │   ├───verb.txt
      │   ├───target.txt
      │   └───maps.txt
      |
      ├───LICENSE
      └───README.md

</details>
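
As a minimal sketch of how the layout above maps to file paths, the snippet below builds the image and annotation paths for one frame. The helper names (`video_name`, `frame_path`, `annotation_path`) and the `CholecT45` root location are illustrative assumptions. The six-digit zero-padded frame names and two-digit video numbers are inferred from the tree, not taken from an official API.

```python
from pathlib import PurePosixPath

# Annotation folders shown in the directory tree.
TASKS = ("triplet", "instrument", "verb", "target")

def video_name(video_id: int) -> str:
    """Format a video number as it appears in the tree, e.g. 1 -> 'VID01'."""
    return f"VID{video_id:02d}"

def frame_path(root: str, video_id: int, frame_index: int) -> PurePosixPath:
    """Path of one extracted frame (assumes 6-digit zero-padded names)."""
    return PurePosixPath(root) / "data" / video_name(video_id) / f"{frame_index:06d}.png"

def annotation_path(root: str, task: str, video_id: int) -> PurePosixPath:
    """Path of the per-video annotation file for one task."""
    if task not in TASKS:
        raise ValueError(f"unknown task: {task!r}")
    return PurePosixPath(root) / task / f"{video_name(video_id)}.txt"

print(frame_path("CholecT45", 1, 2))            # CholecT45/data/VID01/000002.png
print(annotation_path("CholecT45", "verb", 2))  # CholecT45/verb/VID02.txt
```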

CholecT50, a superset and more complete version of this dataset, is now available here.

News and Updates:

[ 17/09/2025 ]: Check out our CAMMA Dataset Overlaps repository for an analysis of video overlaps across Cholec80, CholecT50, and Endoscapes to ensure fair dataset splits.
[ 29/04/2022 ]: Added PyTorch dataloader for the dataset.
[ 02/05/2022 ]: Added TensorFlow v2 dataloader for the dataset.

Download Dataset

The CholecT45 dataset was officially released for public use on April 12, 2022. To request access to the dataset, please fill out the request form.

Dataset Description

The CholecT45 dataset contains 45 videos of cholecystectomy procedures collected in Strasbourg, France. It is a subset of the Cholec80 [1] dataset. CholecT45 is an extension of CholecT40 [2] with additional videos and standardized annotations. The images are extracted from the videos at 1 fps and annotated with triplet information about surgical actions in the format <instrument, verb, target>. In total, the dataset contains 90489 frames and 127385 triplet instances. To ensure anonymity, frames corresponding to out-of-body views are entirely blacked out (RGB 0 0 0).

Triplet Annotations

Each triplet annotation file contains a table of 101 columns. Every row contains the annotation for one image in the video. The first column indicates the frame index of the annotated image in the video. The frame index is 0-based. The other 100 columns are the binary labels for the triplets (0=not present; 1=present). These last 100 columns correspond, in order, to the triplet IDs (0..99) and names listed in the mapping file (dict/triplet.txt).
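
The row layout described above can be read with a few lines of Python. This is only a sketch: the function name `parse_annotation_rows` is hypothetical, and the comma delimiter is an assumption (this README does not state the delimiter of the annotation files; only the maps.txt example is shown with commas).

```python
def parse_annotation_rows(lines, num_classes=100):
    """Yield (frame_index, active_ids) for each non-empty annotation row.

    Assumes comma-separated integers: one frame index followed by
    `num_classes` binary labels.
    """
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        values = [int(v) for v in line.split(",")]
        if len(values) != num_classes + 1:
            raise ValueError(
                f"row {line_number}: expected {num_classes + 1} columns, got {len(values)}"
            )
        frame_index, labels = values[0], values[1:]
        # Column position (0-based, after the frame index) is the triplet ID.
        active_ids = [i for i, flag in enumerate(labels) if flag == 1]
        yield frame_index, active_ids

# Synthetic row for frame 7 with triplet IDs 1 and 17 marked present.
row = ",".join(["7"] + ["1" if i in (1, 17) else "0" for i in range(100)])
print(list(parse_annotation_rows([row])))  # [(7, [1, 17])]
```

With a real file, the same function could be fed an open file handle, for example `parse_annotation_rows(open(path))`, provided the delimiter assumption holds.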

For simplicity, we also provide annotations for the various components of the triplets: instrument, verb and target.

Instrument Annotations

Each instrument annotation file contains a table of 7 columns. Every row contains the annotation for one image in the video. The first column indicates the frame index of the annotated image in the video. The frame index is 0-based. The other 6 columns are the binary labels for the instruments (0=not present; 1=present). These last 6 columns correspond, in order, to the instrument IDs (0..5) and names listed in the mapping file (dict/instrument.txt).

Verb Annotations

Each verb annotation file contains a table of 11 columns. Every row contains the annotation for one image in the video. The first column indicates the frame index of the annotated image in the video. The frame index is 0-based. The other 10 columns are the binary labels for the verbs (0=not present; 1=present). These last 10 columns correspond, in order, to the verb IDs (0..9) and names listed in the mapping file (dict/verb.txt).

Target Annotations

Each target annotation file contains a table of 16 columns. Every row contains the annotation for one image in the video. The first column indicates the frame index of the annotated image in the video. The frame index is 0-based. The other 15 columns are the binary labels for the targets (0=not present; 1=present). These last 15 columns correspond, in order, to the target IDs (0..14) and names listed in the mapping file (dict/target.txt).
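
Since all four annotation types share the same layout and differ only in the number of label columns, a single check can cover them. The sketch below is illustrative: `NUM_CLASSES` simply restates the class counts given in this README, `split_row` is a hypothetical helper, and the comma delimiter is an assumption.

```python
# Number of classes per task, as stated in this README.
NUM_CLASSES = {"triplet": 100, "instrument": 6, "verb": 10, "target": 15}

def split_row(task, row, delimiter=","):
    """Split one annotation row into (frame_index, binary_labels).

    Raises ValueError if the column count or the label values are unexpected.
    """
    values = [int(v) for v in row.strip().split(delimiter)]
    expected = NUM_CLASSES[task] + 1  # one frame-index column plus the labels
    if len(values) != expected:
        raise ValueError(f"{task}: expected {expected} columns, got {len(values)}")
    frame_index, labels = values[0], values[1:]
    if any(label not in (0, 1) for label in labels):
        raise ValueError(f"{task}: labels must be 0 or 1")
    return frame_index, labels

# Synthetic instrument row: frame 3, instrument ID 0 present.
frame, labels = split_row("instrument", "3,1,0,0,0,0,0")
print(frame, labels)  # 3 [1, 0, 0, 0, 0, 0]
```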

Dict

The dict folder contains the mappings from label ID to full name for each task, namely triplet, instrument, verb, and target. In addition, the maps.txt file contains a table of 6 columns that maps triplet IDs to their component IDs. This is useful for decomposing a triplet into its constituent components. The first column indicates the triplet ID (that is, the instrument-verb-target pairing ID). The second column indicates the instrument ID. The third column indicates the verb ID. The fourth column indicates the target ID. The fifth column indicates the instrument-verb pairing ID. The sixth column indicates the instrument-target pairing ID.

Example usage: the first row in maps.txt reads 1,0,2,0,2,0. This means that triplet ID 1 maps to <0, 2, 0>, which is {grasper, dissect, gallbladder}.
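
A small sketch of this decomposition is shown below. The names `TripletComponents` and `parse_maps` are hypothetical, and skipping lines that start with `#` is an assumption about a possible header line. The three name dictionaries contain only the entries mentioned in this README, not the full contents of the dict files.

```python
from typing import NamedTuple

class TripletComponents(NamedTuple):
    instrument: int
    verb: int
    target: int
    instrument_verb: int
    instrument_target: int

def parse_maps(lines):
    """Build {triplet_id: TripletComponents} from comma-separated rows."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):  # assumed: blank or comment lines
            continue
        triplet_id, *rest = (int(v) for v in line.split(","))
        if len(rest) != 5:
            raise ValueError(f"expected 6 columns, got {len(rest) + 1}")
        table[triplet_id] = TripletComponents(*rest)
    return table

# Only the names given in the example above; the real files list every ID.
INSTRUMENTS = {0: "grasper"}
VERBS = {2: "dissect"}
TARGETS = {0: "gallbladder"}

maps = parse_maps(["1,0,2,0,2,0"])
c = maps[1]
print(INSTRUMENTS[c.instrument], VERBS[c.verb], TARGETS[c.target])
# grasper dissect gallbladder
```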

License and References

This dataset could only be generated thanks to the continuous support from our surgical partners. In order to properly credit the authors and clinicians for their efforts, you are kindly requested to cite the work that led to the generation of this dataset:

C.I. Nwoye, T. Yu, C. Gonzalez, B. Seeliger, P. Mascagni, D. Mutter, J. Marescaux, N. Padoy. Rendezvous: Attention Mechanisms for the Recognition of Surgical Action Triplets in Endoscopic Videos. Medical Image Analysis, 78 (2022) 102433.

The CholecT45 dataset is publicly released under the Creative Commons license CC BY-NC-SA 4.0. This implies that:

- the dataset cannot be used for commercial purposes,
- the dataset can be transformed (additional annotations, etc.),
- the dataset can be redistributed as long as it is redistributed under the same license, with the obligation to cite the contributing work that led to the generation of the CholecT45 dataset (mentioned above).

By downloading and using this dataset, you agree to these terms and conditions.

Dataset Splits and Baselines

The official splits of the dataset for deep learning models are provided in the paper:

C.I. Nwoye, N. Padoy. Data Splits and Metrics for Benchmarking Methods on Surgical Action Triplet Datasets. arXiv PrePrint 2022.

The paper provides extended experiments on the baseline methods using the official dataset splits.

Fig. 1: Cross-validation experiment schedule for CholecT50. For CholecT45, remove the last video in each fold.
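
The rule in the caption (drop the last video of each CholecT50 fold to obtain the CholecT45 fold) can be sketched as follows. The fold contents below are placeholders, not the official splits, which are defined in the paper cited above. The function name `drop_last_video` is hypothetical, and "last" assumes each fold is listed in the same order as in the figure.

```python
def drop_last_video(cholect50_folds):
    """Return CholecT45-style folds by removing the last video of each fold."""
    return {fold: list(videos[:-1]) for fold, videos in cholect50_folds.items()}

# Placeholder video names for illustration only; these are NOT the real splits.
placeholder_t50_folds = {
    1: ["VID_A", "VID_B", "VID_C"],
    2: ["VID_D", "VID_E", "VID_F"],
}

print(drop_last_video(placeholder_t50_folds))
# {1: ['VID_A', 'VID_B'], 2: ['VID_D', 'VID_E']}
```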

Data Loader

We provide data loaders for the following frameworks:

PyTorch : dataloader_pth.py
TensorFlow v1 & v2 : [dataloader_tf.py](datal
