UVOT400
Improving Underwater Visual Tracking With a Large Scale Dataset and Image Enhancement
- This repository is the official implementation of our paper, Improving Underwater Visual Tracking With a Large Scale Dataset and Image Enhancement.
Major Updates:
- Feb, 2026: Natural language descriptions released here.
- Jan, 2026: Paper published in Neurocomputing.
- Nov, 2025: Full test set annotation now available for ease of testing.
- Apr, 2024: UVOT400 evaluation server goes live; see here.
- Nov. 16, 2023: Added two more trackers (ARTrack and MAT) here.
- Oct. 19, 2023: Added two more trackers (SLT-Track and DropTrack).
- Oct. 16, 2023: Added three more trackers (MixFormer, MixFormerV2, and AiATrack).
- Oct. 12, 2023: Added two new trackers (SimTrack and GRM).
- Oct. 05, 2023: Quickly Benchmark SOTA trackers on your custom videos.
- Aug. 31, 2023: ArXiv Link to paper provided.
- Aug. 07, 2023: Repository made public.
- June 30, 2023: Dataset released (train and test set links available).
Our Main Paper Contributions
- A large and diverse high-quality UVOT400 benchmark dataset is presented, consisting of 400 sequences and 275,000 manually annotated bounding-box frames, introducing 17 distinct tracking attributes with diverse underwater creatures as targets.
- A large-scale benchmarking of 24 recent SOTA trackers is performed on the proposed dataset using established performance metrics.
- A UWIE-TR algorithm is introduced. It improves the UVOT performance of SOTA open-air trackers on underwater sequences.
- The selected SOTA trackers are re-trained on the enhanced version of the proposed dataset, resulting in significant performance improvements across all compared trackers.
Our Dataset: UVOT400
Details about the data collection, annotations, and domain-specific tracking attributes can be found in our paper.
Links to Datasets
- Our UVOT400 dataset:
- Train Set: Download link
- Test Set (First frame annotation only): Download link
- Test Set (Full): Download link
- Attributes file: Download link
- Language descriptions: Human Labeled, MarineGPT Descriptions (see Natural Language Description Credit below).
NOTE: You may use our evaluation server to evaluate your tracker results.
- Our Previous UTB180 Dataset: Kaggle Link.
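The exact annotation layout is documented in the paper; VOT/GOT-style datasets conventionally store one axis-aligned box per frame as comma-separated `x,y,w,h` in pixels. A minimal loader under that assumption (the function name and the format are illustrative, not the repository's API):

```python
import numpy as np

def load_groundtruth(path):
    """Load per-frame bounding boxes from a groundtruth annotation file.

    Assumes one box per line as comma- (or tab-) separated `x,y,w,h`
    in pixels, the common VOT/GOT convention; check the UVOT400 paper
    for the exact format. Returns an (N, 4) float array.
    """
    boxes = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:  # skip blank lines
                continue
            boxes.append([float(v) for v in line.replace("\t", ",").split(",")])
    return np.asarray(boxes, dtype=np.float64)
```

Each row can then be paired with the corresponding video frame for training or evaluation.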
UVOT400 Evaluation Server
To evaluate your tracker on our dataset, please click <a href="https://eval.ai/web/challenges/challenge-page/2268" target="_blank">here</a>.
NOTE: Both train and test split evaluations are available.
Evaluated Trackers
We have evaluated several SOTA trackers across our experiments. Links to the trackers' GitHub repositories are listed below (click on a tracker name to go to its GitHub page):
- Discriminative Correlation Filter-based Trackers:
- Deep Siamese Trackers
- Transformer-driven Trackers
For our work, we have pulled the trackers from their respective GitHub repositories.
Experiment Environment Setup
- Create the python environment
conda create -y --name uvot400 python==3.7.16
conda activate uvot400
- Install pytorch and torchvision
pip install torch==1.10.0+cu111 torchvision==0.11.0+cu111 torchaudio==0.10.0 -f https://download.pytorch.org/whl/torch_stable.html
- Install other packages
pip install -r requirements.txt
- Build region (for Pysot library)
python setup.py build_ext --inplace
Experiments
For our experiments, we use the success, precision, and normalized precision VOT tracker evaluation metrics. For comparison with the open-air GOT10k dataset, the average overlap (AO) and the success rates at overlap thresholds 0.50 and 0.75 are used.
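As a reference for what these metrics measure, here is a minimal sketch of the standard IoU-based success curve and center-error precision (this is illustrative only; the repository's actual evaluation comes from the Pysot and Pytracking libraries):

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two [x, y, w, h] boxes."""
    ix = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def success_curve(pred_boxes, gt_boxes, thresholds=np.linspace(0, 1, 21)):
    """Fraction of frames whose IoU exceeds each overlap threshold.
    The area under this curve (AUC) is the reported success score;
    the values at thresholds 0.50 and 0.75 give SR(0.50) and SR(0.75)."""
    ious = np.array([iou(p, g) for p, g in zip(pred_boxes, gt_boxes)])
    return np.array([(ious > t).mean() for t in thresholds])

def precision(pred_boxes, gt_boxes, thresh=20.0):
    """Fraction of frames whose box-center error is within `thresh` pixels
    (20 px is the conventional precision threshold)."""
    def center(b):
        return np.array([b[0] + b[2] / 2.0, b[1] + b[3] / 2.0])
    errs = np.array([np.linalg.norm(center(p) - center(g))
                     for p, g in zip(pred_boxes, gt_boxes)])
    return float((errs <= thresh).mean())
```

Normalized precision follows the same idea but scales the center error by the ground-truth box size before thresholding, making the score resolution-independent.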
Benchmarking SOTA Trackers on Custom Videos
- This repository also allows you to quickly benchmark SOTA trackers on your custom videos; see here.
Acknowledgements
- Thanks to the authors of the trackers for providing the implementations.
- Thanks to the Pysot and Pytracking libraries for providing the tracking evaluation codes.
- We acknowledge the use of eval.ai for creating the evaluation server.
- This work was supported by the Khalifa University of Science and Technology under Faculty Start-Up grant FSU-2022-003, Award No. 8474000401.
Citation
If you find our work useful for your research, please consider citing:
@article{Alawode2026_uvot1900,
title = {Underwater visual tracking with a large scale dataset and image enhancement},
author = {Basit Alawode and Sajid Javed and Fayaz Ali Dharejo and Mehnaz Ummar and Arif Mahmood and Fahad Shahbaz Khan and Jiri Matas},
journal = {Neurocomputing},
pages = {132586},
year = {2026},
issn = {0925-2312},
doi = {10.1016/j.neucom.2025.132586},
url = {https://www.sciencedirect.com/science/article/pii/S0925231225032588},
}
@article{Alawode2023,
archivePrefix = {arXiv},
arxivId = {2308.15816},
author = {Alawode, Basit and Dharejo, Fayaz Ali and Ummar, Mehnaz and Guo, Yuhang and Mahmood, Arif and Werghi, Naoufel and Khan, Fahad Shahbaz and Matas, Jiri and Javed, Sajid},
eprint = {2308.15816},
title = {{Improving Underwater Visual Tracking With a Large Scale Dataset and Image Enhancement}},
url = {http://arxiv.org/abs/2308.15816},
volume = {14},
year = {2023}
}
@inproceedings{alawode2022utb180,
title={UTB180: A High-quality Benchmark for Underwater Tracking},
author={Alawode, Basit and Guo, Yuhang and Ummar, Mehnaz and Werghi, Naoufel and Dias, Jorge and Mian, Ajmal and Javed, Sajid},
booktitle={{ACCV}},
year={2022}
}
Natural Language Description Credit
@inproceedings{michael2024,
author = {Michael, Yonathan and Alansari, Mohamad and Javed, Sajid},
booktitle = {2024 IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)},
title = {Text-Guided Multi-Modal Fusion for Underwater Visual Tracking},
year = {2024},
pages = {1-6},
doi = {10.1109/AVSS61716.2024.10672591}
}
