ULGF: A Diffusion Model-Based Image Generation Framework for Underwater Object Detection
Yaoming Zhuang, Longyu Ma, Jiaming Liu, Yonghao Xian, Baoquan Chen, Li Li, Chengdong Wu, Wei Cui, Zhanlin Liu

Installation
Clone this repo and create the ULGF environment with conda. We tested the code with python==3.7.16, pytorch==1.12.1, and cuda==11.3 on RTX 3090 GPU servers. Other versions may work as well.
1. Create and activate the Conda environment
conda create -n ULGF python=3.7
conda activate ULGF
2. Install package dependencies
pip install -r requirements.txt
Download Pre-trained Models
We use Stable Diffusion v1.5 as the pre-trained model from which ULGF is trained.
Quick Start
1. Prepare dataset
The dataset we use is provided by RUOD; after downloading, convert its labels to the COCO annotation format.
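As a reference, a COCO-format detection file has three top-level lists; the sketch below shows the minimal structure (field names follow the COCO spec; the category names shown are only illustrative, substitute RUOD's actual class list):

```python
import json

# Minimal COCO detection annotation structure: three top-level lists.
coco = {
    "images": [
        # One entry per image; "id" is referenced by annotations below.
        {"id": 1, "file_name": "000001.jpg", "width": 256, "height": 256},
    ],
    "annotations": [
        # bbox is [x, y, width, height] in pixels; one entry per object.
        {"id": 1, "image_id": 1, "category_id": 1,
         "bbox": [48, 60, 40, 32], "area": 40 * 32, "iscrowd": 0},
    ],
    "categories": [
        # Illustrative names only -- use the RUOD class list here.
        {"id": 1, "name": "echinus"},
        {"id": 2, "name": "holothurian"},
    ],
}

with open("ruod_train_coco.json", "w") as f:
    json.dump(coco, f)
```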
2. Train ULGF
We use a single RTX 3090 to complete the training task; reserve more than 24 GB of GPU memory for training. You need to specify which loss function to use on line 34 of train_UWLGM.py.
# RUOD
bash tools/dist_train.sh \
    --dataset_config_name configs/data/ruod_256x256.py \
    --output_dir work_dirs/ruod_256x256
3. Inference
ULGF supports generating diverse underwater images, and we fix the random seed at inference time to make the results reproducible (this is optional). The weights we have trained can be downloaded here. In addition, if you want to generate underwater images with custom styles, please refer to our annotated dataset. You need to select the baseline used for inference on line 81 of generation_utils.py.
# StableDiffusionPipeline
python run_dataset_expansion.py
# PriorDiffusionPipeline
python prior_probability_distribution/run_prior_expansion.py
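Fixing the seed works by seeding the generator that draws the initial latent noise: the same seed yields the same latents, so a deterministic denoising process yields the same image. A minimal sketch of the idea, independent of the ULGF scripts and assuming only PyTorch:

```python
import torch

def initial_latents(seed: int, shape=(1, 4, 32, 32)) -> torch.Tensor:
    """Draw initial diffusion noise from a seeded generator.

    With the same seed, identical latents come out, which makes the
    downstream (deterministic) sampling reproducible.
    """
    gen = torch.Generator().manual_seed(seed)
    return torch.randn(shape, generator=gen)

a = initial_latents(42)
b = initial_latents(42)
c = initial_latents(7)
assert torch.equal(a, b)       # same seed -> identical noise
assert not torch.equal(a, c)   # different seed -> different noise
```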
Evaluation Method
We mainly use mAP and FID to evaluate ULGF.
1. mAP
We use the ultralytics toolbox to measure both the improvement ULGF brings to underwater object detection and the layout accuracy of the generated images. We train different detectors on the ULGF-expanded dataset, and we use a detection model trained on RUOD to verify the layout accuracy of the generated images. We provide some of the trained object detection models.
from ultralytics import YOLO

model = YOLO("your model")  # load a custom model, e.g. a trained .pt checkpoint
metrics = model.val(data="your yaml")  # validate on the dataset described by the yaml
print(metrics.box.map)  # mAP50-95
2. FID
All test images are uniformly scaled to 256×256. The relevant dataset can be viewed here.
python tools/FIDValue.py
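For reference, FID compares the mean and covariance of feature embeddings of real and generated images. A self-contained NumPy sketch of the formula (the feature-extraction step, typically Inception-v3, is assumed to have already produced the statistics; this is not the project's FIDValue.py):

```python
import numpy as np

def sqrtm_psd(m: np.ndarray) -> np.ndarray:
    """Matrix square root of a symmetric positive semi-definite matrix."""
    vals, vecs = np.linalg.eigh(m)
    vals = np.clip(vals, 0.0, None)
    return (vecs * np.sqrt(vals)) @ vecs.T

def fid_from_stats(mu1, cov1, mu2, cov2) -> float:
    """FID = ||mu1 - mu2||^2 + Tr(C1 + C2 - 2 (C1 C2)^(1/2)).

    Uses Tr((C1 C2)^(1/2)) = Tr((C1^(1/2) C2 C1^(1/2))^(1/2)),
    so every intermediate matrix stays symmetric PSD.
    """
    diff = mu1 - mu2
    s1 = sqrtm_psd(cov1)
    covmean = sqrtm_psd(s1 @ cov2 @ s1)
    return float(diff @ diff + np.trace(cov1 + cov2) - 2.0 * np.trace(covmean))

rng = np.random.default_rng(0)
a = rng.normal(size=(4, 4))
cov = a @ a.T + np.eye(4)
mu = rng.normal(size=4)
assert abs(fid_from_stats(mu, cov, mu, cov)) < 1e-6  # identical stats -> FID ~ 0
```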
Citation
To be announced.
Acknowledgement
We adopt the following open-sourced projects:
- diffusers: basic codebase for training Stable Diffusion models.
- GeoDiffusion: layout-guided diffusion model training.
- SCP-Diff: prior information extraction.
