FlowDrag

[ICML'25 Spotlight] FlowDrag: 3D-aware Drag-based Image Editing with Mesh-guided Deformation Vector Flow Fields

<h2 align="center">FlowDrag: 3D-aware Drag-based Image Editing with Mesh-guided Deformation Vector Flow Fields</h2> <a href="https://kookie12.github.io/">Gwanhyeong Koo</a>, <a href="https://dbstjswo505.github.io/">Sunjae Yoon</a>, <a href="http://sanctusfactory.com/family_02.php">Younghwan Lee</a>, <a href="https://jiwoohong93.github.io/">Ji Woo Hong</a>, <a href="http://sanctusfactory.com/family.php">Chang D. Yoo</a> KAIST

📢 Release

[07/12/2024] Initial preview release
[12/28/2024] Code and VFD-Bench dataset released

🐶 Introduction

FlowDrag is a 3D-aware drag-based image editing method that leverages mesh-guided deformation vector flow fields. Our approach generates spatially coherent edits by utilizing 3D mesh deformations to guide the flow field.

Key Features

🎯 3D-Aware Editing: Utilizes 3D mesh deformations for spatially coherent edits
🌊 Flow Field Guidance: Generates dense deformation vector fields from sparse user inputs
📊 VFD-Bench Dataset: Comprehensive benchmark for evaluating drag-based editing methods
🚀 Interactive UI: User-friendly Gradio interface for real-time editing
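
To make the "dense field from sparse inputs" idea concrete, here is a minimal sketch that spreads a few drag vectors over an image grid with inverse-distance weighting. This is a generic interpolation stand-in chosen for illustration, not the mesh-guided method FlowDrag uses; the function name `dense_field_from_drags` and its parameters are hypothetical and do not come from this repository.

```python
import numpy as np


def dense_field_from_drags(handles, targets, height, width, power=2.0, eps=1e-6):
    """Interpolate sparse drag vectors into a dense (H, W, 2) field.

    Illustrative stand-in only: uses inverse-distance weighting,
    whereas FlowDrag derives its field from 3D mesh deformation.
    Points are given as (x, y) pixel coordinates.
    """
    handles = np.asarray(handles, dtype=np.float64)  # (N, 2)
    targets = np.asarray(targets, dtype=np.float64)  # (N, 2)
    drags = targets - handles                        # (N, 2) displacement per handle

    # Pixel grid as (x, y) coordinates, shape (H, W, 2).
    ys, xs = np.mgrid[0:height, 0:width]
    grid = np.stack([xs, ys], axis=-1).astype(np.float64)

    # Distance from every pixel to every handle, shape (H, W, N).
    dist = np.linalg.norm(grid[:, :, None, :] - handles[None, None, :, :], axis=-1)

    # Closer handles get larger weights; weights sum to 1 at each pixel.
    weights = 1.0 / (dist ** power + eps)
    weights /= weights.sum(axis=-1, keepdims=True)

    # Weighted average of the drag vectors at each pixel, shape (H, W, 2).
    return weights @ drags
```

With a single handle the field is constant and equal to that handle's drag; with several handles each pixel follows the nearest handles most strongly.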

📁 Project Structure

FlowDrag/
├── flowdrag_ui.py              # Main inference UI
├── pipeline.py                 # FlowDrag pipeline implementation
├── bench_flowdrag.py           # Evaluation on VFD-Bench
├── dataset/
│   └── VFD_Bench_Dataset/      # Benchmark dataset
├── mesh_deformation/
│   └── flowdrag_mesh_deform_ui.py  # Mesh deformation UI
├── utils/
│   ├── drag_utils.py           # Core drag editing utilities
│   ├── lora_utils.py           # LoRA training utilities
│   ├── attn_utils.py           # Attention manipulation
│   └── ui_utils.py             # UI helper functions
├── samples/                    # Sample images and results
└── environment_flowdrag.yaml   # Conda environment

💻 Installation

Setup

# Clone the repository
git clone https://github.com/kookie12/FlowDrag.git
cd FlowDrag

# Create conda environment
conda env create -f environment_flowdrag.yaml
conda activate flowdrag

🚀 Usage

Interactive Editing

Launch the Gradio interface for interactive drag-based editing:

python flowdrag_ui.py

The UI will open in your browser where you can:

Upload an image
Place handle points (points to drag) and target points (destinations)
Adjust deformation parameters
Generate edited results
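
Each handle/target pair placed in the UI amounts to a 2D displacement. The sketch below shows one way to represent such a pair; the `DragPair` class is hypothetical and is not part of `flowdrag_ui.py` or `utils/drag_utils.py`.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DragPair:
    """Hypothetical container for one drag instruction in (x, y) pixels."""

    handle: Tuple[int, int]  # point to drag
    target: Tuple[int, int]  # destination of that point

    @property
    def displacement(self) -> Tuple[int, int]:
        """Vector from the handle point to the target point."""
        return (self.target[0] - self.handle[0], self.target[1] - self.handle[1])

    @property
    def length(self) -> float:
        """Euclidean length of the drag in pixels."""
        return math.hypot(*self.displacement)


pair = DragPair(handle=(10, 20), target=(13, 24))
print(pair.displacement, pair.length)
```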

Mesh Deformation UI

For 3D mesh-guided flow field generation (recommended on a local machine, since it uses Open3D visualization):

cd mesh_deformation
python flowdrag_mesh_deform_ui.py

This generates {sample_name}_vector_field.npy files from input meshes.
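
A generated file can be inspected with NumPy before it is used for editing. The sketch below is a minimal example and assumes the array stores one vector per element along its last axis; the actual layout written by `flowdrag_mesh_deform_ui.py` is not documented here, so check the reported shape first. The helper name `summarize_vector_field` is hypothetical.

```python
import numpy as np


def summarize_vector_field(path):
    """Load a saved vector field and report basic statistics.

    Assumption: the last axis holds the vector components
    (for example dx, dy); this is not confirmed by the README.
    """
    field = np.load(path)
    magnitude = np.linalg.norm(field, axis=-1)  # per-element vector length
    return {
        "shape": field.shape,
        "dtype": str(field.dtype),
        "max_magnitude": float(magnitude.max()),
        "mean_magnitude": float(magnitude.mean()),
    }
```

For example, `summarize_vector_field("dog_vector_field.npy")` would describe the file for a sample named `dog` (a made-up sample name).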

📊 Evaluation

VFD-Bench Dataset

Download the VFD-Bench dataset:

Access Form: Google Form

Place the dataset in dataset/VFD_Bench_Dataset/.

LoRA Training

Train LoRA weights for all samples in the VFD-Bench dataset:

# Set paths (optional, defaults provided)
export INPUT_DATASET_PATH="dataset/VFD_Bench_Dataset"
export OUTPUT_LORA_PATH="lora_data/VFD_Bench_Dataset"

# Run training
python evaluation/run_lora_training_vfd_bench.py

LoRA weights will be saved in lora_data/VFD_Bench_Dataset/.
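
The two environment variables are optional overrides. As a rough sketch of how a script could resolve them, assuming the built-in defaults match the values shown above (the README says defaults exist but does not list them), and using a hypothetical helper name `resolve_lora_paths`:

```python
import os
from pathlib import Path

# Assumed defaults, taken from the export lines above.
DEFAULT_INPUT = "dataset/VFD_Bench_Dataset"
DEFAULT_OUTPUT = "lora_data/VFD_Bench_Dataset"


def resolve_lora_paths(env=None):
    """Return (input_dir, output_dir), honoring environment overrides."""
    env = os.environ if env is None else env
    input_dir = Path(env.get("INPUT_DATASET_PATH", DEFAULT_INPUT))
    output_dir = Path(env.get("OUTPUT_LORA_PATH", DEFAULT_OUTPUT))
    return input_dir, output_dir


input_dir, output_dir = resolve_lora_paths()
print(f"dataset: {input_dir}  lora output: {output_dir}")
```

Passing a plain dictionary as `env` makes the lookup easy to check without touching the real environment.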

Run Benchmark

Evaluate FlowDrag on the VFD-Bench dataset:

python bench_flowdrag.py

Results will be saved in the VFD_Bench_result_flowdrag/ folder, including:

Edited images
Concatenated visualizations
Quantitative metrics

🙌🏻 Acknowledgments

Our code is built upon several excellent open-source projects. We thank the authors for their great work!

📖 Citation

If you find our work useful, please consider citing:

@article{koo2025flowdrag,
  title={{FlowDrag}: {3D}-aware Drag-based Image Editing with Mesh-guided Deformation Vector Flow Fields},
  author={Koo, Gwanhyeong and Yoon, Sunjae and Lee, Younghwan and Hong, Ji Woo and Yoo, Chang D},
  journal={arXiv preprint arXiv:2507.08285},
  year={2025}
}

Acknowledgement (Funding)

This work was partly supported by Institute for Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2021-0-01381, Development of Causal AI through Video Understanding and Reinforcement Learning, and Its Applications to Real Environments) and partly supported by Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2022-0-00184, Development and Study of AI Technologies to Inexpensively Conform to Evolving Policy on Ethics).
