DIIF

[ICME 2024] DIIF (Dynamic Implicit Image Function for Efficient Arbitrary-Scale Super-Resolution).


<div align="center"> <h2>Dynamic Implicit Image Function for Efficient Arbitrary-Scale Super-Resolution</h2> <br> <a href="https://github.com/HeZongyao">Zongyao He</a><sup><span>1</span></sup>, <a href="https://ise.sysu.edu.cn/teacher/teacher02/1384977.htm">Zhi Jin</a><sup><span>1,Corresponding author</span></sup>

<sup>1</sup> Sun Yat-sen University <br>

<div>

</div> </div>

Introduction

This repository contains the official PyTorch implementation for the ICME 2024 paper titled "Dynamic Implicit Image Function for Efficient Arbitrary-Scale Super-Resolution" by Zongyao He and Zhi Jin.

<div align="center">

Qualitative and efficiency (320 × 180 input) comparison for ASSR

</div> <div align="center"> <img src="assets/framework.png" alt="Framework" /> <br>

Framework of DIIF

</div>

Abstract

Implicit Neural Representation (INR)-based methods have achieved remarkable success in Arbitrary-Scale Super Resolution (ASSR). However, these continuous image representations, where pixel values in a continuous spatial domain are inferred from a decoder, suffer from rapidly increasing computational cost as the scale factor increases.
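The cost growth described above can be made concrete with a little arithmetic. The following is a minimal sketch, assuming a decoder that is evaluated at least once per output pixel; the function name `decoder_queries` is hypothetical and not part of this repository. It uses the 320 × 180 input size mentioned in the comparison caption.

```python
def decoder_queries(lr_width: int, lr_height: int, scale: float) -> int:
    """Count output pixels, i.e. decoder evaluations when each pixel is decoded on its own.

    Assumption: one decoder evaluation per output pixel (illustrative only).
    """
    out_w = round(lr_width * scale)
    out_h = round(lr_height * scale)
    return out_w * out_h


if __name__ == "__main__":
    for scale in (2, 4, 8):
        # Doubling the scale factor quadruples the number of evaluations.
        print(f"x{scale}: {decoder_queries(320, 180, scale)} decoder evaluations")
```

Because the count grows with the square of the scale factor, going from ×4 to ×8 multiplies the per-pixel decoding work by four.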

To address this challenge, we propose a Dynamic Implicit Image Function (DIIF) for efficient ASSR. Instead of independently using each image coordinate and its nearby 2D features as decoder inputs, DIIF introduces a coordinate grouping and slicing strategy to decode pixel value slices from coordinate slices. To perform efficient arbitrary-scale decoding, we further introduce a dynamic coordinate slicing strategy empowered by our Coarse-to-Fine MLP (C2F-MLP), which allows adjusting the number of coordinates in each slice as the scale factor varies.
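The grouping and slicing idea can be illustrated with a small pure-Python sketch. This is not the repository's implementation: it assumes an integer scale factor, groups output coordinates by the low-resolution cell they fall into, and splits each group into fixed-size slices. All names (`group_coordinates`, `slice_group`, `count_decoder_calls`, `slice_size`) are hypothetical, and the C2F-MLP that adapts the slice size to the scale factor is not reproduced here.

```python
from collections import defaultdict


def group_coordinates(lr_h: int, lr_w: int, scale: int) -> dict:
    """Group output pixel centres by the low-resolution cell that contains them."""
    out_h, out_w = lr_h * scale, lr_w * scale
    groups = defaultdict(list)
    for y in range(out_h):
        for x in range(out_w):
            # Pixel centre in normalised [0, 1) image coordinates.
            centre = ((y + 0.5) / out_h, (x + 0.5) / out_w)
            # Integer division gives the index of the enclosing LR cell.
            groups[(y // scale, x // scale)].append(centre)
    return groups


def slice_group(coords: list, slice_size: int) -> list:
    """Split one coordinate group into consecutive slices of at most slice_size."""
    return [coords[i:i + slice_size] for i in range(0, len(coords), slice_size)]


def count_decoder_calls(lr_h: int, lr_w: int, scale: int, slice_size: int) -> int:
    """Count decoder calls if each slice is decoded in one call (assumption)."""
    groups = group_coordinates(lr_h, lr_w, scale)
    return sum(len(slice_group(c, slice_size)) for c in groups.values())


if __name__ == "__main__":
    # slice_size=1 corresponds to decoding every coordinate independently.
    for size in (1, 4, 16):
        print(size, count_decoder_calls(4, 6, scale=4, slice_size=size))
```

In this toy setting, larger slices mean fewer decoder calls for the same number of output pixels, which is the intuition behind decoding pixel value slices rather than single pixels.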

Extensive experiments demonstrate that DIIF can seamlessly integrate with INR-based ASSR methods, significantly reducing computational cost and runtime while maintaining state-of-the-art (SOTA) SR performance.

Train & Test

Train EDSR-baseline-DLIIF and RDN-DLIIF (small model):

python train.py --config options/train/train_edsr-dliif-s.json
python train.py --config options/train/train_rdn-dliif-s.json

Train EDSR-baseline-DLIIF and RDN-DLIIF (medium model):

python train.py --config options/train/train_edsr-dliif-m.json
python train.py --config options/train/train_rdn-dliif-m.json

Test EDSR-baseline-DLIIF and RDN-DLIIF (small model):

python test.py --config options/test/test_edsr-dliif-s.json
python test.py --config options/test/test_rdn-dliif-s.json

Test EDSR-baseline-DLIIF and RDN-DLIIF (medium model):

python test.py --config options/test/test_edsr-dliif-m.json
python test.py --config options/test/test_rdn-dliif-m.json
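To run all four configurations in sequence, the commands above could be wrapped in a small helper. This is a sketch under the assumption that it is started from the repository root and that the config file names follow the pattern shown above; `build_commands` and `run_all` are hypothetical helpers, not part of the repository.

```python
import subprocess
import sys

BACKBONES = ("edsr", "rdn")  # encoder backbones listed in the README
SIZES = ("s", "m")           # small and medium model variants


def build_commands(mode: str) -> list:
    """Build the train or test command lines for every backbone and model size."""
    if mode not in ("train", "test"):
        raise ValueError(f"mode must be 'train' or 'test', got {mode!r}")
    return [
        [sys.executable, f"{mode}.py", "--config",
         f"options/{mode}/{mode}_{backbone}-dliif-{size}.json"]
        for size in SIZES
        for backbone in BACKBONES
    ]


def run_all(mode: str, dry_run: bool = True) -> None:
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in build_commands(mode):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the loop as soon as one run fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all("train", dry_run=True)
```

With `dry_run=True` the helper only prints the commands, which is a cheap way to confirm the config paths before starting a long training run.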

Acknowledgement

This work was supported by Frontier Vision Lab, Sun Yat-sen University.

Special acknowledgment goes to the following projects: LIIF and LTE.

Citation

If you find this work helpful, please consider citing:

@misc{he2023dynamic,
      title={Dynamic Implicit Image Function for Efficient Arbitrary-Scale Image Representation}, 
      author={Zongyao He and Zhi Jin},
      year={2023},
      eprint={2306.12321},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Feel free to reach out with any questions or issues related to the code. Thank you for your interest!
