USRNet

Deep Unfolding Network for Image Super-Resolution (CVPR, 2020) (PyTorch)


Deep unfolding network for image super-resolution

Kai Zhang, Luc Van Gool, Radu Timofte
Computer Vision Lab, ETH Zurich, Switzerland

[Paper][Code]


[Training code --> KAIR]

git clone https://github.com/cszn/KAIR.git

  • Training with DataParallel - PSNR
python main_train_psnr.py --opt options/train_usrnet.json
  • Training with DistributedDataParallel - PSNR - 4 GPUs
python -m torch.distributed.launch --nproc_per_node=4 --master_port=1234 main_train_psnr.py --opt options/train_usrnet.json  --dist True

Classical SISR degradation model

For a scale factor of $\mathbf{s}$, the classical (traditional) degradation model of SISR assumes the low-resolution (LR) image $\mathbf{y}$ is a blurred, decimated, and noisy version of a high-resolution (HR) image $\mathbf{x}$. Mathematically, it can be expressed by

$$\mathbf{y}=\left(\mathbf{x}\otimes\mathbf{k}\right)\downarrow_{\mathrm{{s}}}+\mathbf{n}$$

where $\otimes$ denotes two-dimensional convolution of $\mathbf{x}$ with the blur kernel $\mathbf{k}$, $\downarrow_{\mathrm{{s}}}$ denotes the standard $\mathbf{s}$-fold downsampler, i.e., keeping the upper-left pixel of each distinct $\mathbf{s}\times \mathbf{s}$ patch and discarding the others, and $\mathbf{n}$ is usually assumed to be additive white Gaussian noise (AWGN) specified by its standard deviation (or noise level) $\sigma$. With a clear physical meaning, this model can approximate a variety of LR images by choosing proper blur kernels, scale factors and noise levels for the underlying HR images. In particular, it has been extensively studied in model-based methods, which solve a combination of a data term and a prior term under the MAP framework. Notably, the model reduces to a deblurring problem when $\mathbf{s} = 1$.
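The degradation model above can be sketched in a few lines of NumPy/SciPy. This is a minimal illustration of the equation, not code from the repository; the function name `degrade` is ours:

```python
import numpy as np
from scipy.signal import fftconvolve

def degrade(x, k, s, sigma, rng=None):
    """Classical SISR degradation: y = (x ⊗ k)↓_s + n."""
    rng = np.random.default_rng(0) if rng is None else rng
    blurred = fftconvolve(x, k, mode="same")         # x ⊗ k (2D convolution)
    y = blurred[::s, ::s]                            # keep upper-left pixel of each s×s patch
    return y + sigma * rng.standard_normal(y.shape)  # n ~ N(0, sigma²), i.e. AWGN
```

With `s=1` this reduces to plain deblurring, matching the note above.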

Motivation

<img src="figs/category.png" width="536px"/>

Learning-based single image super-resolution (SISR) methods are continuously showing superior effectiveness and efficiency over traditional model-based methods, largely due to the end-to-end training. However, different from model-based methods that can handle the SISR problem with different scale factors, blur kernels and noise levels under a unified MAP (maximum a posteriori) framework, learning-based methods (e.g., SRMD [3]) generally lack such flexibility.

[1] "Learning deep CNN denoiser prior for image restoration." CVPR, 2017.
[2] "Deep plug-and-play super-resolution for arbitrary blur kernels." CVPR, 2019.
[3] "Learning a single convolutional super-resolution network for multiple degradations." CVPR, 2018.
<img src="figs/fig1.png" width="440px"/>

While the classical degradation model can produce various LR images from a single HR image under different blur kernels, scale factors and noise levels, learning a single end-to-end trained deep model that inverts all such LR images back to the HR image is still lacking.

This work focuses on non-blind SISR, which assumes the LR image, scale factor, blur kernel and noise level are known beforehand. Non-blind SISR remains an active research direction for three reasons.

  • First, the blur kernel and noise level can be estimated, or are known based on other information (e.g., camera setting).
  • Second, users can control the preference of sharpness and smoothness by tuning the blur kernel and noise level.
  • Third, non-blind SISR can be an intermediate step towards solving blind SISR.

Unfolding algorithm

By unfolding the MAP inference via a half-quadratic splitting (HQS) algorithm, a fixed number of iterations, each alternately solving a data subproblem and a prior subproblem, can be obtained. Starting from the MAP objective

$$\min_{\mathbf{x}} \frac{1}{2\sigma^2}\left\|\mathbf{y}-(\mathbf{x}\otimes\mathbf{k})\downarrow_{\mathrm{s}}\right\|^2+\lambda\Phi(\mathbf{x}),$$

HQS introduces an auxiliary variable $\mathbf{z}$ and alternately solves

$$\mathbf{z}_k=\arg\min_{\mathbf{z}}\left\|\mathbf{y}-(\mathbf{z}\otimes\mathbf{k})\downarrow_{\mathrm{s}}\right\|^2+\mu\sigma^2\left\|\mathbf{z}-\mathbf{x}_{k-1}\right\|^2 \quad \text{(data subproblem)}$$

$$\mathbf{x}_k=\arg\min_{\mathbf{x}}\frac{\mu}{2}\left\|\mathbf{x}-\mathbf{z}_k\right\|^2+\lambda\Phi(\mathbf{x}) \quad \text{(prior subproblem)}$$

where $\mu$ is the penalty parameter. The data subproblem admits a closed-form solution, while the prior subproblem corresponds to denoising $\mathbf{z}_k$ with noise level $\sqrt{\lambda/\mu}$.

Deep unfolding SR network

We propose an end-to-end trainable unfolding network which leverages both learning-based methods and model-based methods. USRNet inherits the flexibility of model-based methods to super-resolve blurry, noisy images for different scale factors via a single model, while maintaining the advantages of learning-based methods.

<img src="figs/architecture.png" width="900px"/>

The overall architecture of the proposed USRNet with 8 iterations. USRNet can flexibly handle the classical degradation via a single model as it takes the LR image, scale factor, blur kernel and noise level as input. Specifically, USRNet consists of three main modules, including the data module D that makes HR estimation clearer, the prior module P that makes HR estimation cleaner, and the hyper-parameter module H that controls the outputs of D and P.

  • Data module D: closed-form solution for the data term; contains no trainable parameters
  • Prior module P: ResUNet denoiser for the prior term
  • Hyper-parameter module H: MLP for the hyper-parameter; acts as a slide bar to control the outputs of D and P
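The alternation of D, P and H can be sketched as follows. This is a minimal illustrative sketch, not the repository's implementation: for simplicity it uses the $\mathbf{s}=1$ (deblurring) special case, where the data subproblem has a well-known closed-form FFT solution, and it stands in a plain Gaussian filter for the learned ResUNet prior. The names `data_step`/`prior_step` and the linear $\mu$ schedule (which H would instead predict) are hypothetical:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def data_step(z, y, K, mu_sigma2):
    # Closed-form data subproblem in the Fourier domain for s = 1:
    #   x = F^-1( (conj(K)·Y + a·Z) / (|K|² + a) ),  a = mu·sigma²
    Z, Y = np.fft.fft2(z), np.fft.fft2(y)
    return np.real(np.fft.ifft2((np.conj(K) * Y + mu_sigma2 * Z) /
                                (np.abs(K) ** 2 + mu_sigma2)))

def prior_step(x):
    # Stand-in for the learned ResUNet prior P: a plain Gaussian denoiser
    return gaussian_filter(x, sigma=0.5)

def unfolded_sr(y, kernel, sigma=0.05, iters=8):
    # Zero-pad and center the blur kernel, then precompute its transfer function
    K_pad = np.zeros_like(y)
    kh, kw = kernel.shape
    K_pad[:kh, :kw] = kernel
    K_pad = np.roll(K_pad, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    K = np.fft.fft2(K_pad)
    x = y.copy()
    for i in range(iters):
        mu = 0.1 * (i + 1)                       # hypothetical schedule (role of H)
        x = data_step(x, y, K, mu * sigma ** 2)  # module D: enforce data fidelity
        x = prior_step(x)                        # module P: denoise the estimate
    return x
```

The key design choice mirrors the text above: D contains no trainable parameters (it is a closed-form solve), while all learning capacity sits in P and H.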

Models

| Model | # iters | # params | ResUNet |
|---|:--:|:---:|:---:|
| USRNet | 8 | 17.02M | 64-128-256-512 |
| USRGAN | 8 | 17.02M | 64-128-256-512 |
| USRNet-tiny | 6 | 0.59M | 16-32-64-64 |
| USRGAN-tiny | 6 | 0.59M | 16-32-64-64 |

Codes

Blur kernels

|<img src="figs/isotropic_gaussian.gif" width="285px"/>|<img src="figs/anisotropic_gaussian.gif" width="285px"/>|<img src="figs/motion.gif" width="285px"/>|
|:---:|:---:|:---:|
|<i>(a) Isotropic Gaussian kernels</i>|<i>(b) Anisotropic Gaussian kernels</i>|<i>(c) Motion blur kernels</i>|

While it has been pointed out that anisotropic Gaussian kernels suffice for the SISR task, an SISR method that can handle more complex blur kernels would be preferred in real applications.

Approximated bicubic kernel under classical SR degradation model assumption

|<img src="figs/bicubic_kernelx2.png" width="285px"/>|<img src="figs/bicubic_kernelx3.png" width="285px"/>|<img src="figs/bicubic_kernelx4.png" width="285px"/>|
|:---:|:---:|:---:|
|<i>(a) Bicubic kernel (x2)</i>|<i>(b) Bicubic kernel (x3)</i>|<i>(c) Bicubic kernel (x4)</i>|

The bicubic degradation can be approximated by setting a proper blur kernel for the classical degradation. Note that the bicubic kernels contain negative values.
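To see why a bicubic kernel contains negative values, one can sample the Keys cubic convolution kernel (with the common choice $a=-0.5$) stretched by the scale factor. This is a sketch of the idea, not necessarily the exact kernel in the figures above, which depends on the resizer implementation:

```python
import numpy as np

def cubic(x, a=-0.5):
    # Keys cubic convolution kernel, the basis function of bicubic interpolation
    x = np.abs(x)
    return np.where(x <= 1, (a + 2) * x**3 - (a + 3) * x**2 + 1,
           np.where(x < 2, a * x**3 - 5 * a * x**2 + 8 * a * x - 4 * a, 0.0))

def bicubic_kernel_1d(s):
    # 1D anti-aliased bicubic downsampling kernel for integer scale s:
    # the cubic kernel stretched by s, sampled on the integer grid
    taps = np.arange(-2 * s, 2 * s + 1)
    return cubic(taps / s) / s

k2 = bicubic_kernel_1d(2)      # x2 kernel; has negative side lobes
K2 = np.outer(k2, k2)          # separable 2D bicubic kernel (x2)
```

The negative side lobes come from the $1<|x|<2$ branch of the cubic basis, which is what gives bicubic resizing its mild sharpening behavior.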

PSNR results

Run main_test_table1.py to produce the following results.

<img src="figs/psnr.png" width="900px"/>

The table shows the average PSNR (dB) results of different methods for different combinations of scale factors, blur kernels and noise levels.

Visual results of USRNet

<img align="left" src="figs/butterfly_x2_k10_LR.png" width="240px"/> <img align="center" src="figs/butterfly_x3_k2_LR.png" width="240px"/> <img align="right" src="figs/butterfly_x4_k7_LR.png" width="240px"/>

<p align="center"><i>(a) LR images with scale factors 2, 3 and 4</i></p>

<img align="left" src="figs/butterfly_x2_k10_usrnet.png" width="240px"/> <img align="center" src="figs/butterfly_x3_k2_usrnet.png" width="240px"/> <img align="right" src="figs/butterfly_x4_k7_usrnet.png" width="240px"/>

<p align="center"><i>(b) Results by the single USRNet model with s = 2, 3 and 4</i></p>

Visual results of USRGAN

<img align="left" src="figs/parrot_x4_k3_LR.png" width="240px"/> <img align="center" src="figs/parrot_x4_k6_LR.png" width="240px"/> <img align="right" src="figs/parrot_x4_k12_LR.png" width="240px"/>

<p align="center"><i>(a) LR images</i></p>

<img align="left" src="figs/parrot_x4_k3_usrgan.png" width="240px"/> <img align="center" src="figs/parrot_x4_k6_usrgan.png" width="240px"/> <img align="right" src="figs/parrot_x4_k12_usrgan.png" width="240px"/>

<p align="center"><i>(b) Results by USRGAN(x4)</i></p>

|<img align="center" src="figs/test_57_x4_k1_LR.png" width="448px"/>|<img align="center" src="figs/test_57_x4_k1_usrgan.png" width="448px"/>|
|:---:|:---:|
