HINet
HINet: Half Instance Normalization Network for Image Restoration
Liangyu Chen, Xin Lu, Jie Zhang, Xiaojie Chu, Chengpeng Chen
Paper: https://arxiv.org/abs/2105.06086
In this paper, we explore the role of Instance Normalization in low-level vision tasks. Specifically, we present a novel block, the Half Instance Normalization Block (HIN Block), to boost the performance of image restoration networks. Based on the HIN Block, we design a simple and powerful multi-stage network named HINet, which consists of two subnetworks. With the help of the HIN Block, HINet surpasses the state-of-the-art (SOTA) on various image restoration tasks. For image denoising, we exceed it by 0.11 dB and 0.28 dB in PSNR on the SIDD dataset, with only 7.5% and 30% of its multiplier-accumulator operations (MACs) and 6.8 times and 2.9 times speedup, respectively. For image deblurring, we get comparable performance with 22.5% of its MACs and 3.3 times speedup on the REDS and GoPro datasets. For image deraining, we exceed it by 0.3 dB in PSNR on the average result of multiple datasets with 1.4 times speedup. With HINet, we won 1st place in the NTIRE 2021 Image Deblurring Challenge - Track2. JPEG Artifacts, with a PSNR of 29.70.
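As described in the paper, the HIN Block splits its intermediate features along the channel dimension, applies instance normalization to one half, and leaves the other half unchanged before joining them again. The sketch below illustrates only that normalization step in pure Python on nested lists. It omits the convolutions, the learnable affine parameters, and the residual connection of the real block, and the function names are illustrative rather than the ones used in this repository.

```python
import math


def instance_norm(channel, eps=1e-5):
    """Normalize one H x W channel to zero mean and (near) unit variance."""
    values = [v for row in channel for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    scale = 1.0 / math.sqrt(var + eps)
    return [[(v - mean) * scale for v in row] for row in channel]


def half_instance_norm(features, eps=1e-5):
    """Instance-normalize the first half of the channels only.

    `features` is a list of C channels, each an H x W nested list.
    The second half of the channels is copied through unchanged.
    """
    half = len(features) // 2
    normalized = [instance_norm(ch, eps) for ch in features[:half]]
    untouched = [[row[:] for row in ch] for ch in features[half:]]
    return normalized + untouched


# Toy feature map with 4 channels of size 2 x 2 (values are made up).
features = [
    [[1.0, 2.0], [3.0, 4.0]],      # channel 0: normalized
    [[10.0, 10.0], [20.0, 20.0]],  # channel 1: normalized
    [[5.0, 6.0], [7.0, 8.0]],      # channel 2: kept as-is
    [[0.0, 1.0], [0.0, 1.0]],      # channel 3: kept as-is
]
out = half_instance_norm(features)
```

In a PyTorch model the same idea would typically be expressed by chunking the tensor along the channel axis and normalizing one chunk; the list-based version here is only meant to make the operation easy to follow.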
Network Architecture
<img src="figures/pipeline.png" alt="arch" style="zoom:100%;" />
News
2022.04.12 Our new work, Simple Baselines for Image Restoration, reveals that nonlinear activation functions, e.g. ReLU, GELU, and Sigmoid, are not necessary to achieve SOTA performance. The paper provides a simple baseline, NAFNet: Nonlinear Activation Free Network for Image Restoration tasks, and achieves SOTA performance on Image Denoising and Image Deblurring. The paper and the code are available at https://arxiv.org/abs/2204.04676 / https://github.com/megvii-research/NAFNet respectively.
2021.12.10 Our new work, Revisiting Global Statistics Aggregation for Improving Image Restoration, exceeds the previous SOTA restorers by 0.6 dB (GoPro dataset) without retraining the model. It does so by revealing the feature distribution shift between the training phase and the testing phase. The paper and the code are available at https://arxiv.org/abs/2112.04491 / https://github.com/megvii-research/tlsc respectively.
Installation
This implementation is based on BasicSR, which is an open source toolbox for image/video restoration tasks.
python 3.6.9
pytorch 1.5.1
cuda 10.1
git clone https://github.com/megvii-model/HINet
cd HINet
pip install -r requirements.txt
python setup.py develop --no_cuda_ext
Quick Start (Single Image Inference)
python basicsr/demo.py -opt options/demo/demo.yml
- modify your input and output path
- define the network
- set the pretrained model; it should match the defined network.
- for pretrained models, see here
Image Restoration Tasks
Image denoise, deblur, derain.
<details><summary>Image Denoise - SIDD dataset (Click to expand) </summary>
prepare data
- mkdir ./datasets/SIDD
- download the train set (SIDD-Medium sRGB Dataset, from https://www.eecs.yorku.ca/~kamel/sidd/dataset.php) and unzip it. Then move the Data folder (./SIDD_Medium_Srgb/Data) to ./datasets/SIDD/ . Download the val files (ValidationNoisyBlocksSrgb.mat and ValidationGtBlocksSrgb.mat) to ./datasets/SIDD/ .
- it should be like:
  - ./datasets/SIDD/Data
  - ./datasets/SIDD/ValidationNoisyBlocksSrgb.mat
  - ./datasets/SIDD/ValidationGtBlocksSrgb.mat
- python scripts/data_preparation/sidd.py
  - crop the train image pairs to 512x512 patches.
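Before running the preparation script, it can help to confirm that the layout listed above is in place. The following is a minimal sketch, not part of this repository: the helper name `missing_entries` is hypothetical, and the list of expected entries is simply copied from the layout above.

```python
from pathlib import Path

# Entries that the layout above lists under ./datasets/SIDD.
SIDD_EXPECTED = [
    "Data",
    "ValidationNoisyBlocksSrgb.mat",
    "ValidationGtBlocksSrgb.mat",
]


def missing_entries(root, expected):
    """Return the expected entries that do not exist under `root`."""
    root = Path(root)
    return [name for name in expected if not (root / name).exists()]


# Report what is missing relative to the current working directory.
missing = missing_entries("./datasets/SIDD", SIDD_EXPECTED)
if missing:
    print("Missing under ./datasets/SIDD:", ", ".join(missing))
else:
    print("SIDD layout looks complete.")
```

The same helper can be pointed at the other dataset folders by passing a different root and list of expected entries.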
eval
- download the pretrained model (https://drive.google.com/file/d/1Y5YJQVNL0weifE--5us344bLwzBNS_sU/view?usp=sharing, https://drive.google.com/file/d/1CU5z-M90Jc-TAcVpEaFjDCYA09fkubGi/view?usp=sharing) to ./experiments/pretrained_models/HINet-SIDD-0.5x.pth (HINet-SIDD-1x.pth). (We use the output computed directly by HINet to avoid the PSNR loss caused by the "round()" operation, which is the same way it is done for other networks. For SSIM, our results are higher than those of MATLAB, so only PSNR is reported here.)
- python basicsr/test.py -opt options/test/SIDD/HINet-SIDD-0.5x.yml (HINet-SIDD-1x.yml)
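The remark about "round()" concerns quantizing the network output to integer pixel values before computing PSNR. The sketch below uses made-up pixel values on a 0-255 scale to show how the two numbers can differ; the `psnr` helper is illustrative and is not the metric function shipped in basicsr. In this particular example rounding lowers the PSNR, although in general the size and direction of the change depend on the errors involved.

```python
import math


def psnr(output, target, peak=255.0):
    """PSNR in dB between two equally sized sequences of pixel values."""
    mse = sum((o - t) ** 2 for o, t in zip(output, target)) / len(target)
    if mse == 0:
        return float("inf")
    return 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical ground truth and float-valued network output.
target = [100.0, 150.0, 200.0, 50.0]
raw_output = [100.6, 149.4, 200.6, 49.4]

# Quantize to integers, as happens when the result is stored as an 8-bit image.
rounded_output = [float(round(v)) for v in raw_output]

print(f"PSNR on float output:   {psnr(raw_output, target):.2f} dB")
print(f"PSNR on rounded output: {psnr(rounded_output, target):.2f} dB")
```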
train
- python -m torch.distributed.launch --nproc_per_node=8 --master_port=4321 basicsr/train.py -opt options/train/SIDD/HINet.yml (HINet_0.5x.yml) --launcher pytorch
- data in lmdb format will lose about 0.01 dB in PSNR
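The training command differs between runs only in the option file, the number of processes, and the port. As a small convenience sketch, the hypothetical helper below (not part of this repository) assembles the same argument list. `--nproc_per_node` is the number of processes that torch.distributed.launch starts on the machine, typically one per GPU; whether the provided configs give the reported results with a different GPU count is not stated here.

```python
import shlex


def build_train_command(opt_path, num_gpus=8, master_port=4321):
    """Return the distributed training command as an argument list."""
    return [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={num_gpus}",
        f"--master_port={master_port}",
        "basicsr/train.py",
        "-opt", opt_path,
        "--launcher", "pytorch",
    ]


# Example: the 0.5x SIDD config on a machine with 4 GPUs (an assumed setup).
cmd = build_train_command("options/train/SIDD/HINet_0.5x.yml", num_gpus=4)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The list form can be passed directly to `subprocess.run(cmd, check=True)` from the repository root.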
Image Deblur - GoPro dataset
prepare data
- mkdir ./datasets/GoPro
- download the train set to ./datasets/GoPro/train and the test set to ./datasets/GoPro/test (refer to MPRNet)
- it should be like:
  - ./datasets/
  - ./datasets/GoPro/
  - ./datasets/GoPro/train/
  - ./datasets/GoPro/train/input/
  - ./datasets/GoPro/train/target/
  - ./datasets/GoPro/test/
  - ./datasets/GoPro/test/input/
  - ./datasets/GoPro/test/target/
- python scripts/data_preparation/gopro.py
  - crop the train image pairs to 512x512 patches.
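To give a feel for what cropping into 512x512 patches involves, here is a minimal sketch that computes patch coordinates for one image. It is not the logic of scripts/data_preparation/gopro.py: the stride of 512 and the border handling (a final patch aligned to the image edge, which may overlap its neighbour) are assumptions made for illustration, and the real script may use a different stride or overlap.

```python
def patch_origins(length, patch=512, step=512):
    """Top-left offsets of patches along one axis, covering the full length.

    When the step does not divide the length evenly, one extra patch is
    aligned to the border, so it may overlap the previous patch.
    """
    if length < patch:
        return []
    origins = list(range(0, length - patch + 1, step))
    if origins[-1] + patch < length:
        origins.append(length - patch)
    return origins


def patch_boxes(height, width, patch=512, step=512):
    """(top, left, bottom, right) boxes for every patch in an image."""
    return [
        (top, left, top + patch, left + patch)
        for top in patch_origins(height, patch, step)
        for left in patch_origins(width, patch, step)
    ]


# GoPro frames are 1280 pixels wide and 720 pixels high.
boxes = patch_boxes(720, 1280)
for box in boxes:
    print(box)
```

The same boxes must be applied to the input image and its target so that each cropped pair stays aligned.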
eval
- download the pretrained model to ./experiments/pretrained_models/HINet-GoPro.pth
- python basicsr/test.py -opt options/test/GoPro/HINet-GoPro.yml
train
- python -m torch.distributed.launch --nproc_per_node=8 --master_port=4321 basicsr/train.py -opt options/train/GoPro/HINet.yml --launcher pytorch
Image Deblur - REDS dataset
prepare data
- mkdir ./datasets/REDS
- download the train / val set from train_blur, train_sharp, val_blur, val_sharp to ./datasets/REDS/ and unzip them.
- it should be like:
  - ./datasets/
  - ./datasets/REDS/
  - ./datasets/REDS/val/
  - ./datasets/REDS/val/val_blur_jpeg/
  - ./datasets/REDS/val/val_sharp/
  - ./datasets/REDS/train/
  - ./datasets/REDS/train/train_blur_jpeg/
  - ./datasets/REDS/train/train_sharp/
- python scripts/data_preparation/reds.py
  - flatten the folders and extract 300 validation images.
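The flattening step above can be pictured with the sketch below, which copies per-sequence frames into a single folder and then thins the result. This is not the code in scripts/data_preparation/reds.py: the `<sequence>_<frame>` naming scheme and the every-n-th selection rule are assumptions for illustration, since the README only states that folders are flattened and 300 validation images are extracted.

```python
import shutil
from pathlib import Path


def flatten_sequences(src_root, dst_root):
    """Copy src_root/<sequence>/<frame> to dst_root/<sequence>_<frame>."""
    src_root, dst_root = Path(src_root), Path(dst_root)
    dst_root.mkdir(parents=True, exist_ok=True)
    copied = []
    for seq_dir in sorted(p for p in src_root.iterdir() if p.is_dir()):
        for frame in sorted(p for p in seq_dir.iterdir() if p.is_file()):
            target = dst_root / f"{seq_dir.name}_{frame.name}"
            shutil.copy2(frame, target)
            copied.append(target)
    return copied


def pick_every_nth(paths, n):
    """Keep every n-th path, one simple way to thin out a validation set."""
    return paths[::n]
```

For example, `pick_every_nth(flatten_sequences(src, dst), 10)` would keep one frame in ten from the flattened folder, where `src` and `dst` are paths you choose.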
eval
- download the pretrained model to ./experiments/pretrained_models/HINet-REDS.pth
- python basicsr/test.py -opt options/test/REDS/HINet-REDS.yml
train
- python -m torch.distributed.launch --nproc_per_node=8 --master_port=4321 basicsr/train.py -opt options/train/REDS/HINet.yml --launcher pytorch
Image Derain - Rain13k dataset
prepare data
- mkdir ./datasets/Rain13k
- download the [train](https://drive.google.com/drive/folders/1H
