# DPED

Software and pre-trained models for automatic photo quality enhancement using Deep Convolutional Networks.
## DSLR-Quality Photos on Mobile Devices with Deep Convolutional Networks

<br/><img src="https://aiff22.github.io/assets/img/teaser_git.jpg"/><br/>

#### 1. Overview [Paper] [Project webpage] [Enhancing RAW photos] [Rendering Bokeh Effect]
The provided code implements the paper's end-to-end deep learning approach for translating ordinary photos from smartphones into DSLR-quality images. The learned model can be applied to photos of arbitrary resolution, and the methodology itself generalizes to any type of digital camera. More visual results can be found here.
#### 2. Prerequisites
- Python + Pillow, scipy, numpy, imageio packages
- TensorFlow 1.x / 2.x + CUDA + cuDNN
- Nvidia GPU
#### 3. First steps
- Download the pre-trained VGG-19 model and put it into the `vgg_pretrained/` folder
- Download the DPED dataset (patches for CNN training) and extract it into the `dped/` folder.
  <sub>This folder should contain three subfolders: `sony/`, `iphone/` and `blackberry/`</sub>
#### 4. Train the model

```bash
python train_model.py model=<model>
```
Obligatory parameters:

`model`: `iphone`, `blackberry` or `sony`

Optional parameters and their default values:

`batch_size`: `50` - batch size [smaller values can lead to unstable training] <br/>
`train_size`: `30000` - the number of training patches randomly loaded each `eval_step` iterations <br/>
`eval_step`: `1000` - each `eval_step` iterations the model is saved and the training data is reloaded <br/>
`num_train_iters`: `20000` - the number of training iterations <br/>
`learning_rate`: `5e-4` - learning rate <br/>
`w_content`: `10` - the weight of the content loss <br/>
`w_color`: `0.5` - the weight of the color loss <br/>
`w_texture`: `1` - the weight of the texture [adversarial] loss <br/>
`w_tv`: `2000` - the weight of the total variation loss <br/>
`dped_dir`: `dped/` - path to the folder with the DPED dataset <br/>
`vgg_dir`: `vgg_pretrained/imagenet-vgg-verydeep-19.mat` - path to the pre-trained VGG-19 network <br/>
Example:

```bash
python train_model.py model=iphone batch_size=50 dped_dir=dped/ w_color=0.7
```
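The four loss weights above combine into a single training objective. A minimal numpy sketch of that combination, using the default weights; note that the per-term functions here are illustrative placeholders (the real content, color and texture losses in `train_model.py` are computed on VGG-19 features, blurred images and an adversarial discriminator):

```python
import numpy as np

# Default loss weights from the training parameters above.
w_content, w_color, w_texture, w_tv = 10.0, 0.5, 1.0, 2000.0

def tv_loss(img):
    """Total variation: mean absolute difference between neighbouring pixels."""
    dh = np.abs(img[1:, :] - img[:-1, :]).mean()
    dw = np.abs(img[:, 1:] - img[:, :-1]).mean()
    return dh + dw

# Placeholder per-term losses for illustration only.
def content_loss(a, b): return np.mean((a - b) ** 2)
def color_loss(a, b):   return np.mean(np.abs(a - b))
def texture_loss(a, b): return np.mean(np.abs(a - b))

rng = np.random.default_rng(0)
enhanced = rng.random((8, 8))
target   = rng.random((8, 8))

total = (w_content * content_loss(enhanced, target)
         + w_color * color_loss(enhanced, target)
         + w_texture * texture_loss(enhanced, target)
         + w_tv * tv_loss(enhanced))
```

The large `w_tv` weight compensates for the very small magnitude of the total variation term on normalized images.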
<br/>
#### 5. Test the provided pre-trained models

```bash
python test_model.py model=<model>
```
Obligatory parameters:

`model`: `iphone_orig`, `blackberry_orig` or `sony_orig`

Optional parameters:

`test_subset`: `full`, `small` - all 29 or only 5 test images will be processed <br/>
`resolution`: `orig`, `high`, `medium`, `small`, `tiny` - the resolution of the test images [`orig` means original resolution] <br/>
`use_gpu`: `true`, `false` - run models on GPU or CPU <br/>
`dped_dir`: `dped/` - path to the folder with the DPED dataset <br/>
Example:

```bash
python test_model.py model=iphone_orig test_subset=full resolution=orig use_gpu=true
```
<br/>
#### 6. Test the obtained models

```bash
python test_model.py model=<model>
```
Obligatory parameters:

`model`: `iphone`, `blackberry` or `sony`

Optional parameters:

`test_subset`: `full`, `small` - all 29 or only 5 test images will be processed <br/>
`iteration`: `all` or `<number>` - get visual results for all iterations or for the specific iteration; `<number>` must be a multiple of `eval_step` <br/>
`resolution`: `orig`, `high`, `medium`, `small`, `tiny` - the resolution of the test images [`orig` means original resolution] <br/>
`use_gpu`: `true`, `false` - run models on GPU or CPU <br/>
`dped_dir`: `dped/` - path to the folder with the DPED dataset <br/>
Example:

```bash
python test_model.py model=iphone iteration=13000 test_subset=full resolution=orig use_gpu=true
```
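Because checkpoints are saved every `eval_step` iterations, a specific `iteration` value only resolves to an existing checkpoint when it is a multiple of `eval_step`. A minimal sketch of that check (the function name is hypothetical, not from `test_model.py`):

```python
def valid_iteration(iteration, eval_step=1000):
    """Return True if a checkpoint exists for this iteration,
    i.e. the value is a positive multiple of eval_step."""
    return iteration > 0 and iteration % eval_step == 0

print(valid_iteration(13000))  # a checkpoint was saved at iteration 13000
print(valid_iteration(13500))  # falls between two eval_step saves
```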
<br/>
#### 7. Folder structure

`dped/` - the folder with the DPED dataset <br/>
`models/` - logs and models that are saved during the training process <br/>
`models_orig/` - the provided pre-trained models for `iphone`, `sony` and `blackberry` <br/>
`results/` - visual results for small image patches that are saved while training <br/>
`vgg_pretrained/` - the folder with the pre-trained VGG-19 network <br/>
`visual_results/` - processed [enhanced] test images <br/>
<br/>
`load_dataset.py` - python script that loads training data <br/>
`models.py` - architecture of the image enhancement [resnet] and adversarial networks <br/>
`ssim.py` - implementation of the ssim score <br/>
`train_model.py` - implementation of the training procedure <br/>
`test_model.py` - applying the pre-trained models to test images <br/>
`utils.py` - auxiliary functions <br/>
`vgg.py` - loading the pre-trained VGG-19 network <br/>
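As the examples above show, `train_model.py` and `test_model.py` take parameters in `key=value` form rather than `--key value` flags. A minimal sketch of how such arguments can be parsed (a simplified stand-in for the actual parsing in `utils.py`, keeping all values as strings):

```python
import sys

def parse_args(argv, defaults):
    """Parse key=value command-line arguments, falling back to defaults."""
    params = dict(defaults)
    for arg in argv:
        key, _, value = arg.partition("=")
        if key in params:
            params[key] = value
    return params

# e.g. `python train_model.py model=iphone w_color=0.7`
params = parse_args(["model=iphone", "w_color=0.7"],
                    {"model": None, "batch_size": "50", "w_color": "0.5"})
```

In a script, `sys.argv[1:]` would be passed as the first argument.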
#### 8. Problems and errors
What if I get an error: "OOM when allocating tensor with shape [...]"?
Your GPU does not have enough memory. If this happens during the training process:
- Decrease the size of the training batch [`batch_size`]. Note however that smaller values can lead to unstable training.
If this happens while testing the models:
- Run the model on CPU (set the parameter `use_gpu` to `false`). Note that this can take up to 5 minutes per image. <br/>
- Use cropped images: set the parameter `resolution` to:

`high` - center crop of size `1680x1260` pixels <br/>
`medium` - center crop of size `1366x1024` pixels <br/>
`small` - center crop of size `1024x768` pixels <br/>
`tiny` - center crop of size `800x600` pixels <br/>
The lower the resolution, the smaller the part of the image that will be processed.
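The presets above correspond to a simple center crop. A minimal numpy sketch, with the crop sizes taken from the list above (the function name is hypothetical, not from the repository code):

```python
import numpy as np

# (width, height) crop sizes for each resolution preset above.
CROP_SIZES = {"high": (1680, 1260), "medium": (1366, 1024),
              "small": (1024, 768), "tiny": (800, 600)}

def center_crop(image, resolution):
    """Cut a centered crop of the preset size from an H x W x C image."""
    w, h = CROP_SIZES[resolution]
    H, W = image.shape[:2]
    top, left = (H - h) // 2, (W - w) // 2
    return image[top:top + h, left:left + w]

photo = np.zeros((2448, 3264, 3))          # a full-resolution test image
print(center_crop(photo, "small").shape)   # (768, 1024, 3)
```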
<br/>

#### 9. Citation

```
@inproceedings{ignatov2017dslr,
  title={DSLR-Quality Photos on Mobile Devices with Deep Convolutional Networks},
  author={Ignatov, Andrey and Kobyshev, Nikolay and Timofte, Radu and Vanhoey, Kenneth and Van Gool, Luc},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  pages={3277--3285},
  year={2017}
}
```
#### 10. Any further questions?

Please contact Andrey Ignatov (andrey.ignatoff@gmail.com) for more information.
