Image super-resolution through deep learning

Related tags

Deep Learningsrez
Overview

srez

Image super-resolution through deep learning. This project uses deep learning to upscale 16x16 images by a 4x factor. The resulting 64x64 images display sharp features that are plausible based on the dataset that was used to train the neural net.

Here's an random, non cherry-picked, example of what this network can do. From left to right, the first column is the 16x16 input image, the second one is what you would get from a standard bicubic interpolation, the third is the output generated by the neural net, and on the right is the ground truth.

Example output

As you can see, the network is able to produce a very plausible reconstruction of the original face. As the dataset is mainly composed of well-illuminated faces looking straight ahead, the reconstruction is poorer when the face is at an angle, poorly illuminated, or partially occluded by eyeglasses or hands.

This particular example was produced after training the network for 3 hours on a GTX 1080 GPU, equivalent to 130,000 batches or about 10 epochs.

How it works

In essence the architecture is a DCGAN where the input to the generator network is the 16x16 image rather than a multinomial gaussian distribution.

In addition to that the loss function of the generator has a term that measures the L1 difference between the 16x16 input and downscaled version of the image produced by the generator.

The adversarial term of the loss function ensures the generator produces plausible faces, while the L1 term ensures that those faces resemble the low-res input data. We have found that this L1 term greatly accelerates the convergence of the network during the first batches and also appears to prevent the generator from getting stuck in a poor local solution.

Finally, the generator network relies on ResNet modules as we've found them to train substantially faster than more old-fashioned architectures. The adversarial network is much simpler as the use of ResNet modules did not provide an advantage during our experimentation.

Requirements

You will need Python 3 with Tensorflow, numpy, scipy and moviepy. See requirements.txt for details.

Dataset

After you have the required software above you will also need the Large-scale CelebFaces Attributes (CelebA) Dataset. The model expects the Align&Cropped Images version. Extract all images to a subfolder named dataset. I.e. srez/dataset/lotsoffiles.jpg.

Training the model

Training with default settings: python3 srez_main.py --run train. The script will periodically output an example batch in PNG format onto the srez/train folder, and checkpoint data will be stored in the srez/checkpoint folder.

After the network has trained you can also produce an animation showing the evolution of the output by running python3 srez_main.py --run demo.

About the author

LinkedIn profile of David Garcia.

Owner
David Garcia
Software engineer specialized on GPU architecture and device drivers. A few billion smartphones have shipped with his work.
David Garcia
PyZebrascope - an open-source Python platform for brain-wide neural activity imaging in behaving zebrafish

PyZebrascope - an open-source Python platform for brain-wide neural activity imaging in behaving zebrafish

1 May 31, 2022
An Evaluation of Generative Adversarial Networks for Collaborative Filtering.

An Evaluation of Generative Adversarial Networks for Collaborative Filtering. This repository was developed by Fernando B. Pérez Maurera. Fernando is

Fernando Benjamín PÉREZ MAURERA 0 Jan 19, 2022
mPose3D, a mmWave-based 3D human pose estimation model.

mPose3D, a mmWave-based 3D human pose estimation model.

KylinChen 35 Nov 08, 2022
A simple command line tool for text to image generation, using OpenAI's CLIP and a BigGAN.

Ryan Murdock has done it again, combining OpenAI's CLIP and the generator from a BigGAN! This repository wraps up his work so it is easily accessible to anyone who owns a GPU.

Phil Wang 2.3k Jan 09, 2023
Inference pipeline for our participation in the FeTA challenge 2021.

feta-inference Inference pipeline for our participation in the FeTA challenge 2021. Team name: TRABIT Installation Download the two folders in https:/

Lucas Fidon 2 Apr 13, 2022
《A-CNN: Annularly Convolutional Neural Networks on Point Clouds》(2019)

A-CNN: Annularly Convolutional Neural Networks on Point Clouds Created by Artem Komarichev, Zichun Zhong, Jing Hua from Department of Computer Science

Artёm Komarichev 44 Feb 24, 2022
DC540 hacking challenge 0x00005a.

dc540-0x00005a DC540 hacking challenge 0x00005a. PROMOTIONAL VIDEO - WATCH NOW HERE ON YOUTUBE CRITICAL PART 5A VIDEO - WATCH NOW HERE ON YOUTUBE Prio

Kevin Thomas 3 May 09, 2022
Tutel MoE: An Optimized Mixture-of-Experts Implementation

Project Tutel Tutel MoE: An Optimized Mixture-of-Experts Implementation. Supported Framework: Pytorch Supported GPUs: CUDA(fp32 + fp16), ROCm(fp32) Ho

Microsoft 344 Dec 29, 2022
Pytorch Performace Tuning, WandB, AMP, Multi-GPU, TensorRT, Triton

Plant Pathology 2020 FGVC7 Introduction A deep learning model pipeline for training, experimentaiton and deployment for the Kaggle Competition, Plant

Bharat Giddwani 0 Feb 25, 2022
Linescanning - Package for (pre)processing of anatomical and (linescanning) fMRI data

line scanning repository This repository contains all of the tools used during the acquisition and postprocessing of line scanning data at the Spinoza

Jurjen Heij 4 Sep 14, 2022
Language Models Can See: Plugging Visual Controls in Text Generation

Language Models Can See: Plugging Visual Controls in Text Generation Authors: Yixuan Su, Tian Lan, Yahui Liu, Fangyu Liu, Dani Yogatama, Yan Wang, Lin

Yixuan Su 195 Dec 22, 2022
Simply enable or disable your Nvidia dGPU

EnvyControl (WIP) Simply enable or disable your Nvidia dGPU Usage First clone this repo and install envycontrol with sudo pip install . CLI Turn off y

Victor Bayas 292 Jan 03, 2023
Python package for missing-data imputation with deep learning

MIDASpy Overview MIDASpy is a Python package for multiply imputing missing data using deep learning methods. The MIDASpy algorithm offers significant

MIDASverse 77 Dec 03, 2022
Anti-UAV base on PaddleDetection

Paddle-Anti-UAV Anti-UAV base on PaddleDetection Background UAVs are very popular and we can see them in many public spaces, such as parks and playgro

Qingzhong Wang 2 Apr 20, 2022
CrossNorm and SelfNorm for Generalization under Distribution Shifts (ICCV 2021)

CrossNorm (CN) and SelfNorm (SN) (Accepted at ICCV 2021) This is the official PyTorch implementation of our CNSN paper, in which we propose CrossNorm

100 Dec 28, 2022
A large dataset of 100k Google Satellite and matching Map images, resembling pix2pix's Google Maps dataset.

Larger Google Sat2Map dataset This dataset extends the aerial ⟷ Maps dataset used in pix2pix (Isola et al., CVPR17). The provide script download_sat2m

34 Dec 28, 2022
CLDF dataset derived from Robbeets et al.'s "Triangulation Supports Agricultural Spread" from 2021

CLDF dataset derived from Robbeets et al.'s "Triangulation Supports Agricultural Spread" from 2021 How to cite If you use these data please cite the o

Digital Linguistics 2 Dec 20, 2021
A deep learning network built with TensorFlow and Keras to classify gender and estimate age.

Convolutional Neural Network (CNN). This repository contains a source code of a deep learning network built with TensorFlow and Keras to classify gend

Pawel Dziemiach 1 Dec 19, 2021
The official implementation of the Hybrid Self-Attention NEAT algorithm

PUREPLES - Pure Python Library for ES-HyperNEAT About This is a library of evolutionary algorithms with a focus on neuroevolution, implemented in pure

Adrian Westh 91 Dec 12, 2022
Unsupervised phone and word segmentation using dynamic programming on self-supervised VQ features.

Unsupervised Phone and Word Segmentation using Vector-Quantized Neural Networks Overview Unsupervised phone and word segmentation on speech data is pe

Herman Kamper 13 Dec 11, 2022