An implementation of the paper "A Neural Algorithm of Artistic Style"

Overview

A Neural Algorithm of Artistic Style implementation - Neural Style Transfer

This is an implementation of the research paper "A Neural Algorithm of Artistic Style" by Leon A. Gatys, Alexander S. Ecker, and Matthias Bethge.

Inspiration

How biological vision perceives artistic images is still not well understood. No artificial system yet fully interprets our visual experience of art. The method proposed in this paper is a significant step towards explaining how biological vision might work when perceiving fine art.


Introduction

To quote the authors Leon A. Gatys, Alexander S. Ecker, and Matthias Bethge, "in light of the striking similarities between performance-optimised artificial neural networks and biological vision, our work offers a path forward to an algorithmic understanding of how humans create and perceive artistic imagery."

The idea of Neural Style Transfer is to take a white noise image as input and change it iteratively so that it resembles the content of the content image and the texture/artistic style of the style image, producing a new stylized image.

We define two distances: a content distance that measures how much the content of two images differs, and a style distance that measures how much their style differs. The aim is to transform the white noise input so that its content distance to the content image and its style distance to the style image are both minimized.
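
For reference, the two distances can be written as in the paper. Here F, P and A are the feature responses at layer l of the generated image, the content image and the style image respectively, G and A are Gram matrices of those responses, N_l is the number of feature maps at layer l, and M_l is the size (height times width) of each map. The layer weights w_l and the trade-off factors alpha and beta are hyperparameters; the values and exact scaling used in this repository are not stated here and may differ.

```latex
\begin{align}
\mathcal{L}_{\text{content}}(\vec{p}, \vec{x}, l)
  &= \frac{1}{2} \sum_{i,j} \left( F^{l}_{ij} - P^{l}_{ij} \right)^{2} \\
G^{l}_{ij}
  &= \sum_{k} F^{l}_{ik} F^{l}_{jk} \\
E_{l}
  &= \frac{1}{4 N_{l}^{2} M_{l}^{2}} \sum_{i,j} \left( G^{l}_{ij} - A^{l}_{ij} \right)^{2} \\
\mathcal{L}_{\text{style}}(\vec{a}, \vec{x})
  &= \sum_{l} w_{l} E_{l} \\
\mathcal{L}_{\text{total}}(\vec{p}, \vec{a}, \vec{x})
  &= \alpha \, \mathcal{L}_{\text{content}}(\vec{p}, \vec{x})
   + \beta \, \mathcal{L}_{\text{style}}(\vec{a}, \vec{x})
\end{align}
```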

Given below are some results from the original implementation


Model Components

Our model architecture is as follows:

  • One module defines three classes: two that compute the loss functions for the content and style images, and a third that applies normalization to the input values.
  • A second module defines a single class, NST, with three methods -
    • A method for image preprocessing.
    • Content and Style Model Representation - We used the feature space provided by the 16 convolutional and 5 pooling layers of the VGG-19 network. The five style reconstructions were generated by matching the style representations on layers 'conv1_1', 'conv2_1', 'conv3_1', 'conv4_1' and 'conv5_1'. The content representation was matched on layer 'conv4_2'. Minimizing the content and style losses together transforms the white noise input into an image that applies the artistic style of the style image to the content of the content image.
    • A method for training - This third method calls the two methods above. It takes the content and style inputs from the user, preprocesses them, and runs the neural style transfer algorithm on a white noise input image for 300 iterations with L-BFGS as the optimizer. The output is a generated image that combines the given content and style images.
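
The paper-style layer names above have to be translated into positions in the network. The sketch below is not taken from this repository: it assumes the common VGG-19 layout without batch normalization, in which every convolution is followed by a ReLU and each block ends with a max-pool (the layout of the `features` sequence in torchvision's VGG-19). The names `VGG19_CFG`, `name_vgg19_layers`, `STYLE_LAYERS` and `CONTENT_LAYERS` are hypothetical.

```python
# Assumed VGG-19 layout: numbers are conv output channels, "M" is a max-pool.
# Every conv is followed by a ReLU, so one conv occupies two positions.
VGG19_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, 256, "M",
             512, 512, 512, 512, "M", 512, 512, 512, 512, "M"]


def name_vgg19_layers(cfg=VGG19_CFG):
    """Map paper-style names (conv4_2, relu1_1, pool3, ...) to flat indices."""
    names = {}
    block, conv, index = 1, 0, 0
    for item in cfg:
        if item == "M":
            names[f"pool{block}"] = index
            index += 1
            block += 1   # a pool closes the current block
            conv = 0
        else:
            conv += 1
            names[f"conv{block}_{conv}"] = index
            names[f"relu{block}_{conv}"] = index + 1
            index += 2
    return names


LAYER_INDEX = name_vgg19_layers()
STYLE_LAYERS = ["conv1_1", "conv2_1", "conv3_1", "conv4_1", "conv5_1"]
CONTENT_LAYERS = ["conv4_2"]

print([LAYER_INDEX[name] for name in STYLE_LAYERS])    # [0, 5, 10, 19, 28]
print([LAYER_INDEX[name] for name in CONTENT_LAYERS])  # [21]
```

Under this assumed layout the mapping contains 16 `conv` entries and 5 `pool` entries, which matches the layer counts quoted above.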


Implementation Details

  • PIL images have values between 0 and 255, but when transformed into torch tensors their values are rescaled to lie between 0 and 1, which is the range that networks from the torch library are trained on. The content and style images also need to be resized to the same dimensions. The image_loader() function takes the content and style image paths, loads both images, creates a white noise input image, and returns the three tensors.
  • The style_model_and_losses() function builds the model and returns it together with the content and style losses. It inserts each content loss and style loss layer immediately after the convolution layer whose features it measures.
  • To quote the authors, "To generate the images that mix the content of a photograph with the style of a painting we jointly minimise the distance of a white noise image from the content representation of the photograph in one layer of the network and the style representation of the painting in a number of layers of the CNN." The run_nst() function performs the neural transfer. At each iteration, the updated input is fed into the network and new losses are computed. The backward method of each loss module is run to dynamically compute its gradients. The optimizer requires a closure() function that re-evaluates the model and returns the loss.
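
What the content and style loss layers compute can be sketched without any framework. The example below is an illustration only, not the repository's code: it works on plain Python lists, treats each layer's activations as a matrix with one row per feature map, and uses the scaling from the paper. The repository's loss classes operate on torch tensors and may normalize differently. All function names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]  # one row per feature map, flattened over space


def gram_matrix(features: Matrix) -> Matrix:
    """G[i][j] = sum over positions k of F[i][k] * F[j][k]."""
    return [[sum(a * b for a, b in zip(row_i, row_j)) for row_j in features]
            for row_i in features]


def content_loss(generated: Matrix, content: Matrix) -> float:
    """Half the summed squared difference between two feature matrices."""
    return 0.5 * sum((f - p) ** 2
                     for row_f, row_p in zip(generated, content)
                     for f, p in zip(row_f, row_p))


def style_layer_loss(generated: Matrix, style: Matrix) -> float:
    """Squared Gram-matrix difference, scaled by 1 / (4 * N^2 * M^2)."""
    n, m = len(generated), len(generated[0])  # N feature maps of size M
    g, a = gram_matrix(generated), gram_matrix(style)
    total = sum((g[i][j] - a[i][j]) ** 2 for i in range(n) for j in range(n))
    return total / (4 * n ** 2 * m ** 2)


features = [[1.0, 2.0], [3.0, 4.0]]
shuffled = [[2.0, 1.0], [4.0, 3.0]]  # same values, spatial positions swapped

print(gram_matrix(features))                 # [[5.0, 11.0], [11.0, 25.0]]
print(content_loss(shuffled, features))      # 2.0
print(style_layer_loss(shuffled, features))  # 0.0
```

The last two lines show why the Gram matrix works as a style measure: swapping spatial positions changes the content loss but leaves the style loss at zero, because the Gram matrix only records how feature maps correlate, not where they fire.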

Note - Owing to computational power limitations, the content and style images are resized to 512x512 when using a GPU or 128x128 when on a CPU. It is advisable to use a GPU for training because Neural Style Transfer is computationally very expensive.
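
The size rule and the 0 to 255 rescaling mentioned earlier can be summarized in a few lines. This is a framework-free sketch with hypothetical helper names, not the repository's image_loader(). In a PyTorch setup the GPU check is typically `torch.cuda.is_available()` and the rescaling is typically done by `torchvision.transforms.ToTensor`; whether this repository uses exactly those calls is an assumption.

```python
from typing import List


def choose_image_size(has_gpu: bool) -> int:
    """Side length in pixels: 512 on a GPU, 128 on a CPU."""
    return 512 if has_gpu else 128


def to_unit_range(pixels: List[int]) -> List[float]:
    """Rescale 8-bit pixel values (0..255) to floats in 0..1."""
    return [value / 255.0 for value in pixels]


size = choose_image_size(has_gpu=False)
print((size, size))                  # (128, 128)
print(to_unit_range([0, 51, 255]))   # [0.0, 0.2, 1.0]
```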

Usage Guidelines

  • Cloning the Repository:

      git clone https://github.com/srijarkoroy/ArtiStyle
    
  • Entering the directory:

      cd ArtiStyle
    
  • Setting up the Python Environment with dependencies:

      pip install -r requirements.txt
    
  • Running the file:

      python3 test.py
    

Note: Before running the test file, make sure you provide valid paths to a content image and a style image. To save the output image, also set path='path to save the output image'.

Check out the demo notebook here.

Results from implementation

Content Image Style Image Output Image

Contributors

Owner
Srijarko Roy