A computer vision pipeline to identify the "icons" in Christian paintings

Overview

Christian-Iconography

Open In Colab Screenshot from 2022-01-08 18-26-30

A computer vision pipeline to identify the "icons" in Christian paintings.

A bit about iconography.

Iconography is related to identifying the subject itself in the image. So, for instance when I say Christian Iconography I would mean that I am trying to identify some objects like crucifix or mainly in this project the saints!

Inspiration

I was looking for some interesting problem to solve and I came across RedHenLab's barnyard of projects and it had some really wonderful ideas there and this particular one intrigued me. On the site they didn't have much progress on it as the datasets were not developed on this subject but after surfing around I found something and just like that I got started!

Dataset used.

The project uses the ArtDL dataset which contains 42,479 images of artworks portraying Christian saints, divided in 10 classes: Saint Dominic (iconclass 11HH(DOMINIC)), Saint Francis of Assisi (iconclass 11H(FRANCIS)), Saint Jerome (iconclass 11H(JEROME)), Saint John the Baptist (iconclass 11H(JOHN THE BAPTIST)), Saint Anthony of Padua (iconclass 11H(ANTONY OF PADUA), Saint Mary Magdalene (iconclass 11HH(MARY MAGDALENE)), Saint Paul (iconclass 11H(PAUL)), Saint Peter (iconclass 11H(PETER)), Saint Sebastian (iconclass 11H(SEBASTIAN)) and Virgin Mary (iconclass 11F). All images are associated with high-level annotations specifying which iconography classes appear in them (from a minimum of 1 class to a maximum of 7 classes).

Sources

Screenshot from 2022-01-08 18-08-56

Preprocessing steps.

All the images were first padded so that the resolution is sort of intact when the image is resized. A dash of normalization and some horizontal flips and the dataset is ready to be eaten/trained on by our model xD.

Architecture used.

As mentioned the ArtDL dataset has around 43k images and hence training it completely wouldn't make sense. Hence a ResNet50 pretrained model was used.

But there is a twist.

Instead of just having the final classifying layer trained we only freeze the initial layer as it has gotten better at recognizing patterns from a lot of images it might have trained on. And then we fine-tune the deeper layers so that it learns the art after the initial abstraction. Another deviation is to replace the final linear layer by 1x1 conv layer to make the classification.

Quantiative Results.

Training

I trained the network for 10 epochs which took around 3 hours and used Stochastic Gradient Descent with LR=0.01 and momentum 0.9. The accuracy I got was 64% on the test set which can be further improved.

Classification Report

Screenshot from 2022-01-10 22-07-52

From the classification report it is clear that Saint MARY has the most number of samples in the training set and the precision for that is high. On the other hand other samples are low in number and hence their scores are low and hence we can't infer much except the fact that we need to oversample some of these classes so that we can gain more meaningful resuls w.r.t accuracy and of course these metrics as well

Qualitative Results

We try an image of Saint Dominic and see what our classifier is really learning.

Screenshot from 2022-01-10 22-10-37

Saliency Map

Screenshot from 2022-01-10 22-12-31

We can notice that regions around are more lighter than elsewhere which could mean that our classifier at least knows where to look :p

Guided-Backpropagation

Screenshot from 2022-01-10 22-14-26

So what really guided backprop does is that it points out the positve influences while classifiying an image. From this result we can see that it is really ignoring the padding applied and focussing more on the body and interesting enough the surroundings as well

Grad-CAM!

Screenshot from 2022-01-10 22-15-27

As expected the Grad-CAM when used shows the hot regions in our images and it is around the face and interesting enough the surrounding so maybe it could be that surroundings do have a role-play in type of saint?

Possible improvements.

  • Finding more datasets
  • Or working on the architecture maybe?
  • Using GANs to generate samples and make classifier stronger

Citations

@misc{milani2020data,
title={A Data Set and a Convolutional Model for Iconography Classification in Paintings},
author={Federico Milani and Piero Fraternali},
eprint={2010.11697},
archivePrefix={arXiv},
primaryClass={cs.CV},
year={2020}
}

RedhenLab's barnyard of projects

Owner
Rishab Mudliar
AKA Start At The Beginning.
Rishab Mudliar
Heat transfer problemas solved using python

heat-transfer Heat transfer problems solved using python isolation-convection.py compares the temperature distribution on the problem as shown in the

2 Nov 14, 2021
Pytorch implemenation of Stochastic Multi-Label Image-to-image Translation (SMIT)

SMIT: Stochastic Multi-Label Image-to-image Translation This repository provides a PyTorch implementation of SMIT. SMIT can stochastically translate a

Biomedical Computer Vision Group @ Uniandes 37 Mar 01, 2022
2021 National Underwater Robotics Vision Optics

2021-National-Underwater-Robotics-Vision-Optics 2021年全国水下机器人算法大赛-光学赛道-B榜精度第18名 (Kilian_Di的团队:A榜[email pro

Di Chang 9 Nov 04, 2022
Repo for the Tutorials of Day1-Day3 of the Nordic Probabilistic AI School 2021 (https://probabilistic.ai/)

ProbAI 2021 - Probabilistic Programming and Variational Inference Tutorial with Pryo Day 1 (June 14) Slides Notebook: students_PPLs_Intro Notebook: so

PGM-Lab 46 Nov 01, 2022
BisQue is a web-based platform designed to provide researchers with organizational and quantitative analysis tools for 5D image data. Users can extend BisQue by implementing containerized ML workflows.

Overview BisQue is a web-based platform specifically designed to provide researchers with organizational and quantitative analysis tools for up to 5D

Vision Research Lab @ UCSB 26 Nov 29, 2022
Code, Models and Datasets for OpenViDial Dataset

OpenViDial This repo contains downloading instructions for the OpenViDial dataset in 《OpenViDial: A Large-Scale, Open-Domain Dialogue Dataset with Vis

119 Dec 08, 2022
This project provides the code and datasets for 'CapSal: Leveraging Captioning to Boost Semantics for Salient Object Detection', CVPR 2019.

Code-and-Dataset-for-CapSal This project provides the code and datasets for 'CapSal: Leveraging Captioning to Boost Semantics for Salient Object Detec

lu zhang 48 Aug 19, 2022
Large-scale open domain KNOwledge grounded conVERsation system based on PaddlePaddle

Knover Knover is a toolkit for knowledge grounded dialogue generation based on PaddlePaddle. Knover allows researchers and developers to carry out eff

607 Dec 31, 2022
It is a simple library to speed up CLIP inference up to 3x (K80 GPU)

CLIP-ONNX It is a simple library to speed up CLIP inference up to 3x (K80 GPU) Usage Install clip-onnx module and requirements first. Use this trick !

Gerasimov Maxim 93 Dec 20, 2022
A 10000+ hours dataset for Chinese speech recognition

WenetSpeech Official website | Paper A 10000+ Hours Multi-domain Chinese Corpus for Speech Recognition Download Please visit the official website, rea

310 Jan 03, 2023
Source code for Transformer-based Multi-task Learning for Disaster Tweet Categorisation (UCD's participation in TREC-IS 2020A, 2020B and 2021A).

Source code for "UCD participation in TREC-IS 2020A, 2020B and 2021A". *** update at: 2021/05/25 This repo so far relates to the following work: Trans

Congcong Wang 4 Oct 19, 2021
[ICCV2021] Safety-aware Motion Prediction with Unseen Vehicles for Autonomous Driving

Safety-aware Motion Prediction with Unseen Vehicles for Autonomous Driving Safety-aware Motion Prediction with Unseen Vehicles for Autonomous Driving

Xuanchi Ren 44 Dec 03, 2022
Seeing if I can put together an interactive version of 3b1b's Manim in Streamlit

streamlit-manim Seeing if I can put together an interactive version of 3b1b's Manim in Streamlit Installation I had to install pango with sudo apt-get

Adrien Treuille 6 Aug 03, 2022
codes for paper Combining Dynamic Local Context Focus and Dependency Cluster Attention for Aspect-level sentiment classification

DLCF-DCA codes for paper Combining Dynamic Local Context Focus and Dependency Cluster Attention for Aspect-level sentiment classification. submitted t

15 Aug 30, 2022
Official implementation of "StyleCariGAN: Caricature Generation via StyleGAN Feature Map Modulation" (SIGGRAPH 2021)

StyleCariGAN: Caricature Generation via StyleGAN Feature Map Modulation This repository contains the official PyTorch implementation of the following

Wonjong Jang 270 Dec 30, 2022
Ascend your Jupyter Notebook usage

Jupyter Ascending Sync Jupyter Notebooks from any editor About Jupyter Ascending lets you edit Jupyter notebooks from your favorite editor, then insta

Untitled AI 254 Jan 08, 2023
A task Provided by A respective Artenal Ai and Ml based Company to complete it

A task Provided by A respective Alternal Ai and Ml based Company to complete it .

Parth Madan 1 Jan 25, 2022
FS-Mol: A Few-Shot Learning Dataset of Molecules

FS-Mol is A Few-Shot Learning Dataset of Molecules, containing molecular compounds with measurements of activity against a variety of protein targets. The dataset is presented with a model evaluation

Microsoft 114 Dec 15, 2022
Supplementary code for the AISTATS 2021 paper "Matern Gaussian Processes on Graphs".

Matern Gaussian Processes on Graphs This repo provides an extension for gpflow with Matérn kernels, inducing variables and trainable models implemente

41 Dec 17, 2022
Integrated physics-based and ligand-based modeling.

ComBind ComBind integrates data-driven modeling and physics-based docking for improved binding pose prediction and binding affinity prediction. Given

Dror Lab 44 Oct 26, 2022