Using contrastive learning and OpenAI's CLIP to find good embeddings for images with lossy transformations

Overview

Creating Robust Representations from Pre-Trained Image Encoders using Contrastive Learning

Sriram Ravula, Georgios Smyrnis

This is the code for our project "Creating Robust Representations from Pre-Trained Image Encoders using Contrastive Learning". We make use of contrastive learning and OpenAI's CLIP to find good embeddings for images with lossy transformations.

Requirements

In order to run the code for our models, it is necessary to install pytorch_lightning and all of its dependencies. Moreover, it is necessary that the following files from the OpenAI CLIP repository (https://github.com/openai/CLIP) are added, along with their respective requirements:

Structure

The following source files are required to execute the various experiments mentioned in our report:

  • baselines.py: Code which performs training and evaluation of the baseline end-to-end supervised model.
  • noisy_clip_dataparallel.py: Performs training and evaluation of the student model, based on the CLIP architecture.
  • zeroshot_validation.py: Performs evaluation of the zero-shot model.
  • linear_probe.py: Performs training and evaluation of a linear probe on top of the learned representations.
  • noise_level_testing.py: Evaluation of a trained model on various noise levels added in the input.
  • utils.py: General library for functions used throughout our code.

We also provide slice_imagenet100.py, a code to be used one time to generate the ImageNet-100 subset we used, as defined by imagenet100.txt. In order to run most of the code we provide, please first run this file with the proper source path to the full ImageNet dataset (can be downloaded separately at https://image-net.org/download) and desired destination path for the 100-class subset. Then, provide the path to your 100-class ImageNet subset in the yaml config files. For further details, refer to the comments in slice_imagenet100.py and the global variables set at the beginning of the script.

In the config/ folder, some sample configuration files for our experiments are included.

Examples

Using the following snippets of code, the experiments described in the report can be run. Note that editing the batch_size and gpus parameters of the sample files will lead to speedup and increased performance for the contrastive models.

  • Short_Evaluation_Demo.ipynb: A small demo of the types of distortions we use, as well as a comparison between the baseline and linear evaluations. You will need to download the checkpoints from the google drive link for this to run.
  • python baselines.py --config_file config/Supervised_CLIP_Baselines/sample.yaml: Train a baseline model, in an end-to-end supervised fashion.
  • python noisy_clip_dataparallel.py --config_file config/NoisyRN101/sample.yaml: Trains a CLIP model using contrastive learning.
  • python zeroshot_validation.py --config_file config/NoisyRN101/sample.yaml --ckpt_file rand90_zeroshot.ckpt: Performs zeroshot evaluation of a trained zero-shot clip model. The sample file to be used is the same one specified during training (for flexibility, checkpoint file provided separately).
  • python linear_probe.py --config_file config/LinearProbeSubset/sample.yaml: Trains a linear probe on top of a representation learned using contrastive loss. This requires the user to specify a checkpoint file in the yaml config file.
  • python noise_level_testing.py --config_file config/NoiseLevelTesting/sample.yaml: Evaluates a trained model for various levels of noise in the dataset. This requires the user to specify a checkpoint file in the yaml config file.
Owner
Sriram Ravula
Sriram Ravula
Lipstick ain't enough: Beyond Color-Matching for In-the-Wild Makeup Transfer (CVPR 2021)

Table of Content Introduction Datasets Getting Started Requirements Usage Example Training & Evaluation CPM: Color-Pattern Makeup Transfer CPM is a ho

VinAI Research 248 Dec 13, 2022
TensorFlow implementation of "TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?"

TokenLearner: What Can 8 Learned Tokens Do for Images and Videos? Source: Improving Vision Transformer Efficiency and Accuracy by Learning to Tokenize

Aritra Roy Gosthipaty 23 Dec 24, 2022
LOFO (Leave One Feature Out) Importance calculates the importances of a set of features based on a metric of choice,

LOFO (Leave One Feature Out) Importance calculates the importances of a set of features based on a metric of choice, for a model of choice, by iteratively removing each feature from the set, and eval

Ahmet Erdem 691 Dec 23, 2022
A complete speech segmentation system using Kaldi and x-vectors for voice activity detection (VAD) and speaker diarisation.

bbc-speech-segmenter: Voice Activity Detection & Speaker Diarization A complete speech segmentation system using Kaldi and x-vectors for voice activit

BBC 16 Oct 27, 2022
Principled Detection of Out-of-Distribution Examples in Neural Networks

ODIN: Out-of-Distribution Detector for Neural Networks This is a PyTorch implementation for detecting out-of-distribution examples in neural networks.

189 Nov 29, 2022
Efficient training of deep recommenders on cloud.

HybridBackend Introduction HybridBackend is a training framework for deep recommenders which bridges the gap between evolving cloud infrastructure and

Alibaba 111 Dec 23, 2022
On the Adversarial Robustness of Visual Transformer

On the Adversarial Robustness of Visual Transformer Code for our paper "On the Adversarial Robustness of Visual Transformers"

Rulin Shao 35 Dec 14, 2022
HackBMU-5.0-Team-Ctrl-Alt-Elite - HackBMU 5.0 Team Ctrl Alt Elite

HackBMU-5.0-Team-Ctrl-Alt-Elite The search is over. We present to you ‘Health-A-

3 Feb 19, 2022
MSG-Transformer: Exchanging Local Spatial Information by Manipulating Messenger Tokens

MSG-Transformer Official implementation of the paper MSG-Transformer: Exchanging Local Spatial Information by Manipulating Messenger Tokens, by Jiemin

Hust Visual Learning Team 68 Nov 16, 2022
Object-aware Contrastive Learning for Debiased Scene Representation

Object-aware Contrastive Learning Official PyTorch implementation of "Object-aware Contrastive Learning for Debiased Scene Representation" by Sangwoo

43 Dec 14, 2022
AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning

AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning (NeurIPS 2020) Introduction AdaShare is a novel and differentiable approach fo

94 Dec 22, 2022
Pytorch implementation of Bert and Pals: Projected Attention Layers for Efficient Adaptation in Multi-Task Learning

PyTorch implementation of BERT and PALs Introduction Work by Asa Cooper Stickland and Iain Murray, University of Edinburgh. Code for BERT and PALs; mo

Asa Cooper Stickland 70 Dec 29, 2022
Code that accompanies the paper Semi-supervised Deep Kernel Learning: Regression with Unlabeled Data by Minimizing Predictive Variance

Semi-supervised Deep Kernel Learning This is the code that accompanies the paper Semi-supervised Deep Kernel Learning: Regression with Unlabeled Data

58 Oct 26, 2022
No-Reference Image Quality Assessment via Transformers, Relative Ranking, and Self-Consistency

This repository contains the implementation for the paper: No-Reference Image Quality Assessment via Transformers, Relative Ranking, and Self-Consiste

Alireza Golestaneh 75 Dec 30, 2022
Code for "R-GCN: The R Could Stand for Random"

RR-GCN: Random Relational Graph Convolutional Networks PyTorch Geometric code for the paper "R-GCN: The R Could Stand for Random" RR-GCN is an extensi

PreDiCT.IDLab 31 Sep 07, 2022
Learning Logic Rules for Document-Level Relation Extraction

LogiRE Learning Logic Rules for Document-Level Relation Extraction We propose to introduce logic rules to tackle the challenges of doc-level RE. Equip

41 Dec 26, 2022
Image Segmentation Animation using Quadtree concepts.

QuadTree Image Segmentation Animation using QuadTree concepts. Usage usage: quad.py [-h] [-fps FPS] [-i ITERATIONS] [-ws WRITESTART] [-b] [-img] [-s S

Alex Eidt 29 Dec 25, 2022
Multispectral Object Detection with Yolov5

Multispectral-Object-Detection Intro Official Code for Cross-Modality Fusion Transformer for Multispectral Object Detection. Multispectral Object Dete

Richard Fang 121 Jan 01, 2023
Vector Neurons: A General Framework for SO(3)-Equivariant Networks

Vector Neurons: A General Framework for SO(3)-Equivariant Networks Created by Congyue Deng, Or Litany, Yueqi Duan, Adrien Poulenard, Andrea Tagliasacc

Congyue Deng 332 Dec 29, 2022
Autoregressive Predictive Coding: An unsupervised autoregressive model for speech representation learning

Autoregressive Predictive Coding This repository contains the official implementation (in PyTorch) of Autoregressive Predictive Coding (APC) proposed

iamyuanchung 173 Dec 18, 2022