The AugNet Python module contains functions for the fast computation of image similarity.

Last update: Dec 28, 2022

Overview

AugNet

AugNet: End-to-End Unsupervised Visual Representation Learning with Image Augmentation (arXiv:2106.06250)

We propose AugNet, a deep learning training paradigm that learns image features from a collection of unlabeled pictures. The method expresses the similarity between pictures as a distance in the embedding space by leveraging the correlation between augmented versions of samples. Our experiments show that it represents images in a low-dimensional space and performs competitively on downstream tasks such as image classification and image similarity comparison. Unlike many deep-learning-based image retrieval algorithms, it does not need external annotated datasets to train the feature extractor, yet it shows comparable or even better feature representation ability and remains easy to use.

Install

pip install imgsim

Usage

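import imgsim
import cv2

# Create the vectorizer that turns images into embedding vectors
vtr = imgsim.Vectorizer()

# Read the two images to compare
img0 = cv2.imread("img0.png")
img1 = cv2.imread("img1.png")

# Compute one embedding vector per image
vec0 = vtr.vectorize(img0)
vec1 = vtr.vectorize(img1)

# Distance between the two embeddings
dist = imgsim.distance(vec0, vec1)
print("distance =", dist)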
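Batch vectorization is still listed under TODO, so several images have to be vectorized one at a time. The helper below is a minimal sketch of that loop, not part of imgsim: `vectorize_all` is a hypothetical name, and the loader and vectorizer are passed in as arguments so the function makes no assumption about the library beyond the calls shown above.

```python
from typing import Any, Callable, Dict, Iterable


def vectorize_all(
    paths: Iterable[str],
    load: Callable[[str], Any],
    vectorize: Callable[[Any], Any],
) -> Dict[str, Any]:
    """Vectorize each image path one at a time and return {path: vector}."""
    vectors: Dict[str, Any] = {}
    for path in paths:
        img = load(path)
        # cv2.imread returns None instead of raising when a file cannot be read
        if img is None:
            raise FileNotFoundError(f"could not read image: {path}")
        vectors[path] = vectorize(img)
    return vectors
```

With the objects from the usage example, a call would look like `vectorize_all(["img0.png", "img1.png"], cv2.imread, vtr.vectorize)`.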

Image Comparison Examples:

Please download the STL10 dataset from: https://cs.stanford.edu/~acoates/stl10/ and put the files under "./data/stl10_binary".

Please download the pretrained model from: https://drive.google.com/file/d/1pV3EBZPDDc3z_YKdRJu6ZBF5yn_IHhsK/view?usp=sharing and put the .pth file under "./models".

Run "res34_model_training_with_STL.py" if you would like to train your own model. Run "kmeans_demo.ipynb" to test with K-Means clustering.

The following are some image comparison examples, covering Paris6k, anime illustrations, Pokémon, and human sketches. The leftmost images are the queries. The remaining images are the top-K most similar images that the algorithm found in the dataset, ranked by the distance between their embeddings and the query's embedding.
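The top-K lookup described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the code used to produce the examples: `top_k_similar` and `euclidean` are hypothetical names, the gallery is a plain dict from image name to embedding, and Euclidean distance stands in for whatever metric `imgsim.distance` actually computes.

```python
import heapq
import math
from typing import Dict, List, Sequence, Tuple


def euclidean(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def top_k_similar(
    query: Sequence[float],
    gallery: Dict[str, Sequence[float]],
    k: int = 5,
) -> List[Tuple[str, float]]:
    """Return up to k (name, distance) pairs from the gallery, nearest first."""
    scored = ((euclidean(query, vec), name) for name, vec in gallery.items())
    # nsmallest keeps only the k closest entries and returns them sorted
    return [(name, dist) for dist, name in heapq.nsmallest(k, scored)]
```

To rank with the library's own metric, `euclidean` could be swapped for `imgsim.distance`, with the embeddings coming from `vtr.vectorize`.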

If you use this work, please cite:

@misc{chen2021augnet,
    title={AugNet: End-to-End Unsupervised Visual Representation Learning with Image Augmentation},
    author={Mingxiang Chen and Zhanguo Chang and Haonan Lu and Bitao Yang and Zhuang Li and Liufang Guo and Zhecheng Wang},
    year={2021},
    eprint={2106.06250},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

TODO:

batch vectorization
multi-GPU support
