The AugNet Python module contains functions for the fast computation of image similarity.

Last update: Dec 28, 2022

Overview

AugNet

AugNet: End-to-End Unsupervised Visual Representation Learning with Image Augmentation (arXiv:2106.06250)

We propose AugNet, a deep learning training paradigm that learns image features from a collection of unlabeled pictures. The method expresses the similarity between pictures as a distance metric in the embedding space by leveraging the inter-correlation between augmented versions of the samples. Our experiments demonstrate that it represents images in a low-dimensional space and performs competitively in downstream tasks such as image classification and image similarity comparison. Moreover, unlike many deep-learning-based image retrieval algorithms, our approach does not need external annotated datasets to train the feature extractor, yet it shows comparable or even better feature representation ability and is easy to use.

Install

pip install imgsim

Usage

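import cv2
import imgsim

# Create the vectorizer that turns images into feature vectors.
vtr = imgsim.Vectorizer()

# Load the two images to compare; cv2.imread returns None if a file cannot be read.
img0 = cv2.imread("img0.png")
img1 = cv2.imread("img1.png")

# Map each image to its feature vector.
vec0 = vtr.vectorize(img0)
vec1 = vtr.vectorize(img1)

# A smaller distance means the two images are more similar.
dist = imgsim.distance(vec0, vec1)
print("distance =", dist)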

Image Comparison Examples:

Please download the STL10 dataset from: https://cs.stanford.edu/~acoates/stl10/ and put the files under "./data/stl10_binary".

Please download the pretrained model from: https://drive.google.com/file/d/1pV3EBZPDDc3z_YKdRJu6ZBF5yn_IHhsK/view?usp=sharing and put the .pth file under "./models".

Run "res34_model_training_with_STL.py" if you would like to train your own model. Run "kmeans_demo.ipynb" to test with K-Means clustering.
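To illustrate what clustering embeddings with K-Means involves, here is a minimal sketch of Lloyd's algorithm in plain Python. It is not the code from "kmeans_demo.ipynb"; the function name `kmeans`, the fixed initial centroids, and the toy 2-D points are assumptions made for illustration, and real feature vectors would come from the trained model.

```python
import math


def kmeans(points, centroids, n_iter=10):
    """Plain Lloyd's algorithm (hypothetical helper). Returns (centroids, labels)."""
    centroids = [list(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Toy 2-D points stand in for image embeddings (assumption for illustration).
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
centroids, labels = kmeans(points, centroids=[points[0], points[3]])
print(labels)
```

Images whose embeddings fall in the same cluster can then be inspected together as a quick check of whether the learned features group visually similar pictures.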

The following are some image comparison examples on Paris6k, anime illustrations, Pokemon, and human sketches. The leftmost image in each row is the query. The remaining images are the top-K most similar images found in the dataset, ranked by the distance between their embeddings and the query's embedding.
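The top-K lookup described above can be sketched as a nearest-neighbor search over embeddings. This is a minimal illustration, assuming Euclidean distance (the metric behind `imgsim.distance` is not stated here); `euclidean`, `top_k_similar`, and the toy vectors are hypothetical names and data, not part of the package.

```python
import heapq
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k_similar(query_vec, gallery_vecs, k=5):
    """Return [(index, distance), ...] for the k gallery vectors closest to query_vec."""
    scored = ((euclidean(query_vec, v), i) for i, v in enumerate(gallery_vecs))
    # nsmallest sorts by distance first, then by index to break ties.
    return [(i, d) for d, i in heapq.nsmallest(k, scored)]


# Toy 2-D "embeddings" stand in for real feature vectors (assumption for illustration).
gallery = [[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [5.0, 5.0]]
query = [0.9, 0.1]
results = top_k_similar(query, gallery, k=2)
print(results)
```

For a large dataset, a brute-force scan like this is usually replaced by a vectorized or indexed search, but the ranking logic stays the same.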

If you find this work useful, please cite:

@misc{chen2021augnet,
    title={AugNet: End-to-End Unsupervised Visual Representation Learning with Image Augmentation},
    author={Mingxiang Chen and Zhanguo Chang and Haonan Lu and Bitao Yang and Zhuang Li and Liufang Guo and Zhecheng Wang},
    year={2021},
    eprint={2106.06250},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

TODO:

batch vectorization
multi-GPU support

Owner

Ming
