Joint Unsupervised Learning (JULE) of Deep Representations and Image Clusters.

Overview

Joint Unsupervised Learning (JULE) of Deep Representations and Image Clusters.

Overview

This project is a Torch implementation for our CVPR 2016 paper, which performs jointly unsupervised learning of deep CNN and image clusters. The intuition behind this is that better image representation will facilitate clustering, while better clustering results will help representation learning. Given a unlabeled dataset, it will iteratively learn CNN parameters unsupervisedly and cluster images.

Disclaimer

This is a torch version reimplementation to the code used in our CVPR paper. There is a slight difference between the code used to report the results in our paper. The Caffe version code can be found here.

License

This code is released under the MIT License (refer to the LICENSE file for details).

Citation

If you find our code is useful in your researches, please consider citing:

@inproceedings{yangCVPR2016joint,
    Author = {Yang, Jianwei and Parikh, Devi and Batra, Dhruv},
    Title = {Joint Unsupervised Learning of Deep Representations and Image Clusters},
    Booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    Year = {2016}
}

Dependencies

  1. Torch. Install Torch by:

    $ curl -s https://raw.githubusercontent.com/torch/ezinstall/master/install-deps | bash
    $ git clone https://github.com/torch/distro.git ~/torch --recursive
    $ cd ~/torch; 
    $ ./install.sh      # and enter "yes" at the end to modify your bashrc
    $ source ~/.bashrc

    After installing torch, you may also need install some packages using LuaRocks:

    $ luarocks install nn
    $ luarocks install image 

    It is preferred to run the code on GPU. Thus you need to install cunn:

    $ luarocks install cunn
  2. lua-knn. It is used to compute the distance between neighbor samples. Go into the folder, and then compile it with:

    $ luarocks make

Typically, you can run our code after installing the above two packages. Please let me know if error occurs.

Installation Using Nvidia-Docker

  1. Run docker build -t .
  2. Run nvidia-docker run -it /bin/bash

Train model

  1. It is very simple to run the code for training model. For example, if you want to train on USPS dataset, you can run:

    $ th train.lua -dataset USPS -eta 0.9

    Note that it runs on fast mode by default. You can change it to regular mode by setting "-use_fast 0". In the above command, eta is the unfolding rate. For face dataset, we recommand 0.2, while for other datasets, it is set to 0.9 to save training time. During training, you will see the normalize mutual information (NMI) for the clustering results.

  2. You can train multiple models in parallel by:

    $ th train.lua -dataset USPS -eta 0.9 -num_nets 5

    By this way, you weill get 5 different models, and thus 5 possible different results. Statistics such as mean and stddev can be computed on these results.

  3. You can also get the clustering performance when using raw image data and random CNN by

    $ th train.lua -dataset USPS -eta 0.9 -updateCNN 0
  4. You can also change other hyper parameters for model training, such as K_s, K_c, number of epochs in each partial unrolled period, etc.

Datasets

We upload six small datasets: COIL-20, USPS, MNIST-test, CMU-PIE, FRGC, UMist. The other large datasets, COIL-100, MNIST-full and YTF can be found in my google drive here.

Train on your own datasets

Alternatively, you can train the model on your own dataset. As preparations, you need:

  1. Create a hdf5 file with size of NxCxHxW, where N is the total number of images, C is the number of channels, H is the height of image, and W the width of image. Then move it to datasets/dataset_name/data4torch.h5

  2. Create a lua file to define the network architecture for your dataset. Put it in models_def/dataset_name.lua.

  3. Afterwards, you can run train.lua by specifying the dataset name as your own dataset. That's it!

Compared Approaches

We upload the code for the compared approaches in matlab folder. Please refer to the original paper for details and cite them properly. In this foler, we also attach the evaluation code for two metric: normalized mutual information (NMI) and clustering accuracy (AC).

Q&A

You are welcome to send message to (jw2yang at vt.edu) if you have any issue on this code.

Owner
Jianwei Yang
Senior Researcher @ Microsoft
Jianwei Yang
a grammar based feedback fuzzer

Nautilus NOTE: THIS IS AN OUTDATE REPOSITORY, THE CURRENT RELEASE IS AVAILABLE HERE. THIS REPO ONLY SERVES AS A REFERENCE FOR THE PAPER Nautilus is a

Chair for Sys­tems Se­cu­ri­ty 158 Dec 28, 2022
Automatic Calibration for Non-repetitive Scanning Solid-State LiDAR and Camera Systems

ACSC Automatic extrinsic calibration for non-repetitive scanning solid-state LiDAR and camera systems. System Architecture 1. Dependency Tested with U

KINO 192 Dec 13, 2022
TensorFlow implementation of Barlow Twins (Barlow Twins: Self-Supervised Learning via Redundancy Reduction)

Barlow-Twins-TF This repository implements Barlow Twins (Barlow Twins: Self-Supervised Learning via Redundancy Reduction) in TensorFlow and demonstrat

Sayak Paul 36 Sep 14, 2022
GNNAdvisor: An Efficient Runtime System for GNN Acceleration on GPUs

GNNAdvisor: An Efficient Runtime System for GNN Acceleration on GPUs [Paper, Slides, Video Talk] at USENIX OSDI'21 @inproceedings{GNNAdvisor, title=

YUKE WANG 47 Jan 03, 2023
Suite of 500 procedurally-generated NLP tasks to study language model adaptability

TaskBench500 The TaskBench500 dataset and code for generating tasks. Data The TaskBench dataset is available under wget http://web.mit.edu/bzl/www/Tas

Belinda Li 20 May 17, 2022
Send text to girlfriend in the morning

Girlfriend Text Send text to girlfriend (or really anyone with a phone number) in the morning 1. Configure your settings in utils.py. phone_number = "

Paras Adhikary 199 Oct 25, 2022
A Python library that enables ML teams to share, load, and transform data in a collaborative, flexible, and efficient way :chestnut:

Squirrel Core Share, load, and transform data in a collaborative, flexible, and efficient way What is Squirrel? Squirrel is a Python library that enab

Merantix Momentum 249 Dec 07, 2022
MetaAvatar: Learning Animatable Clothed Human Models from Few Depth Images

MetaAvatar: Learning Animatable Clothed Human Models from Few Depth Images This repository contains the implementation of our paper MetaAvatar: Learni

sfwang 96 Dec 13, 2022
SCU OlympicsRunning Baseline

Competition 1v1 running Environment check details in Jidi Competition RLChina2021智能体竞赛 做出的修改: 奖励重塑:修改了环境,重新设置了奖励的分配,使得奖励组成不只有零和博弈,还有探索环境的奖励。 算法微调:修改了官

ZiSeoi Wong 2 Nov 23, 2021
Notebooks em Python para Métodos Eletromagnéticos

GeoSci Labs This is a repository of code used to power the notebooks and interactive examples for https://em.geosci.xyz and https://gpg.geosci.xyz. Th

Victor Cezar Tocantins 1 Nov 16, 2021
This is an open-source toolkit for Heterogeneous Graph Neural Network(OpenHGNN) based on DGL [Deep Graph Library] and PyTorch.

This is an open-source toolkit for Heterogeneous Graph Neural Network(OpenHGNN) based on DGL [Deep Graph Library] and PyTorch.

BUPT GAMMA Lab 519 Jan 02, 2023
A Python module for the generation and training of an entry-level feedforward neural network.

ff-neural-network A Python module for the generation and training of an entry-level feedforward neural network. This repository serves as a repurposin

Riadh 2 Jan 31, 2022
Official repository for "PAIR: Planning and Iterative Refinement in Pre-trained Transformers for Long Text Generation"

pair-emnlp2020 Official repository for the paper: Xinyu Hua and Lu Wang: PAIR: Planning and Iterative Refinement in Pre-trained Transformers for Long

Xinyu Hua 31 Oct 13, 2022
Official repository for GCR rerank, a GCN-based reranking method for both image and video re-ID

Official repository for GCR rerank, a GCN-based reranking method for both image and video re-ID

53 Nov 22, 2022
Deep-learning X-Ray Micro-CT image enhancement, pore-network modelling and continuum modelling

EDSR modelling A Github repository for deep-learning image enhancement, pore-network and continuum modelling from X-Ray Micro-CT images. The repositor

Samuel Jackson 7 Nov 03, 2022
Benchmarking the robustness of Spatial-Temporal Models

Benchmarking the robustness of Spatial-Temporal Models This repositery contains the code for the paper Benchmarking the Robustness of Spatial-Temporal

Yi Chenyu Ian 15 Dec 16, 2022
TACTO: A Fast, Flexible and Open-source Simulator for High-Resolution Vision-based Tactile Sensors

TACTO: A Fast, Flexible and Open-source Simulator for High-Resolution Vision-based Tactile Sensors This package provides a simulator for vision-based

Facebook Research 255 Dec 27, 2022
Visual Memorability for Robotic Interestingness via Unsupervised Online Learning (ECCV 2020 Oral and TRO)

Visual Interestingness Refer to the project description for more details. This code based on the following paper. Chen Wang, Yuheng Qiu, Wenshan Wang,

Chen Wang 36 Sep 08, 2022
Code for the AAAI 2022 paper "Zero-Shot Cross-Lingual Machine Reading Comprehension via Inter-Sentence Dependency Graph".

multilingual-mrc-isdg Code for the AAAI 2022 paper "Zero-Shot Cross-Lingual Machine Reading Comprehension via Inter-Sentence Dependency Graph". This r

Liyan 5 Dec 07, 2022
Keras code and weights files for popular deep learning models.

Trained image classification models for Keras THIS REPOSITORY IS DEPRECATED. USE THE MODULE keras.applications INSTEAD. Pull requests will not be revi

François Chollet 7.2k Dec 29, 2022