Code and training data for our ECCV 2016 paper on Unsupervised Learning

Overview

Shuffle and Learn (Shuffle Tuple)

Created by Ishan Misra

Based on the ECCV 2016 Paper - "Shuffle and Learn: Unsupervised Learning using Temporal Order Verification" link to paper.

This codebase contains the model and training data from our paper.

Introduction

Our code base is a mix of Python and C++ and uses the Caffe framework. Design decisions and some code is derived from the Fast-RCNN codebase by Ross Girshick.

Citing

If you find our code useful in your research, please consider citing:

@inproceedings{misra2016unsupervised,
  title={{Shuffle and Learn: Unsupervised Learning using Temporal Order Verification}},
  author={Misra, Ishan and Zitnick, C. Lawrence and Hebert, Martial},
  booktitle={ECCV},
  year={2016}
}

Benchmark Results

We summarize the results of finetuning our method here (details in the paper).

Action Recognition

| Dataset | Accuracy (split 1) | Accuracy (mean over splits) :--- | :--- | :--- | :--- UCF101 | 50.9 | 50.2 HMDB51 | 19.8 | 18.1

Pascal Action Classification (VOC2012): Coming soon

Pose estimation

  • FLIC: PCK (Mean, AUC) 84.7, 49.6
  • MPII: [email protected] (Upper, Full, AUC): 87.7, 85.8, 47.6

Object Detection

  • PASCAL VOC2007 test mAP of 42.4% using Fast RCNN.

We initialize conv1-5 using our unsupervised pre-training. We initialize fc6-8 randomly. We then follow the procedure from Krahenbuhl et al., 2016 to rescale our network and finetune all layers using their hyperparameters.

Surface Normal Prediction

  • NYUv2 (Coming soon)

Contents

  1. Requirements: software
  2. Models and Training Data
  3. Usage
  4. Utils

Requirements: software

  1. Requirements for Caffe and pycaffe (see: Caffe installation instructions)

Note: Caffe must be built with support for Python layers and OpenCV.

# In your Makefile.config, make sure to have this line uncommented
WITH_PYTHON_LAYER := 1
USE_OPENCV := 1

You can download a compatible fork of Caffe from here. Note that since our model requires Batch Normalization, you will need to have a fairly recent fork of caffe.

Models and Training Data

  1. Our model trained on tuples from UCF101 (train split 1, without using action labels) can be downloaded here.

  2. The tuples used for training our model can be downloaded as a zipped text file here. Each line of the file train01_image_keys.txt defines a tuple of three frames. The corresponding file train01_image_labs.txt has a binary label indicating whether the tuple is in the correct or incorrect order.

  3. Using the training tuples requires you to have the raw videos from the UCF101 dataset (link to videos). We extract frames from the videos and resize them such that the max dimension is 340 pixels. You can use ffmpeg to extract the frames. Example command: ffmpeg -i <video_name> -qscale 1 -f image2 <video_sub_name>/<video_sub_name>_%06d.jpg, where video_sub_name is the name of the raw video without the file extension.

Usage

  1. Once you have downloaded and formatted the UCF101 videos, you can use the networks/tuple_train.prototxt file to train your network. The only complicated part in the network definition is the data layer, which reads a tuple and a label. The data layer source file is in the python_layers subdirectory. Make sure to add this to your PYTHONPATH.
  2. Training for Action Recognition: We used the codebase from here
  3. Training for Pose Estimation: We used the codebase from here. Since this code does not use caffe for training a network, I have included a experimental data layer for caffe in python_layers/pose_data_layer.py

Utils

This repo also includes a bunch of utilities I used for training and debugging my models

  • python_layers/loss_tracking_layer: This layer tracks loss of each individual data point and its class label. This is useful for debugging as one can see the loss per class across epochs. Thanks to Abhinav Shrivastava for discussions on this.
  • model_training_utils: This is the wrapper code used to train the network if one wants to use the loss_tracking layer. These utilities not only track the loss, but also keep a log of various other statistics of the network - weights of the layers, norms of the weights, magnitude of change etc. For an example of how to use this check networks/tuple_exp.py. Thanks to Carl Doersch for discussions on this.
  • python_layers/multiple_image_multiple_label_data_layer: This is a fairly generic data layer that can read multiple images and data. It is based off my data layers repo.
Owner
Ishan Misra
Ishan Misra
[CVPR'21 Oral] Seeing Out of tHe bOx: End-to-End Pre-training for Vision-Language Representation Learning

Seeing Out of tHe bOx: End-to-End Pre-training for Vision-Language Representation Learning [CVPR'21, Oral] By Zhicheng Huang*, Zhaoyang Zeng*, Yupan H

Multimedia Research 196 Dec 13, 2022
This repository implements and evaluates convolutional networks on the Möbius strip as toy model instantiations of Coordinate Independent Convolutional Networks.

Orientation independent Möbius CNNs This repository implements and evaluates convolutional networks on the Möbius strip as toy model instantiations of

Maurice Weiler 59 Dec 09, 2022
keyframes-CNN-RNN(action recognition)

keyframes-CNN-RNN(action recognition) Environment: python=3.7 pytorch=1.2 Datasets: Following the format of UCF101 action recognition. Run steps: Mo

4 Feb 09, 2022
Custom Implementation of Non-Deep Networks

ParNet Custom Implementation of Non-deep Networks arXiv:2110.07641 Ankit Goyal, Alexey Bochkovskiy, Jia Deng, Vladlen Koltun Official Repository https

Pritama Kumar Nayak 20 May 27, 2022
Latex code for making neural networks diagrams

PlotNeuralNet Latex code for drawing neural networks for reports and presentation. Have a look into examples to see how they are made. Additionally, l

Haris Iqbal 18.6k Jan 01, 2023
ANEA: Distant Supervision for Low-Resource Named Entity Recognition

ANEA: Distant Supervision for Low-Resource Named Entity Recognition ANEA is a tool to automatically annotate named entities in unlabeled text based on

Saarland University Spoken Language Systems Group 15 Mar 30, 2022
Trading Gym is an open source project for the development of reinforcement learning algorithms in the context of trading.

Trading Gym Trading Gym is an open-source project for the development of reinforcement learning algorithms in the context of trading. It is currently

Dimitry Foures 535 Nov 15, 2022
Co-GAIL: Learning Diverse Strategies for Human-Robot Collaboration

CoGAIL Table of Content Overview Installation Dataset Training Evaluation Trained Checkpoints Acknowledgement Citations License Overview This reposito

Jeremy Wang 29 Dec 24, 2022
Keywords : Streamlit, BertTokenizer, BertForMaskedLM, Pytorch

Next Word Prediction Keywords : Streamlit, BertTokenizer, BertForMaskedLM, Pytorch 🎬 Project Demo ✔ Application is hosted on Streamlit. You can see t

Vivek7 3 Aug 26, 2022
Official implementation of NLOS-OT: Passive Non-Line-of-Sight Imaging Using Optimal Transport (IEEE TIP, accepted)

NLOS-OT Official implementation of NLOS-OT: Passive Non-Line-of-Sight Imaging Using Optimal Transport (IEEE TIP, accepted) Description In this reposit

Ruixu Geng(耿瑞旭) 16 Dec 16, 2022
Project Aquarium is a SUSE-sponsored open source project aiming at becoming an easy to use, rock solid storage appliance based on Ceph.

Project Aquarium Project Aquarium is a SUSE-sponsored open source project aiming at becoming an easy to use, rock solid storage appliance based on Cep

Aquarist Labs 73 Jul 21, 2022
A series of convenience functions to make basic image processing operations such as translation, rotation, resizing, skeletonization, and displaying Matplotlib images easier with OpenCV and Python.

imutils A series of convenience functions to make basic image processing functions such as translation, rotation, resizing, skeletonization, and displ

Adrian Rosebrock 4.3k Jan 08, 2023
Safe Control for Black-box Dynamical Systems via Neural Barrier Certificates

Safe Control for Black-box Dynamical Systems via Neural Barrier Certificates Installation Clone the repository: git clone https://github.com/Zengyi-Qi

Zengyi Qin 3 Oct 18, 2022
Convolutional Neural Network for Text Classification in Tensorflow

This code belongs to the "Implementing a CNN for Text Classification in Tensorflow" blog post. It is slightly simplified implementation of Kim's Convo

Denny Britz 5.5k Jan 02, 2023
U-Net implementation in PyTorch for FLAIR abnormality segmentation in brain MRI

U-Net for brain segmentation U-Net implementation in PyTorch for FLAIR abnormality segmentation in brain MRI based on a deep learning segmentation alg

562 Jan 02, 2023
[ICLR2021oral] Rethinking Architecture Selection in Differentiable NAS

DARTS-PT Code accompanying the paper ICLR'2021: Rethinking Architecture Selection in Differentiable NAS Ruochen Wang, Minhao Cheng, Xiangning Chen, Xi

Ruochen Wang 86 Dec 27, 2022
Code for our ALiBi method for transformer language models.

Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation This repository contains the code and models for our paper Tra

Ofir Press 211 Dec 31, 2022
Python implementation of "Elliptic Fourier Features of a Closed Contour"

PyEFD An Python/NumPy implementation of a method for approximating a contour with a Fourier series, as described in [1]. Installation pip install pyef

Henrik Blidh 71 Dec 09, 2022
Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.

Tensor2Tensor Tensor2Tensor, or T2T for short, is a library of deep learning models and datasets designed to make deep learning more accessible and ac

12.9k Jan 09, 2023
A new version of the CIDACS-RL linkage tool suitable to a cluster computing environment.

Fully Distributed CIDACS-RL The CIDACS-RL is a brazillian record linkage tool suitable to integrate large amount of data with high accuracy. However,

Robespierre Pita 5 Nov 04, 2022