This implements the learning and inference/proposal algorithm described in "Learning to Propose Objects, Krähenbühl and Koltun"

Related tags

Deep Learninglpo
Overview

Learning to propose objects

This implements the learning and inference/proposal algorithm described in "Learning to Propose Objects, Krähenbühl and Koltun, CVPR 2015".

Dependencies:

  • c++11 compiler (gcc >= 4.7)
  • cmake
  • boost-python
  • python (2.7 or 3.1+ should both work)
  • numpy
  • libmatio (optional)
  • libpng, libjpeg
  • Eigen 3 (3.2.0 or newer)
  • OpenMP (optional but recommended)

Compilation:

Go to the top level directory

mkdir build
cd build
cmake .. -DCMAKE_BUILD_TYPE=Release -DDATA_DIR=/path/to/datasets -DUSE_PYTHON=ON
make -j9

Here "-DUSE_PYTHON" specifies that the python wrapper should be built (highly recommended). You can use python 2.7 by specifying "-DUSE_PYTHON=2", any other argument will try to build a python 3 wrapper.

The flag "-DDATA_DIR=/path/to/datasets" is optional and can point to a directory containing the VOC2012, VOC2007 or COCO datset. Specify this path if you want to train or evaluate LPO on those dataset.

"/path/to/datasets" can be any directory containing subdirectories:

  • 'VOC2012/ImageSets'
  • 'VOC2012/SegmentationClass',
  • 'VOC2012/Annotations'
  • 'COCO/train2014'
  • 'COCO/val2014'
  • ...

and files:

  • 'COCO/instances_train2014.json'
  • 'COCO/instances_val2014.json'.

The coco files can be downloaded from http://mscoco.org/, the PASCAL VOC dataset http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2012/index.html .

The code should compile and run fine on both Linux and Mac OS, let me know if you have any difficulty or find a bug. For Windows you're on your own.

Experiments

The code to reproduce most results in the paper is included here. All experiments should be run from the src directory.

To generate the main comparison in table 3 run:

bash eval_all.sh

To analyze a model like table 2 run:

python analyze_model.py path/to/model

To do the bounding box evaluation call:

python eval_box.py path/to/output_file path/to/model1 path/to/model2 path/to/model3 path/to/model4

This will create a binary file measuring number of proposals vs best overlap per object. You can then use the results/box.py script to generate the bounding box evaluation and produce the plots. For your convenience we included the precomputed results of many prior methods on VOC 2012 in results/box/*.dat.

Citation

If you're using this code in a scientific publication please cite:

@inproceedings{kk-lpo-15,
  author    = {Philipp Kr{\"{a}}henb{\"{u}}hl and
               Vladlen Koltun},
  title     = {Learning to Propose Objects},
  booktitle = {CVPR},
  year      = {2015},
}

License

All my code is published under a BSD license, so feel free to reuse and/or share it. There are some dependencies which are under different licenses and/or patented. All those dependencies are located in the external directory.

Owner
Philipp Krähenbühl
Philipp Krähenbühl
Video-Music Transformer

VMT Video-Music Transformer (VMT) is an attention-based multi-modal model, which generates piano music for a given video. Paper https://arxiv.org/abs/

Chin-Tung Lin 5 Jul 13, 2022
PECOS - Prediction for Enormous and Correlated Spaces

PECOS - Predictions for Enormous and Correlated Output Spaces PECOS is a versatile and modular machine learning (ML) framework for fast learning and i

Amazon 387 Jan 04, 2023
Compact Bidirectional Transformer for Image Captioning

Compact Bidirectional Transformer for Image Captioning Requirements Python 3.8 Pytorch 1.6 lmdb h5py tensorboardX Prepare Data Please use git clone --

YE Zhou 19 Dec 12, 2022
Load What You Need: Smaller Multilingual Transformers for Pytorch and TensorFlow 2.0.

Smaller Multilingual Transformers This repository shares smaller versions of multilingual transformers that keep the same representations offered by t

Geotrend 79 Dec 28, 2022
A New Open-Source Off-road Environment for Benchmark Generalization of Autonomous Driving

A New Open-Source Off-road Environment for Benchmark Generalization of Autonomous Driving Isaac Han, Dong-Hyeok Park, and Kyung-Joong Kim IEEE Access

13 Dec 27, 2022
Keras + Hyperopt: A very simple wrapper for convenient hyperparameter optimization

This project is now archived. It's been fun working on it, but it's time for me to move on. Thank you for all the support and feedback over the last c

Max Pumperla 2.1k Jan 03, 2023
Official implementation of "UCTransNet: Rethinking the Skip Connections in U-Net from a Channel-wise Perspective with Transformer"

[AAAI2022] UCTransNet This repo is the official implementation of "UCTransNet: Rethinking the Skip Connections in U-Net from a Channel-wise Perspectiv

Haonan Wang 199 Jan 03, 2023
Pytorch implementation of the paper Improving Text-to-Image Synthesis Using Contrastive Learning

T2I_CL This is the official Pytorch implementation of the paper Improving Text-to-Image Synthesis Using Contrastive Learning Requirements Linux Python

42 Dec 31, 2022
Just Go with the Flow: Self-Supervised Scene Flow Estimation

Just Go with the Flow: Self-Supervised Scene Flow Estimation Code release for the paper Just Go with the Flow: Self-Supervised Scene Flow Estimation,

Himangi Mittal 50 Nov 22, 2022
Source code for the NeurIPS 2021 paper "On the Second-order Convergence Properties of Random Search Methods"

Second-order Convergence Properties of Random Search Methods This repository the paper "On the Second-order Convergence Properties of Random Search Me

Adamos Solomou 0 Nov 13, 2021
A Simple and Versatile Framework for Object Detection and Instance Recognition

SimpleDet - A Simple and Versatile Framework for Object Detection and Instance Recognition Major Features FP16 training for memory saving and up to 2.

TuSimple 3k Dec 12, 2022
PyElastica is the Python implementation of Elastica, an open-source software for the simulation of assemblies of slender, one-dimensional structures using Cosserat Rod theory.

PyElastica PyElastica is the python implementation of Elastica: an open-source project for simulating assemblies of slender, one-dimensional structure

Gazzola Lab 105 Jan 09, 2023
GeneGAN: Learning Object Transfiguration and Attribute Subspace from Unpaired Data

GeneGAN: Learning Object Transfiguration and Attribute Subspace from Unpaired Data By Shuchang Zhou, Taihong Xiao, Yi Yang, Dieqiao Feng, Qinyao He, W

Taihong Xiao 141 Apr 16, 2021
Source code for the paper: Variance-Aware Machine Translation Test Sets (NeurIPS 2021 Datasets and Benchmarks Track)

Variance-Aware-MT-Test-Sets Variance-Aware Machine Translation Test Sets License See LICENSE. We follow the data licensing plan as the same as the WMT

NLP2CT Lab, University of Macau 5 Dec 21, 2021
Official Implementation for Fast Training of Neural Lumigraph Representations using Meta Learning.

Fast Training of Neural Lumigraph Representations using Meta Learning Project Page | Paper | Data Alexander W. Bergman, Petr Kellnhofer, Gordon Wetzst

Alex 39 Oct 08, 2022
DeepDiffusion: Unsupervised Learning of Retrieval-adapted Representations via Diffusion-based Ranking on Latent Feature Manifold

DeepDiffusion Introduction This repository provides the code of the DeepDiffusion algorithm for unsupervised learning of retrieval-adapted representat

4 Nov 15, 2022
EfficientNetV2 implementation using PyTorch

EfficientNetV2-S implementation using PyTorch Train Steps Configure imagenet path by changing data_dir in train.py python main.py --benchmark for mode

Jahongir Yunusov 86 Dec 29, 2022
GLANet - The code for Global and Local Alignment Networks for Unpaired Image-to-Image Translation arxiv

GLANet The code for Global and Local Alignment Networks for Unpaired Image-to-Image Translation arxiv Framework: visualization results: Getting Starte

stanley 29 Dec 14, 2022
A python module for configuration of block devices

Blivet is a python module for system storage configuration. CI status Licence See COPYING Installation From Fedora repositories Blivet is available in

78 Dec 14, 2022
Rotary Transformer

[中文|English] Rotary Transformer Rotary Transformer is an MLM pre-trained language model with rotary position embedding (RoPE). The RoPE is a relative

325 Jan 03, 2023