PyTorch implementation of MulMON

Overview

MulMON

This repository contains a PyTorch implementation of the paper:
Learning Object-Centric Representations of Multi-object Scenes from Multiple Views

Li Nanbo, Cian Eastwood, Robert B. Fisher
NeurIPS 2020 (Spotlight)

Working examples

Check our video presentation for more: https://youtu.be/Og2ic2L77Pw.

Requirements

Hardware:

  • GPU. Currently, at least one GPU device is required to run this code, however, we will consider adding CPU demo code in the future.
  • Disk space: we do NOT have any hard requirement for the disk space, this is totally data-dependent. To use all the datasets we provide, you will need ~9GB disk space. However, it is not necessary to use all of our datasets (or even our datasets), see Data section for more details.

Python Environement:

  1. We use Anaconda to manage our python environment. Check conda installation guide here: https://docs.anaconda.com/anaconda/install/linux/.

  2. Open a new terminal, direct to the MulMON directory:

cd <YOUR-PATH-TO-MulMON>/MulMON/

create a new conda environment called "mulmon" and then activate it:

conda env create -f ./conda-env-spec.yml  
conda activate mulmon
  1. Install a gpu-supported PyTorch (tested with PyTorch 1.1, 1.2 and 1.7). It is very likely that there exists a PyTorch installer that is compatible with both your CUDA and this code. Go find it on PyTorch official site, and install it with one line of command.

  2. Install additional packages:

pip install tensorboard  
pip install scikit-image

If pytorch <=1.2 is used, you will also need to execute: pip install tensorboardX and import it in the ./trainer/base_trainer.py file. This can be done by commenting the 4th line AND uncommenting the 5th line of that file.

Data

  • Data structure (important):
    We use a data structure as follows:

    <YOUR-PATH>                                          
        ├── ...
        └── mulmon_datasets
              ├── clevr                                   # place your own CLEVR-MV under this directory if you go the fun way
              │    ├── ...
              │    ├── clevr_mv            
              │    │    └── ... (omit)                    # see clevr_<xxx> for subdirectory details
              │    ├── clevr_aug           
              │    │    └── ... (omit)                    # see clevr_<xxx> for subdirectory details
              │    └── clevr_<xxx>
              │         ├── ...
              │         ├── data                          # contains a list of scene files
              │         │    ├── CLEVR_new_#.npy          # one .npy --> one scene sample
              │         │    ├── CLEVR_new_#.npy       
              │         │    └── ...
              │         ├── clevr_<xxx>_train.json        # meta information of the training scenes
              │         └── clevr_<xxx>_test.json         # meta information of the testing scenes  
              └── GQN  
                   ├── ...
                   └── gqn-jaco                 
                        ├── gqn_jaco_train.h5
                        └── gqn_jaco_test.h5
    

    We recommend one to get the necessary data folders ready before downloading/generating the data files:

    mkdir <YOUR-PATH>/mulmon_datasets  
    mkdir <YOUR-PATH>/mulmon_datasets/clevr  
    mkdir <YOUR-PATH>/mulmon_datasets/GQN
    
  • Get Datasets

    • Easy way:
      Download our datasets:

      • clevr_mv.tar.gz and place it under the <YOUR-PATH>/mulmon_datasets/clevr/ directory (~1.8GB when extracted).
      • clevr_aug.tar.gz and place it under the <YOUR-PATH>/mulmon_datasets/clevr/ directory (~3.8GB when extracted).
      • gqn_jaco.tar.gz and place it under the <YOUR-PATH>/mulmon_datasets/GQN/ directory (~3.2GB when extracted).

      and extract them in places. For example, the command for extracting clevr_mv.tar.gz:

      tar -zxvf <YOUR-PATH>/mulmon_datasets/clevr/clevr_mv.tar.gz -C <YOUR-PATH>/mulmon_datasets/clevr/
      

      Note that: 1) we used only a subset of the DeepMind GQN-Jaco dataset, more available at deepmind/gqn-datasets, and 2) the published clevr_aug dataset differs slightly from the CLE-Aug used in the paper---we added more shapes (such as dolphins) into the dataset to make the dataset more interesting (also more complex).

    • Fun way :
      Customise your own multi-view CLEVR data. (available soon...)

Pre-trained models

Download the pretrained models (← click) and place it under `MulMON/', i.e. the root directory of this repository, then extract it by executing: tar -zxvf ./logs.tar.gz. Note that some of them are slightly under-trained, so one could train them further to achieve better results (How to train?).

Usage

Configure data path
To run the code, the data path, i.e. the <YOUR-PATH> in a script, needs to be correctly configured. For example, we store the MulMON dataset folder mulmon_datasets in ../myDatasets/, to train a MulMON on GQN-Jaco dataset using a single GPU, the 4th line of the ./scripts/train_jaco.sh script should look like: data_path=../myDatasets/mulmon_datasets/GQN.

  • Demo (Environment Test)
    Before running the below code, make sure the pretrained models are downloaded and saved first:

    . scripts/demo.sh  
    

    Check ./logs folder for the generated demos.

    • Notes for disentanglement demos: we randomly pick one object for each scene to create the disentanglement demo, so for scene samples where an empty object slot is picked, you won't see any object manipulation effect in the corresponding gifs (especially for the GQN-Jaco scenes). To create a demo like the shown one, one needs to specify (hard-coding) an object slot of interest and traverse informative latent dimensions (as some dimensions are redundant---capture no object property).
  • Train

    • On a single gpu (e.g. using the GQN-Jaco dataset):
    . scripts/train_jaco.sh  
    
    • On multiple GPUs (e.g. using the GQN-Jaco dataset):
    . scripts/train_jaco_parallel.sh  
    
    • To resume training from a stopped session, i.e. saved weights checkpoint-epoch<#number>.pth, simply append a flag --resume_epoch <#number> to one of the flags in the script files.
      For example, to resume previous training (saved as checkpoint-epoch2000.pth) on GQN-Jaco data, we just need to reconfigure the 10th line of the ./scripts/train_jaco.sh as:
      --input_dir ${data_path} --output_dir ${log_path} --resume_epoch 2000 \.
  • Evaluation

    • On a single gpu (e.g. using the Clevr_MV dataset):
    . scripts/eval_clevr.sh  
    
    • Here is a list of imporant evaluation settings which one might wants to play with
      --resume_epoch specify a model to evaluate --test_batch how many batches of test data one uses for evaluation.
      --vis_batch how many batches of output one visualises (save) while evaluation. (note: <= --test_batch)
      --analyse_batch how many batches of latent codes one saves for a post analysis, e.g. disentanglement. (note: <= --test_batch)
      --eval_all (boolean) set True for all [--eval_recon, --eval_seg, --eval_qry_obs, --eval_qry_seg] items, one could also use each of the four independently.
      --eval_dist (boolean) save latent codes for disentanglement analysis. (note: not controlled by --eval_all)
    • For the disentanglement evaluation, run the scripts/eval_clevr.sh script with --eval_dist flag set to True and set the --analyse_batch variable (which controls how many scenes of latent codes one wants to analyse) to be greater than 0. This saves the ouptut latent codes and ground-truth information that allows you to conduct disentanglement quantification using the QEDR framework.
    • You might observe that the evaluation results on the CLE-Aug dataset differ form those on the original paper, this is because the CLE-Aug here is slightly different the one we used for the paper (see more details).

Contact

We constantly respond to the raised ''issues'' in terms of running the code. For further inquiries and discussions (e.g. questions about the paper), email: [email protected].

Cite

Please cite our paper if you find this code useful.

@inproceedings{nanbo2020mulmon,
  title={Learning Object-Centric Representations of Multi-Object Scenes from Multiple Views},
  author={Nanbo, Li and Eastwood, Cian and Fisher, Robert B},
  booktitle={Advances in Neural Information Processing Systems},
  year={2020}
}
Owner
NanboLi
PhD Student, University of Edinburgh
NanboLi
Source code and Dataset creation for the paper "Neural Symbolic Regression That Scales"

NeuralSymbolicRegressionThatScales Pytorch implementation and pretrained models for the paper "Neural Symbolic Regression That Scales", presented at I

35 Nov 25, 2022
"Inductive Entity Representations from Text via Link Prediction" @ The Web Conference 2021

Inductive entity representations from text via link prediction This repository contains the code used for the experiments in the paper "Inductive enti

Daniel Daza 45 Jan 09, 2023
This repository contains the map content ontology used in narrative cartography

Narrative-cartography-ontology This repository contains the map content ontology used in narrative cartography, which is associated with a submission

Weiming Huang 0 Oct 31, 2021
This MVP data web app uses the Streamlit framework and Facebook's Prophet forecasting package to generate a dynamic forecast from your own data.

📈 Automated Time Series Forecasting Background: This MVP data web app uses the Streamlit framework and Facebook's Prophet forecasting package to gene

Zach Renwick 42 Jan 04, 2023
OMAMO: orthology-based model organism selection

OMAMO: orthology-based model organism selection OMAMO is a tool that suggests the best model organism to study a biological process based on orthologo

Dessimoz Lab 5 Apr 22, 2022
DECAF: Deep Extreme Classification with Label Features

DECAF DECAF: Deep Extreme Classification with Label Features @InProceedings{Mittal21, author = "Mittal, A. and Dahiya, K. and Agrawal, S. and Sain

46 Nov 06, 2022
Code for the ICCV2021 paper "Personalized Image Semantic Segmentation"

PSS: Personalized Image Semantic Segmentation Paper PSS: Personalized Image Semantic Segmentation Yu Zhang, Chang-Bin Zhang, Peng-Tao Jiang, Ming-Ming

张宇 15 Jul 09, 2022
Deep Illuminator is a data augmentation tool designed for image relighting. It can be used to easily and efficiently generate a wide range of illumination variants of a single image.

Deep Illuminator Deep Illuminator is a data augmentation tool designed for image relighting. It can be used to easily and efficiently generate a wide

George Chogovadze 52 Nov 29, 2022
Spatial Single-Cell Analysis Toolkit

Single-Cell Image Analysis Package Scimap is a scalable toolkit for analyzing spatial molecular data. The underlying framework is generalizable to spa

Laboratory of Systems Pharmacology @ Harvard 30 Nov 08, 2022
Visual Adversarial Imitation Learning using Variational Models (VMAIL)

Visual Adversarial Imitation Learning using Variational Models (VMAIL) This is the official implementation of the NeurIPS 2021 paper. Project website

14 Nov 18, 2022
Streamlit Tutorial (ex: stock price dashboard, cartoon-stylegan, vqgan-clip, stylemixing, styleclip, sefa)

Streamlit Tutorials Install pip install streamlit Run cd [directory] streamlit run app.py --server.address 0.0.0.0 --server.port [your port] # http:/

Jihye Back 30 Jan 06, 2023
The first dataset of composite images with rationality score indicating whether the object placement in a composite image is reasonable.

Object-Placement-Assessment-Dataset-OPA Object-Placement-Assessment (OPA) is to verify whether a composite image is plausible in terms of the object p

BCMI 53 Nov 15, 2022
Advancing Self-supervised Monocular Depth Learning with Sparse LiDAR

Official implementation for paper "Advancing Self-supervised Monocular Depth Learning with Sparse LiDAR"

Ziyue Feng 72 Dec 09, 2022
Database Reasoning Over Text project for ACL paper

Database Reasoning over Text This repository contains the code for the Database Reasoning Over Text paper, to appear at ACL2021. Work is performed in

Facebook Research 320 Dec 12, 2022
Implementation of the paper All Labels Are Not Created Equal: Enhancing Semi-supervision via Label Grouping and Co-training

SemCo The official pytorch implementation of the paper All Labels Are Not Created Equal: Enhancing Semi-supervision via Label Grouping and Co-training

42 Nov 14, 2022
Efficient Sparse Attacks on Videos using Reinforcement Learning

EARL This repository provides a simple implementation of the work "Efficient Sparse Attacks on Videos using Reinforcement Learning" Example: Demo: Her

12 Dec 05, 2021
CSKG is a commonsense knowledge graph that combines seven popular sources into a consolidated representation

CSKG: The CommonSense Knowledge Graph CSKG is a commonsense knowledge graph that combines seven popular sources into a consolidated representation: AT

USC ISI I2 85 Dec 12, 2022
A fast and easy to use, moddable, Python based Minecraft server!

PyMine PyMine - The fastest, easiest to use, Python-based Minecraft Server! Features Note: This list is not always up to date, and doesn't contain all

PyMine 144 Dec 30, 2022
My freqtrade strategies

My freqtrade-strategies Hi there! This is repo for my freqtrade-strategies. My name is Ilya Zelenchuk, I'm a lecturer at the SPbU university (https://

171 Dec 05, 2022
[CVPR22] Official codebase of Semantic Segmentation by Early Region Proxy.

RegionProxy Figure 2. Performance vs. GFLOPs on ADE20K val split. Semantic Segmentation by Early Region Proxy Yifan Zhang, Bo Pang, Cewu Lu CVPR 2022

Yifan 54 Nov 29, 2022