This repo contains the PyTorch implementation of the Dynamic Concept Learner (DCL), accepted at ICLR 2021.

Overview

DCL-PyTorch

PyTorch implementation of the Dynamic Concept Learner (DCL). More details can be found on the project page.

Framework

Grounding Physical Concepts of Objects and Events Through Dynamic Visual Reasoning
Zhenfang Chen, Jiayuan Mao, Jiajun Wu, Kwan-Yee K. Wong, Joshua B. Tenenbaum, and Chuang Gan

Prerequisites

  • Python 3
  • PyTorch 1.0 or higher, with NVIDIA CUDA Support
  • Other required Python packages, as listed in requirements.txt (see Installation).

Installation

Install Jacinle: Clone the package, and add the bin path to your global PATH environment variable:

git clone https://github.com/vacancy/Jacinle --recursive
export PATH=<path_to_jacinle>/bin:$PATH
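
To confirm that the PATH change took effect, a check along the following lines can help. This is only a minimal sketch: `missing_tools` is a hypothetical helper that is not part of Jacinle or this repo, and it assumes that `jac-crun` (the launcher used by the evaluation commands later in this README) should be reachable once Jacinle's bin directory is on PATH.

```python
import shutil


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on the current PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    # `jac-crun` is used by the grounding and retrieval evaluation commands.
    missing = missing_tools(["jac-crun"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("Jacinle command-line tools are on PATH.")
```

If the tool is reported as missing, the export line above probably needs to be re-run in the current shell or added to the shell start-up file.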

Clone this repository:

git clone https://github.com/zfchenUnique/DCL-Release.git --recursive

Create a conda environment for NS-CL and install the requirements. This covers the Python packages required by both Jacinle and NS-CL; most of them are already included in the standard Anaconda distribution.

Dataset preparation

  • Download the videos, video annotations, questions and answers, and object proposals from the official website.
  • Transform videos into ".png" frames with ffmpeg.
  • Organize the data as shown below.
    clevrer
    ├── annotation_00000-01000
    │   ├── annotation_00000.json
    │   ├── annotation_00001.json
    │   └── ...
    ├── ...
    ├── image_00000-01000
    │   │   ├── 1.png
    │   │   ├── 2.png
    │   │   └── ...
    │   └── ...
    ├── ...
    ├── questions
    │   ├── train.json
    │   ├── validation.json
    │   └── test.json
    ├── proposals
    │   ├── proposal_00000.json
    │   ├── proposal_00001.json
    │   └── ...
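
The frame extraction and the layout check can be scripted roughly as follows. This is a sketch under stated assumptions rather than a script shipped with this repo: `ffmpeg_frame_command` and `missing_entries` are hypothetical helpers, the video and frame paths in the usage section are placeholders, and only the `questions` files and the `proposals` directory from the tree above are checked.

```python
from pathlib import Path

# File names taken from the directory layout above.
QUESTION_FILES = ("train.json", "validation.json", "test.json")


def ffmpeg_frame_command(video_path, frame_dir):
    """Build an ffmpeg argument list that writes frames as 1.png, 2.png, ..."""
    # "%d.png" is ffmpeg's numbered-output pattern; numbering starts at 1 by default.
    return ["ffmpeg", "-i", str(Path(video_path)), str(Path(frame_dir) / "%d.png")]


def missing_entries(root):
    """List expected files and directories that are absent under the dataset root."""
    root = Path(root)
    expected = [root / "questions" / name for name in QUESTION_FILES]
    expected.append(root / "proposals")
    return [str(path.relative_to(root)) for path in expected if not path.exists()]


if __name__ == "__main__":
    # Placeholder paths (assumptions); adjust to where the data actually lives.
    print(" ".join(ffmpeg_frame_command("some_video.mp4", "some_frame_dir")))
    print("Missing under ./clevrer:", missing_entries("clevrer"))
```

The command list can be passed to `subprocess.run` once the output directory exists; ffmpeg does not create missing directories itself.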
    

Fast Evaluation

    git clone https://github.com/zfchenUnique/clevrer_dynamic_propnet.git
    cd clevrer_dynamic_propnet
    sh ./scripts/eval_fast_release_v2.sh 0
   sh scripts/script_test_prp_clevrer_qa.sh 0

Step-by-step Training

  • Step 1: download the proposals from the region proposal network and extract object trajectories for the train and val sets by
   sh scripts/script_gen_tubes.sh
  • Step 2: train a concept learner with descriptive and explanatory questions for static concepts (i.e. color, shape and material)
   sh scripts/script_train_dcl_stage1.sh 0
  • Step 3: extract static attributes and refine object trajectories
    Extract static attributes:
   sh scripts/script_extract_attribute.sh
    Refine object trajectories:
   sh scripts/script_gen_tubes_refine.sh
  • Step 4: extract predictive and counterfactual scenes by
    cd clevrer_dynamic_propnet
    sh ./scripts/train_tube_box_only.sh # train
    sh ./scripts/train_tube.sh # train
    sh ./scripts/eval_fast_release_v2.sh 0 # val
  • Step 5: train DCL with all questions and the refined trajectories
   sh scripts/script_train_dcl_stage2.sh 0
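
Because every stage depends on the output of the previous one, it can be convenient to drive the steps above from a single script. The sketch below is not part of this repo: `training_commands` and `run_all` are hypothetical helpers, the trailing `0` is assumed to be a GPU index, and `clevrer_dynamic_propnet` is assumed to be cloned inside the directory the script is run from.

```python
import subprocess

# Assumed location of the dynamics-model repo, relative to the working directory.
PROPNET_DIR = "clevrer_dynamic_propnet"


def training_commands(gpu="0"):
    """Return (working_dir, argv) pairs for Steps 1-5 in the order listed above."""
    return [
        (".", ["sh", "scripts/script_gen_tubes.sh"]),                     # Step 1
        (".", ["sh", "scripts/script_train_dcl_stage1.sh", gpu]),         # Step 2
        (".", ["sh", "scripts/script_extract_attribute.sh"]),             # Step 3
        (".", ["sh", "scripts/script_gen_tubes_refine.sh"]),              # Step 3
        (PROPNET_DIR, ["sh", "./scripts/train_tube_box_only.sh"]),        # Step 4
        (PROPNET_DIR, ["sh", "./scripts/train_tube.sh"]),                 # Step 4
        (PROPNET_DIR, ["sh", "./scripts/eval_fast_release_v2.sh", gpu]),  # Step 4
        (".", ["sh", "scripts/script_train_dcl_stage2.sh", gpu]),         # Step 5
    ]


def run_all(gpu="0", dry_run=True):
    """Print each command; also execute it when dry_run is False."""
    for cwd, argv in training_commands(gpu):
        print(f"[{cwd}] {' '.join(argv)}")
        if not dry_run:
            # check=True stops the pipeline at the first failing stage.
            subprocess.run(argv, cwd=cwd, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```

With `dry_run=True` the script only prints the commands, which is a cheap way to review the order before committing to a long training run.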

Generalization to CLEVRER-Grounding

    sh ./scripts/script_grounding.sh 0
    jac-crun 0 scripts/script_evaluate_grounding.py

Generalization to CLEVRER-Retrieval

    sh ./scripts/script_retrieval.sh 0
    jac-crun 0 scripts/script_evaluate_retrieval.py

Extension to Tower Blocks

    sh ./scripts/script_train_blocks.sh 0
  • Step 3: download the pretrained model from Google Drive and evaluate on Tower Blocks QA
    sh ./scripts/script_eval_blocks.sh 0

Citation

If you find this repo useful in your research, please consider citing:

@inproceedings{zfchen2021iclr,
    title={Grounding Physical Concepts of Objects and Events Through Dynamic Visual Reasoning},
    author={Chen, Zhenfang and Mao, Jiayuan and Wu, Jiajun and Wong, Kwan-Yee K and Tenenbaum, Joshua B. and Gan, Chuang},
    booktitle={International Conference on Learning Representations},
    year={2021}
    }