ShapeGlot: Learning Language for Shape Differentiation

Overview

ShapeGlot: Learning Language for Shape Differentiation

Created by Panos Achlioptas, Judy Fan, Robert X.D. Hawkins, Noah D. Goodman, Leonidas J. Guibas.

representative

Introduction

This work is based on our ICCV-2019 paper. There, we proposed speaker & listener neural models that reason and differentiate objects according to their shape via language (hence the term shape--glot). These models can operate on 2D images and/or 3D point-clouds and do learn about natural properties of shapes, including the part-based compositionality of 3D objects, from language alone. The latter fact, makes them remarkably robust, enabling a plethora of zero-shot-transfer learning applications. You can check our project's webpage for a quick introduction and produced results.

Dependencies

Main Requirements:

Our code has been tested with Python 3.6.9, Pytorch 1.3.1, CUDA 10.0 on Ubuntu 14.04.

Installation

Clone the source code of this repository and pip install it inside your (virtual) environment.

git clone https://github.com/optas/shapeglot
cd shapeglot
pip install -e .

Data Set

We provide 78,782 utterances referring to a ShapeNet chair that was contrasted against two distractor chairs via the reference game described in our accompanying paper (dataset termed as ChairsInContext). We further provide the data used in the Zero-Shot experiments which include 300 images of real-world chairs, and 1200 referential utterances for ShapeNet lamps & tables & sofas, and 400 utterances describing ModelNet beds. Last, we include image-based (VGG-16) and point-cloud-based (PC-AE) pretrained features for all ShapeNet chairs to facilitate the training of the neural speakers and listeners.

To download the data (~232 MB) please run the following commands. Notice, that you first need to accept the Terms Of Use here. Upon review we will email to you the necessary link that you need to put inside the desingated location of the download_data.sh file.

cd shapeglot/
./download_data.sh

The downloaded data will be stored in shapeglot/data

Usage

To easily expose the main functionalities of our paper, we prepared some simple, instructional notebooks.

  1. To tokenize, prepare and visualize the chairsInContext dataset, please look/run:
    shapeglot/notebooks/prepare_chairs_in_context_data.ipynb
  1. To train a neural listener (only ~10 minutes on a single modern GPU):
    shapeglot/notebooks/train_listener.ipynb

Note: This repo contains limited functionality compared to what was presented in the paper. This is because our original (much heavier) implementation is in low-level TensorFlow and python 2.7. If you need more functionality (e.g. pragmatic-speakers) and you are OK with Tensorflow, please email [email protected] .

Citation

If you find our work useful in your research, please consider citing:

@article{shapeglot,
  title={ShapeGlot: Learning Language for Shape Differentiation},
  author={Achlioptas, Panos and Fan, Judy and Hawkins, Robert X. D. and Goodman, Noah D. and Guibas, Leonidas J.},
  journal={CoRR},
  volume={abs/1905.02925},
  year={2019}
}

License

This provided code is licensed under the terms of the MIT license (see LICENSE for details).

Owner
Eth brownie struct encoding example

eth-brownie struct encoding example Overview This repository contains an example of encoding a struct, so that it can be used in a function call, usin

Ittai Svidler 2 Mar 04, 2022
LWCC: A LightWeight Crowd Counting library for Python that includes several pretrained state-of-the-art models.

LWCC: A LightWeight Crowd Counting library for Python LWCC is a lightweight crowd counting framework for Python. It wraps four state-of-the-art models

Matija Teršek 39 Dec 28, 2022
Demo for Real-time RGBD-based Extended Body Pose Estimation paper

Real-time RGBD-based Extended Body Pose Estimation This repository is a real-time demo for our paper that was published at WACV 2021 conference The ou

Renat Bashirov 118 Dec 26, 2022
This repository contains the code and models necessary to replicate the results of paper: How to Robustify Black-Box ML Models? A Zeroth-Order Optimization Perspective

Black-Box-Defense This repository contains the code and models necessary to replicate the results of our recent paper: How to Robustify Black-Box ML M

OPTML Group 2 Oct 05, 2022
CVNets: A library for training computer vision networks

CVNets: A library for training computer vision networks This repository contains the source code for training computer vision models. Specifically, it

Apple 1.1k Jan 03, 2023
This is a repository for a Semantic Segmentation inference API using the Gluoncv CV toolkit

BMW Semantic Segmentation GPU/CPU Inference API This is a repository for a Semantic Segmentation inference API using the Gluoncv CV toolkit. The train

BMW TechOffice MUNICH 56 Nov 24, 2022
Yas CRNN model training - Yet Another Genshin Impact Scanner

Yas-Train Yet Another Genshin Impact Scanner 又一个原神圣遗物导出器 介绍 该仓库为 Yas 的模型训练程序 相关资料 MobileNetV3 CRNN 使用 假设你会设置基本的pytorch环境。 生成数据集 python main.py gen 训练

wormtql 18 Jan 08, 2023
Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners

DART Implementation for ICLR2022 paper Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners. Environment

ZJUNLP 83 Dec 27, 2022
Reimplementation of Dynamic Multi-scale filters for Semantic Segmentation.

Paddle implementation of Dynamic Multi-scale filters for Semantic Segmentation.

Hongqiang.Wang 2 Nov 01, 2021
MQBench Quantization Aware Training with PyTorch

MQBench Quantization Aware Training with PyTorch I am using MQBench(Model Quantization Benchmark)(http://mqbench.tech/) to quantize the model for depl

Ling Zhang 29 Nov 18, 2022
A Python package for generating concise, high-quality summaries of a probability distribution

GoodPoints A Python package for generating concise, high-quality summaries of a probability distribution GoodPoints is a collection of tools for compr

Microsoft 28 Oct 10, 2022
Fuzzer for Linux Kernel Drivers

difuze: Fuzzer for Linux Kernel Drivers This repo contains all the sources (including setup scripts), you need to get difuze up and running. Tested on

seclab 344 Dec 27, 2022
UniLM AI - Large-scale Self-supervised Pre-training across Tasks, Languages, and Modalities

Pre-trained (foundation) models across tasks (understanding, generation and translation), languages (100+ languages), and modalities (language, image, audio, vision + language, audio + language, etc.

Microsoft 7.6k Jan 01, 2023
Testing and Estimation of structural breaks in Stata

xtbreak estimating and testing for many known and unknown structural breaks in time series and panel data. For an overview of xtbreak test see xtbreak

Jan Ditzen 13 Jun 19, 2022
Classifies galaxy morphology with Bayesian CNN

Zoobot Zoobot classifies galaxy morphology with deep learning. This code will let you: Reproduce and improve the Galaxy Zoo DECaLS automated classific

Mike Walmsley 39 Dec 20, 2022
Vision-Language Transformer and Query Generation for Referring Segmentation (ICCV 2021)

Vision-Language Transformer and Query Generation for Referring Segmentation Please consider citing our paper in your publications if the project helps

Henghui Ding 143 Dec 23, 2022
Fairness Metrics: All you need to know

Fairness Metrics: All you need to know Testing machine learning software for ethical bias has become a pressing current concern. Recent research has p

Anonymous2020 1 Jan 17, 2022
FID calculation with proper image resizing and quantization steps

clean-fid: Fixing Inconsistencies in FID Project | Paper The FID calculation involves many steps that can produce inconsistencies in the final metric.

Gaurav Parmar 606 Jan 06, 2023
University of Rochester 2021 Summer REU focusing on music sentiment transfer using CycleGAN

Music-Sentiment-Transfer University of Rochester 2021 Summer REU focusing on music sentiment transfer using CycleGAN Poster: Music Sentiment Transfer

Miles Sigel 2 Jan 24, 2022
Code for CVPR2019 paper《Unequal Training for Deep Face Recognition with Long Tailed Noisy Data》

Unequal-Training-for-Deep-Face-Recognition-with-Long-Tailed-Noisy-Data. This is the code of CVPR 2019 paper《Unequal Training for Deep Face Recognition

Zhong Yaoyao 68 Jan 07, 2023