This repository contains the code used in the paper "Prompt-Based Multi-Modal Image Segmentation".

Related tags

Deep Learningclipseg
Overview

Prompt-Based Multi-Modal Image Segmentation

This repository contains the code used in the paper "Prompt-Based Multi-Modal Image Segmentation".

drawing

The systems allows to create segmentation models without training based on:

  • An arbitrary text query
  • Or an image with a mask highlighting stuff or an object.

Quick Start

In the Quickstart.ipynb notebook we provide the code for using a pre-trained CLIPSeg model. It can also be used interactively using MyBinder (please note that the VM does not use a GPU, thus inference takes a few seconds).

Dependencies

This code base depends on pytorch, torchvision and clip (pip install git+https://github.com/openai/CLIP.git). Additional dependencies are hidden for double blind review.

Datasets

  • PhraseCut and PhraseCutPlus: Referring expression dataset
  • PFEPascalWrapper: Wrapper class for PFENet's Pascal-5i implementation
  • PascalZeroShot: Wrapper class for PascalZeroShot
  • COCOWrapper: Wrapper class for COCO.

Models

  • CLIPDensePredT: CLIPSeg model with transformer-based decoder.
  • ViTDensePredT: CLIPSeg model with transformer-based decoder.

Third Party Dependencies

For some of the datasets third party dependencies are required. Run the following commands in the third_party folder.

git clone https://github.com/cvlab-yonsei/JoEm
git clone https://github.com/Jia-Research-Lab/PFENet.git
git clone https://github.com/ChenyunWu/PhraseCutDataset.git
git clone https://github.com/juhongm999/hsnet.git

Weights

Training

See the experiment folder for yaml definitions of the training configurations. The training code is in experiment_setup.py.

Usage of PFENet Wrappers

In order to use the dataset and model wrappers for PFENet, the PFENet repository needs to be cloned to the root folder. git clone https://github.com/Jia-Research-Lab/PFENet.git

Citation

@article{lueddecke21
    title={Prompt-Based Multi-Modal Image Segmentation},
    author={Timo Lüddecke and Alexander Ecker},
    journal={arXiv preprint arXiv:2112.10003},
    year={2021}
}
Owner
Timo Lüddecke
Postdoc @ecker-lab
Timo Lüddecke
PantheonRL is a package for training and testing multi-agent reinforcement learning environments.

PantheonRL is a package for training and testing multi-agent reinforcement learning environments. PantheonRL supports cross-play, fine-tuning, ad-hoc coordination, and more.

Stanford Intelligent and Interactive Autonomous Systems Group 57 Dec 28, 2022
METER: Multimodal End-to-end TransformER

METER Code and pre-trained models will be publicized soon. Citation @article{dou2021meter, title={An Empirical Study of Training End-to-End Vision-a

Zi-Yi Dou 257 Jan 06, 2023
NLMpy - A Python package to create neutral landscape models

NLMpy is a Python package for the creation of neutral landscape models that are widely used by landscape ecologists to model ecological patterns

Manaaki Whenua – Landcare Research 1 Oct 08, 2022
Data Preparation, Processing, and Visualization for MoVi Data

MoVi-Toolbox Data Preparation, Processing, and Visualization for MoVi Data, https://www.biomotionlab.ca/movi/ MoVi is a large multipurpose dataset of

Saeed Ghorbani 51 Nov 27, 2022
Pre-Training 3D Point Cloud Transformers with Masked Point Modeling

Point-BERT: Pre-Training 3D Point Cloud Transformers with Masked Point Modeling Created by Xumin Yu*, Lulu Tang*, Yongming Rao*, Tiejun Huang, Jie Zho

Lulu Tang 306 Jan 06, 2023
Neural machine translation between the writings of Shakespeare and modern English using TensorFlow

Shakespeare translations using TensorFlow This is an example of using the new Google's TensorFlow library on monolingual translation going from modern

Motoki Wu 245 Dec 28, 2022
Summary Explorer is a tool to visually explore the state-of-the-art in text summarization.

Summary Explorer Summary Explorer is a tool to visually inspect the summaries from several state-of-the-art neural summarization models across multipl

Webis 42 Aug 14, 2022
🔅 Shapash makes Machine Learning models transparent and understandable by everyone

🎉 What's new ? Version New Feature Description Tutorial 1.6.x Explainability Quality Metrics To help increase confidence in explainability methods, y

MAIF 2.1k Dec 27, 2022
Parametric Contrastive Learning (ICCV2021)

Parametric-Contrastive-Learning This repository contains the implementation code for ICCV2021 paper: Parametric Contrastive Learning (https://arxiv.or

DV Lab 156 Dec 21, 2022
[NeurIPS 2020] This project provides a strong single-stage baseline for Long-Tailed Classification, Detection, and Instance Segmentation (LVIS).

A Strong Single-Stage Baseline for Long-Tailed Problems This project provides a strong single-stage baseline for Long-Tailed Classification (under Ima

Kaihua Tang 514 Dec 23, 2022
Testbed of AI Systems Quality Management

qunomon Description A testbed for testing and managing AI system qualities. Demo Sorry. Not deployment public server at alpha version. Requirement Ins

AIST AIRC 15 Nov 27, 2021
FEDn is an open-source, modular and ML-framework agnostic framework for Federated Machine Learning

FEDn is an open-source, modular and ML-framework agnostic framework for Federated Machine Learning (FedML) developed and maintained by Scaleout Systems. FEDn enables highly scalable cross-silo and cr

Scaleout 75 Nov 09, 2022
Chinese named entity recognization with BiLSTM using Keras

Chinese named entity recognization (Bilstm with Keras) Project Structure ./ ├── README.md ├── data │   ├── README.md │   ├── data 数据集 │   │   ├─

1 Dec 17, 2021
The pytorch implementation of DG-Font: Deformable Generative Networks for Unsupervised Font Generation

DG-Font: Deformable Generative Networks for Unsupervised Font Generation The source code for 'DG-Font: Deformable Generative Networks for Unsupervised

130 Dec 05, 2022
Anomaly detection in multi-agent trajectories: Code for training, evaluation and the OpenAI highway simulation.

Anomaly Detection in Multi-Agent Trajectories for Automated Driving This is the official project page including the paper, code, simulation, baseline

12 Dec 02, 2022
2021:"Bridging Global Context Interactions for High-Fidelity Image Completion"

TFill arXiv | Project This repository implements the training, testing and editing tools for "Bridging Global Context Interactions for High-Fidelity I

Chuanxia Zheng 111 Jan 08, 2023
TLDR: Twin Learning for Dimensionality Reduction

TLDR (Twin Learning for Dimensionality Reduction) is an unsupervised dimensionality reduction method that combines neighborhood embedding learning with the simplicity and effectiveness of recent self

NAVER 105 Dec 28, 2022
YOLOv4-v3 Training Automation API for Linux

This repository allows you to get started with training a state-of-the-art Deep Learning model with little to no configuration needed! You provide your labeled dataset or label your dataset using our

BMW TechOffice MUNICH 626 Dec 31, 2022
Global-Local Attention for Emotion Recognition

Global-Local Attention for Emotion Recognition Requirements Python 3 Install tensorflow (or tensorflow-gpu) = 2.0.0 Install some other packages pip i

Minh Nhat Le 15 Apr 21, 2022
Toward Spatially Unbiased Generative Models (ICCV 2021)

Toward Spatially Unbiased Generative Models Implementation of Toward Spatially Unbiased Generative Models (ICCV 2021) Overview Recent image generation

Jooyoung Choi 88 Dec 01, 2022