Segmentation in Style: Unsupervised Semantic Image Segmentation with Stylegan and CLIP

Last update: Dec 19, 2022

Overview

Abstract: We introduce a method that automatically segments images into semantically meaningful regions without human supervision. The derived regions are consistent across different images and coincide with human-defined semantic classes on some datasets. In cases where semantic regions might be hard for humans to define and label consistently, our method is still able to find meaningful and consistent semantic classes. In our work, we use a pretrained StyleGAN2 generative model: clustering in the feature space of the generative model makes it possible to discover semantic classes. Once the classes are discovered, a synthetic dataset of generated images and corresponding segmentation masks can be created. A segmentation model is then trained on the synthetic dataset and is able to generalize to real images. Additionally, by using CLIP we are able to use prompts written in natural language to discover some desired semantic classes. We test our method on publicly available datasets and show state-of-the-art results.
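
The clustering step described in the abstract can be illustrated with a minimal sketch. This is not the repository's code: a synthetic NumPy array stands in for StyleGAN2 feature maps, plain k-means is assumed as the clustering algorithm, and the names kmeans and cluster_feature_maps are hypothetical. Pixels from all images are pooled before clustering, so the same cluster id refers to the same region type in every image.

```python
import numpy as np


def kmeans(points, k, n_iters=20, seed=0):
    """Plain Lloyd's k-means on an (M, C) array; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Initialise the centroids from k distinct data points.
    centroids = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(n_iters):
        # Squared Euclidean distance from every point to every centroid: (M, k).
        dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    # Final assignment against the last centroid update.
    dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, dists.argmin(axis=1)


def cluster_feature_maps(features, k, seed=0):
    """Cluster per-pixel features of shape (N, C, H, W) into masks (N, H, W)."""
    n, c, h, w = features.shape
    # One C-dimensional vector per pixel, pooled over all images so that
    # cluster ids are shared across the whole batch.
    pixels = features.transpose(0, 2, 3, 1).reshape(-1, c)
    centroids, labels = kmeans(pixels, k, seed=seed)
    return centroids, labels.reshape(n, h, w)


# Stand-in for generator activations (an assumption for this sketch): the left
# and right halves of every "image" have clearly different feature statistics.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 16, 8, 8))
features[:, :, :, :4] += 5.0
features[:, :, :, 4:] -= 5.0

centroids, masks = cluster_feature_maps(features, k=2)
print(masks[0])
```

In a real pipeline the features array would come from an intermediate layer of the pretrained generator, and each generated image paired with its mask would form one sample of the synthetic training set.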

This repository contains the official PyTorch implementation of the following paper:

Segmentation in Style: Unsupervised Semantic Image Segmentation with Stylegan and CLIP
Daniil Pakhomov, Sanchit Hira, Narayani Wagle, Kemar E. Green, Nassir Navab
https://arxiv.org/abs/2107.12518

Owner

Daniil Pakhomov
