TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation

Last update: Dec 16, 2022

Overview

Zhaoyun Yin, Pichao Wang, Fan Wang, Xianzhe Xu, Hanling Zhang, Hao Li, Rong Jin

[Preprint]

Getting Started

Create the environment

# create conda env
conda create -n TransFGU python=3.8
# activate conda env
conda activate TransFGU
# install pytorch
conda install pytorch=1.8 torchvision cudatoolkit=10.1
# install other dependencies
pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.8.0/index.html
pip install -r requirements.txt
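
After installation, a quick check that the main packages can be found may save time before training. The snippet below is a minimal sketch, not part of the repository: the function name `find_missing_modules` is hypothetical, and the module list assumes the usual import names (`mmcv-full` is imported as `mmcv`).

```python
import importlib.util

# Assumed import names for the packages installed above
# (the mmcv-full distribution is imported as "mmcv").
REQUIRED_MODULES = ("torch", "torchvision", "mmcv")


def find_missing_modules(modules=REQUIRED_MODULES):
    """Return the top-level module names that cannot be located in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing_modules()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

Run it inside the activated TransFGU environment. It only checks that the modules can be located, not that their versions match the ones pinned above.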

Dataset Preparation

MS-COCO Dataset: Download the training set, validation set, annotations, and the JSON files, then place the extracted files into root/data/MSCOCO.
PascalVOC Dataset: Download the training/validation data, then place the extracted files into root/data/PascalVOC.
Cityscapes Dataset: Download leftImg8bit_trainvaltest.zip and gtFine_trainvaltest.zip, then place the extracted files into root/data/Cityscapes.
LIP Dataset: Download TrainVal_images.zip and TrainVal_parsing_annotations.zip, then place the extracted files into root/data/LIP.

The dataset folders should be structured as follows:

data/
    ├── MSCOCO/
    │     ├── images/
    │     │     ├── train2017/
    │     │     └── val2017/
    │     └── annotations/
    │           ├── train2017/
    │           ├── val2017/
    │           ├── instances_train2017.json
    │           └── instances_val2017.json
    ├── Cityscapes/
    │     ├── leftImg8bit/
    │     │     ├── train/
    │     │     │     ├── aachen/
    │     │     │     └── ...
    │     │     └── val/
    │     │           ├── frankfurt/
    │     │           └── ...
    │     └── gtFine/
    │           ├── train/
    │           │     ├── aachen/
    │           │     └── ...
    │           └── val/
    │                 ├── frankfurt/
    │                 └── ...
    ├── PascalVOC/
    │     ├── JPEGImages/
    │     ├── SegmentationClass/
    │     └── ImageSets/
    │           └── Segmentation/
    │                 ├── train.txt
    │                 └── val.txt
    └── LIP/
          ├── train_images/
          ├── train_segmentations/
          ├── val_images/
          ├── val_segmentations/
          ├── train_id.txt
          └── val_id.txt
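
To catch misplaced files early, the layout above can be checked with a short script. This is a minimal sketch and not part of the repository: the function `missing_dataset_paths` and the `EXPECTED_LAYOUT` table are hypothetical names, and the paths are copied from the tree above (per-city sub-folders of Cityscapes are not checked).

```python
from pathlib import Path

# Relative paths taken from the directory tree in this README.
EXPECTED_LAYOUT = {
    "MSCOCO": [
        "images/train2017",
        "images/val2017",
        "annotations/train2017",
        "annotations/val2017",
        "annotations/instances_train2017.json",
        "annotations/instances_val2017.json",
    ],
    "Cityscapes": [
        "leftImg8bit/train",
        "leftImg8bit/val",
        "gtFine/train",
        "gtFine/val",
    ],
    "PascalVOC": [
        "JPEGImages",
        "SegmentationClass",
        "ImageSets/Segmentation/train.txt",
        "ImageSets/Segmentation/val.txt",
    ],
    "LIP": [
        "train_images",
        "train_segmentations",
        "val_images",
        "val_segmentations",
        "train_id.txt",
        "val_id.txt",
    ],
}


def missing_dataset_paths(data_root, datasets=None):
    """Return expected paths (relative to data_root) that do not exist on disk."""
    root = Path(data_root)
    names = datasets if datasets is not None else list(EXPECTED_LAYOUT)
    missing = []
    for name in names:
        for rel in EXPECTED_LAYOUT[name]:
            if not (root / name / rel).exists():
                missing.append(f"{name}/{rel}")
    return missing
```

For example, `missing_dataset_paths("data", ["PascalVOC"])` returns an empty list when the PascalVOC folder matches the tree, and otherwise lists the entries that are missing. Passing only the datasets you intend to use avoids false alarms for the others.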

Model download

Please download the pretrained DINO model (DeiT-small, 8x8 patches), then place it into root/weight/dino/.
Download the trained models from Google Drive or Baidu Netdisk (code: 1118), then place them into root/weight/trained/.

Name	mIoU	Pixel Accuracy	Model
COCOStuff-27	16.19	44.52	Google Drive
COCOStuff-171	11.93	34.32	Google Drive
COCO-80	12.69	64.31	Google Drive
Cityscapes	16.83	77.92	Google Drive
Pascal-VOC	37.15	83.59	Google Drive
LIP-5	25.16	65.76	Google Drive
LIP-16	15.49	60.08	Google Drive
LIP-19	12.24	42.52	Google Drive
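
For readers unfamiliar with the two metrics in the table, the sketch below shows their standard definitions computed from a confusion matrix. It is an illustration only and not the repository's evaluation code: all function names are hypothetical, and it assumes the predicted labels have already been mapped to ground-truth class indices (an unsupervised method needs such a mapping before scoring, and how this repository performs it is not described here).

```python
def confusion_matrix(gt, pred, num_classes):
    """Count (ground-truth, predicted) label pairs over flat sequences of pixel labels."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for g, p in zip(gt, pred):
        matrix[g][p] += 1
    return matrix


def pixel_accuracy(matrix):
    """Fraction of pixels whose predicted label equals the ground-truth label."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def mean_iou(matrix):
    """Mean over classes of TP / (TP + FP + FN), skipping classes with no pixels at all."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                        # class-c pixels predicted as another class
        fp = sum(matrix[r][c] for r in range(n)) - tp   # other pixels predicted as class c
        denom = tp + fp + fn
        if denom:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Toy example with two classes and four pixels.
    m = confusion_matrix(gt=[0, 0, 1, 1], pred=[0, 1, 1, 1], num_classes=2)
    print(pixel_accuracy(m), mean_iou(m))
```

Both functions return fractions in [0, 1]. The values in the table appear to be the same quantities expressed as percentages.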

Train and Evaluate Our Method

To train and evaluate our method on different datasets at the desired granularity level, please follow the instructions here.

Citation

If you find our work useful in your research, please consider citing:

@article{yin2021transfgu,
  title={TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation},
  author={Yin, Zhaoyun and Wang, Pichao and Wang, Fan and Xu, Xianzhe and Zhang, Hanling and Li, Hao and Jin, Rong},
  journal={arXiv preprint arXiv:2112.01515},
  year={2021}
}

LICENSE

The code is released under the MIT license.

Owner

DamoCV