TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation

Last update: Dec 16, 2022

Overview

Zhaoyun Yin, Pichao Wang, Fan Wang, Xianzhe Xu, Hanling Zhang, Hao Li, Rong Jin

[Preprint]

Getting Started

Create the environment

# create conda env
conda create -n TransFGU python=3.8
# activate conda env
conda activate TransFGU
# install pytorch
conda install pytorch=1.8 torchvision cudatoolkit=10.1
# install other dependencies
pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.8.0/index.html
pip install -r requirements.txt
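
After installation, it can help to confirm which versions actually ended up in the environment. The following is a minimal sketch, not part of the repository: the helper name `report_versions` is hypothetical, and the distribution names (`torch`, `torchvision`, `mmcv-full`) are assumed from the install commands above.

```python
from importlib import metadata

# Distribution names assumed from the install commands above.
EXPECTED = ("torch", "torchvision", "mmcv-full")


def report_versions(packages=EXPECTED):
    """Return {distribution name: installed version, or None if missing}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in the active environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Run it inside the activated TransFGU environment; a package reported as not installed suggests that the corresponding install step did not complete.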

Dataset Preparation

MS-COCO Dataset: Download the training set, validation set, annotations, and JSON files, then place the extracted files into root/data/MSCOCO.
PascalVOC Dataset: Download the training/validation data, then place the extracted files into root/data/PascalVOC.
Cityscapes Dataset: Download leftImg8bit_trainvaltest.zip and gtFine_trainvaltest.zip, then place the extracted files into root/data/Cityscapes.
LIP Dataset: Download TrainVal_images.zip and TrainVal_parsing_annotations.zip, then place the extracted files into root/data/LIP.

The dataset folders should be structured as follows:

data/
    │── MSCOCO/
    │     ├── images/
    │     │     ├── train2017/
    │     │     └── val2017/
    │     └── annotations/
    │           ├── train2017/
    │           ├── val2017/
    │           ├── instances_train2017.json
    │           └── instances_val2017.json
    │── Cityscapes/
    │     ├── leftImg8bit/
    │     │     ├── train/
    │     │     │       ├── aachen
    │     │     │       └── ...
    │     │     └──── val/
    │     │             ├── frankfurt
    │     │             └── ...
    │     └── gtFine/
    │           ├── train/
    │           │       ├── aachen
    │           │       └── ...
    │           └──── val/
    │                   ├── frankfurt
    │                   └── ...
    │── PascalVOC/
    │     ├── JPEGImages/
    │     ├── SegmentationClass/
    │     └── ImageSets/
    │           └── Segmentation/
    │                   ├── train.txt
    │                   └── val.txt
    └── LIP/
          ├── train_images/
          ├── train_segmentations/
          ├── val_images/
          ├── val_segmentations/
          ├── train_id.txt
          └── val_id.txt
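
The layout above can be checked before training with a short script. This is a minimal sketch under stated assumptions: the helper `find_missing` and the `EXPECTED_LAYOUT` table are hypothetical names, not part of the repository, and the relative paths are copied from the tree above. It only tests that each path exists and does not inspect file contents.

```python
from pathlib import Path

# Relative paths copied from the directory tree above.
EXPECTED_LAYOUT = {
    "MSCOCO": [
        "images/train2017", "images/val2017",
        "annotations/train2017", "annotations/val2017",
        "annotations/instances_train2017.json",
        "annotations/instances_val2017.json",
    ],
    "Cityscapes": [
        "leftImg8bit/train", "leftImg8bit/val",
        "gtFine/train", "gtFine/val",
    ],
    "PascalVOC": [
        "JPEGImages", "SegmentationClass",
        "ImageSets/Segmentation/train.txt",
        "ImageSets/Segmentation/val.txt",
    ],
    "LIP": [
        "train_images", "train_segmentations",
        "val_images", "val_segmentations",
        "train_id.txt", "val_id.txt",
    ],
}


def find_missing(data_root, datasets=None):
    """Return a sorted list of expected paths that do not exist under data_root."""
    root = Path(data_root)
    names = datasets if datasets is not None else EXPECTED_LAYOUT.keys()
    missing = []
    for name in names:
        for rel in EXPECTED_LAYOUT[name]:
            if not (root / name / rel).exists():
                missing.append(f"{name}/{rel}")
    return sorted(missing)


if __name__ == "__main__":
    # Check only the datasets you have downloaded, e.g. PascalVOC.
    for path in find_missing("data", ["PascalVOC"]):
        print("missing:", path)
```

Passing a list of dataset names restricts the check to the datasets you actually plan to use, so a missing LIP folder does not produce noise when you only train on PascalVOC.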

Model download

Download the pretrained DINO model (DeiT-small, 8x8), then place it into root/weight/dino/.
Download the trained models from Google Drive or Baidu Netdisk (code: 1118), then place them into root/weight/trained/.

Name	mIoU	Pixel Accuracy	Model
COCOStuff-27	16.19	44.52	Google Drive
COCOStuff-171	11.93	34.32	Google Drive
COCO-80	12.69	64.31	Google Drive
Cityscapes	16.83	77.92	Google Drive
Pascal-VOC	37.15	83.59	Google Drive
LIP-5	25.16	65.76	Google Drive
LIP-16	15.49	60.08	Google Drive
LIP-19	12.24	42.52	Google Drive
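
For readers unfamiliar with the two metrics in the table, the sketch below computes pixel accuracy and mean IoU from flat lists of predicted and ground-truth class indices, using the standard textbook definitions. It is an illustration only: the function names are hypothetical, and the repository's own evaluation may differ (for example in how predicted clusters are matched to ground-truth classes or how ignored pixels are handled). The functions return fractions in [0, 1].

```python
def confusion_matrix(pred, target, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        matrix[t][p] += 1
    return matrix


def pixel_accuracy(matrix):
    """Fraction of pixels whose predicted class equals the ground truth."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def mean_iou(matrix):
    """Average of TP / (TP + FP + FN) over classes that appear at all."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                       # class c pixels predicted as other classes
        fp = sum(matrix[r][c] for r in range(n)) - tp  # other pixels predicted as class c
        denom = tp + fp + fn
        if denom:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    target = [0, 0, 1, 1]
    pred = [0, 1, 1, 1]
    m = confusion_matrix(pred, target, num_classes=2)
    print("pixel accuracy:", pixel_accuracy(m))
    print("mean IoU:", mean_iou(m))
```

In the toy example, three of four pixels are correct, so pixel accuracy is 0.75; the per-class IoUs are 1/2 and 2/3, which average to 7/12. The gap between the two numbers shows why a dataset can have high pixel accuracy and much lower mIoU when a few large classes dominate.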

Train and Evaluate Our Method

To train and evaluate our method on different datasets at the desired granularity level, please follow the instructions here.

Citation

If you find our work useful in your research, please consider citing:

@article{yin2021transfgu,
  title={TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation},
  author={Yin, Zhaoyun and Wang, Pichao and Wang, Fan and Xu, Xianzhe and Zhang, Hanling and Li, Hao and Jin, Rong},
  journal={arXiv preprint arXiv:2112.01515},
  year={2021}
}

LICENSE

The code is released under the MIT license.

Copyright

Owner

DamoCV
