The official homepage of the (outdated) COCO-Stuff 10K dataset.

COCO-Stuff 10K dataset v1.1 (outdated)

Holger Caesar, Jasper Uijlings, Vittorio Ferrari

Overview

[Figure: COCO-Stuff example annotations]

Welcome to the official homepage of the COCO-Stuff [1] dataset. COCO-Stuff augments the popular COCO [2] dataset with pixel-level stuff annotations. These annotations can be used for scene understanding tasks such as semantic segmentation, object detection and image captioning.

Highlights

  • 10,000 complex images from COCO [2]
  • Dense pixel-level annotations
  • 91 thing and 91 stuff classes
  • Instance-level annotations for things from COCO [2]
  • Complex spatial context between stuff and things
  • 5 captions per image from COCO [2]

Updates

  • 11 Jul 2017: Added working Deeplab models for Resnet and VGG
  • 06 Apr 2017: Dataset version 1.1: Modified label indices
  • 31 Mar 2017: Published annotations in JSON format
  • 09 Mar 2017: Added label hierarchy scripts
  • 08 Mar 2017: Corrections to table 2 in arXiv paper [1]
  • 10 Feb 2017: Added script to extract SLICO superpixels in annotation tool
  • 12 Dec 2016: Dataset version 1.0 and arXiv paper [1] released

Results

The current release of COCO-Stuff 10K includes both the training and test annotations, and users report their performance individually. We invite users to report their results to us to complement this table. In the near future we will extend COCO-Stuff to all images in COCO and organize an official challenge where the test annotations will be known only to the organizers.

For the updated table please click here.

| Method | Source | Class-average accuracy | Global accuracy | Mean IOU | FW IOU |
|---|---|---|---|---|---|
| FCN-16s [3] | [1] | 34.0% | 52.0% | 22.7% | - |
| Deeplab VGG-16 (no CRF) [4] | [1] | 38.1% | 57.8% | 26.9% | - |
| FCN-8s [3] | [6] | 38.5% | 60.4% | 27.2% | - |
| DAG-RNN + CRF [6] | [6] | 42.8% | 63.0% | 31.2% | - |
| OHE + DC + FCN+ [5] | [5] | 45.8% | 66.6% | 34.3% | 51.2% |
| Deeplab ResNet (no CRF) [4] | - | 45.5% | 65.1% | 34.4% | 50.4% |
| W2V + DC + FCN+ [5] | [5] | 45.1% | 66.1% | 34.7% | 51.0% |
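
The metrics above follow common semantic segmentation practice. Purely as a reference for how they relate, here is a minimal Python sketch (an illustration, not the evaluation code behind the table) computing all four from a pixel-level confusion matrix:

```python
# Minimal sketch: segmentation metrics from a confusion matrix C, where
# C[i, j] counts pixels with ground-truth class i predicted as class j.
# Illustration only; not the official COCO-Stuff evaluation code.
import numpy as np

def segmentation_metrics(C):
    gt = C.sum(axis=1).astype(float)   # pixels per ground-truth class
    tp = np.diag(C).astype(float)      # correctly classified pixels per class
    union = gt + C.sum(axis=0) - tp    # union of ground truth and prediction
    valid = gt > 0                     # skip classes absent from the ground truth

    global_acc = tp.sum() / C.sum()                       # overall pixel accuracy
    class_acc = np.mean(tp[valid] / gt[valid])            # class-average accuracy
    iou = tp[valid] / union[valid]
    mean_iou = iou.mean()                                 # mean IOU
    fw_iou = ((gt[valid] / gt[valid].sum()) * iou).sum()  # frequency-weighted IOU
    return class_acc, global_acc, mean_iou, fw_iou
```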

Dataset

| Filename | Description | Size |
|---|---|---|
| cocostuff-10k-v1.1.zip | COCO-Stuff dataset v1.1, images and annotations | 2.0 GB |
| cocostuff-10k-v1.1.json | COCO-Stuff dataset v1.1, annotations in JSON format (optional) | 62.3 MB |
| cocostuff-labels.txt | A list of the 1+91+91 classes in COCO-Stuff | 2.3 KB |
| cocostuff-readme.txt | This document | 6.5 KB |

Older files:

| Filename | Description | Size |
|---|---|---|
| cocostuff-10k-v1.0.zip | COCO-Stuff dataset v1.0, images and annotations | 2.6 GB |

Usage

To use the COCO-Stuff dataset, please follow these steps:

  1. Download or clone this repository using git: git clone https://github.com/nightrome/cocostuff10k.git
  2. Open the dataset folder in your shell: cd cocostuff10k
  3. If you have Matlab, run the following commands:
  • Add the code folder to your Matlab path: startup();
  • Run the demo script in Matlab: demo_cocoStuff();
  • The script displays an image, its thing, stuff and thing+stuff annotations, as well as the image captions.
  4. Alternatively, run the following Linux commands or manually download and unpack the dataset:
  • wget --directory-prefix=downloads http://calvin.inf.ed.ac.uk/wp-content/uploads/data/cocostuffdataset/cocostuff-10k-v1.1.zip
  • unzip downloads/cocostuff-10k-v1.1.zip -d dataset/

MAT Format

The COCO-Stuff annotations are stored in separate .mat files per image. These files follow the same format as used by Tighe et al. Each file contains the following fields (a Python loading sketch follows the list):

  • S: The pixel-wise label map of size [height x width].
  • names: The names of the thing and stuff classes in COCO-Stuff. For more details see Label Names & Indices.
  • captions: Image captions from [2], written by 5 distinct humans on average.
  • regionMapStuff: A map of the same size as S that contains the indices for the approx. 1000 regions (superpixels) used to annotate the image.
  • regionLabelsStuff: A list of the stuff labels for each superpixel. The indices in regionMapStuff correspond to the entries in regionLabelsStuff.
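
As a convenience, here is a minimal sketch of reading one of these files from Python with SciPy (the path is illustrative, and the cell-array unpacking assumes SciPy's default loadmat behavior):

```python
# Minimal sketch: load one COCO-Stuff .mat annotation file with SciPy.
# The path is illustrative; the fields are those listed above.
from scipy.io import loadmat

ann = loadmat('dataset/annotations/COCO_train2014_000000000077.mat')

S = ann['S']                                        # [height x width] label map
names = [str(c[0]) for c in ann['names'][0]]        # class names indexed by label
captions = [str(c[0]) for c in ann['captions'][0]]  # COCO captions for this image
region_map = ann['regionMapStuff']                  # superpixel index per pixel
region_labels = ann['regionLabelsStuff']            # stuff label per superpixel

# Entries of regionMapStuff are (1-based, Matlab-style) indices into regionLabelsStuff.
print(S.shape, len(names), len(captions), region_labels.shape)
```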

JSON Format

We also provide stuff and thing annotations in the COCO-style JSON format. The thing annotations are copied from COCO. We encode every stuff class present in an image as a single annotation, using the RLE encoding format of COCO. To get the annotations (a decoding sketch follows this list):

  • Either download them: wget --directory-prefix=dataset/annotations-json http://calvin.inf.ed.ac.uk/wp-content/uploads/data/cocostuffdataset/cocostuff-10k-v1.1.json
  • Or extract them from the .mat file annotations using this Python script.
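
Once downloaded, the file can be read with the standard pycocotools COCO API; a minimal sketch (assuming pycocotools is installed) that decodes the RLE-encoded stuff masks:

```python
# Minimal sketch: load the COCO-style JSON annotations and decode stuff masks.
# Assumes pycocotools is installed and the JSON was downloaded as shown above.
from pycocotools.coco import COCO

coco = COCO('dataset/annotations-json/cocostuff-10k-v1.1.json')

img_id = coco.getImgIds()[0]                # pick an arbitrary image
anns = coco.loadAnns(coco.getAnnIds(imgIds=img_id))

for ann in anns:
    m = coco.annToMask(ann)                 # binary [height x width] mask (RLE decoded)
    name = coco.loadCats(ann['category_id'])[0]['name']
    print(name, int(m.sum()), 'pixels')
```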

Label Names & Indices

To be compatible with COCO, version 1.1 of COCO-Stuff has 91 thing classes (1-91), 91 stuff classes (92-182) and 1 class "unlabeled" (0). Note that 11 of the thing classes from COCO 2015 do not have any segmentation annotations. The classes desk, door and mirror could be either stuff or things and therefore occur in both COCO and COCO-Stuff. To avoid confusion we add the suffix "-stuff" to those classes in COCO-Stuff. The full list of classes can be found here.

The older version 1.0 of COCO-Stuff had 80 thing classes (2-81), 91 stuff classes (82-172) and 1 class "unlabeled" (1).
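
For lookups from code, here is a minimal sketch of parsing cocostuff-labels.txt into an index-to-name dictionary; the "<index>: <name>" line format is an assumption about the file layout:

```python
# Minimal sketch: build an index-to-name lookup from cocostuff-labels.txt.
# Assumes each line has the form "<index>: <name>"; verify against the file.
labels = {}
with open('cocostuff-labels.txt') as f:
    for line in f:
        idx, name = line.strip().split(': ', 1)
        labels[int(idx)] = name

print(labels[0])    # expected: 'unlabeled'
print(labels[92])   # expected: the first stuff class in v1.1
```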

Label Hierarchy

The hierarchy of labels is stored in CocoStuffClasses. To visualize it, run CocoStuffClasses.showClassHierarchyStuffThings() (also available for just stuff and just thing classes) in Matlab. The output should look similar to the following figure:

[Figure: COCO-Stuff label hierarchy]

Semantic Segmentation Models

To encourage further research on stuff and things, we provide the trained semantic segmentation models below (see Sect. 4.4 in [1]).

DeepLab VGG-16

Use the following steps to download and setup the DeepLab [4] semantic segmentation model trained on COCO-Stuff. It requires deeplab-public-ver2, which is built on Caffe:

  1. Install Cuda. We recommend version 7.0. For version 8.0 you will need to apply the fix described here in step 3.
  2. Download deeplab-public-ver2: git submodule update --init models/deeplab/deeplab-public-ver2
  3. Compile and configure deeplab-public-ver2 following the authors' instructions. Depending on your system setup you might have to install additional packages, but a minimum setup could look like this:
  • cd models/deeplab/deeplab-public-ver2
  • cp Makefile.config.example Makefile.config
  • Optionally add CuDNN support or modify library paths in the Makefile.
  • make all -j8
  • cd ../..
  4. Configure the COCO-Stuff dataset:
  • Create folders: mkdir models/deeplab/cocostuff && mkdir models/deeplab/cocostuff/data
  • Create a symbolic link to the images: cd models/deeplab/cocostuff/data && ln -s ../../../../dataset/images images && cd ../../../..
  • Convert the annotations by running the Matlab script: startup(); convertAnnotationsDeeplab();
  5. Download the base VGG-16 model:
  • wget --directory-prefix=models/deeplab/cocostuff/model/deeplabv2_vgg16 http://calvin.inf.ed.ac.uk/wp-content/uploads/data/cocostuffdataset/deeplabv2_vgg16_init.caffemodel
  6. Run cd models/deeplab && ./run_cocostuff_vgg16.sh to train and test the network on COCO-Stuff.

DeepLab ResNet 101

The default Deeplab model takes center crops of size 513x513 pixels if any side of the image is larger than that. Since we want to segment the whole image at test time, we instead resize the images to 513x513, perform the semantic segmentation and then rescale the predicted label map back to the original image size outside the network. Note that without this final step, the performance might differ slightly.
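
The repository's rescaleImages.py and rescaleAnnotations.py scripts (see the steps below) perform this rescaling; purely as an illustration of the idea, a sketch using Pillow and NumPy (both assumptions, not the scripts themselves) could look like this:

```python
# Minimal sketch of the resize-then-rescale idea described above.
# Illustration only; the repository's rescale scripts are the reference.
import numpy as np
from PIL import Image

def to_network_size(image_path, size=513):
    """Resize an input image to size x size for the network."""
    img = Image.open(image_path)
    original_size = img.size                    # (width, height)
    return img.resize((size, size), Image.BILINEAR), original_size

def labels_to_original_size(label_map, original_size):
    """Rescale a predicted label map back to the original resolution.
    Nearest-neighbor interpolation keeps label indices discrete."""
    lab = Image.fromarray(label_map.astype(np.uint8))
    return np.array(lab.resize(original_size, Image.NEAREST))
```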

  1. Follow steps 1-4 of the DeepLab VGG-16 section above.
  2. Download the base ResNet model:
  • wget --directory-prefix=models/deeplab/cocostuff/model/deeplabv2_resnet101 http://calvin.inf.ed.ac.uk/wp-content/uploads/data/cocostuffdataset/deeplabv2_resnet101_init.caffemodel
  3. Rescale the images and annotations:
  • cd models/deeplab
  • python rescaleImages.py
  • python rescaleAnnotations.py
  4. Run ./run_cocostuff_resnet101.sh to train and test the network on COCO-Stuff.

Annotation Tool

In [1] we present a simple and efficient stuff annotation tool which was used to annotate the COCO-Stuff dataset. It uses a paintbrush tool to annotate SLICO superpixels (precomputed using the code of Achanta et al.) with stuff labels. These annotations are overlaid with the existing pixel-level thing annotations from COCO. We provide a basic version of our annotation tool:

  • Prepare the required data:
    • Specify a username in annotator/data/input/user.txt.
    • Create a list of images in annotator/data/input/imageLists/<user>.list.
    • Extract the thing annotations for all images in Matlab: extractThings().
    • Extract the superpixels for all images in Matlab: extractSLICOSuperpixels().
    • To enable or disable superpixels, thing annotations and polygon drawing, take a look at the flags at the top of CocoStuffAnnotator.m.
  • Run the annotation tool in Matlab: CocoStuffAnnotator();
    • The tool writes the .mat label files to annotator/data/output/annotations.
    • To create a .png preview of the annotations, run annotator/code/exportImages.m in Matlab. The previews will be saved to annotator/data/output/preview.

Misc

References

Licensing

COCO-Stuff is a derivative work of the COCO dataset. The authors of COCO do not in any form endorse this work. Different licenses apply.

Contact

If you have any questions regarding this dataset, please contact us at holger-at-it-caesar.com.
