A new benchmark for Icon Question Answering (IconQA) and a large-scale icon dataset Icon645.

Last update: Dec 30, 2022

IconQA

About

IconQA is a new diverse abstract visual question answering dataset that highlights the importance of abstract diagram understanding and comprehensive cognitive reasoning in real-world problems.

There are three different sub-tasks in IconQA:

57,672 image choice MC questions
31,578 text choice MC questions
18,189 fill-in-the-blank questions

Sub-Tasks	Train	Validation	Test	Total
Multi-image-choice	34,603	11,535	11,535	57,672
Multi-text-choice	18,946	6,316	6,316	31,578
Filling-in-the-blank	10,913	3,638	3,638	18,189

In addition to IconQA, we also present Icon645, a large-scale dataset of icons that cover a wide range of objects:

645,687 colored icons
377 different icon classes

For more details, you can find our website here and our paper here.

Download

Our dataset is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Please read the license before you use, change, or share our dataset.

You can download IconQA here, or run the following commands:

cd data
wget https://iconqa2021.s3.us-west-1.amazonaws.com/iconqa.zip
unzip iconqa.zip

You can download Icon645 here, or run the following commands:

cd data
wget https://iconqa2021.s3.us-west-1.amazonaws.com/icon645.zip
unzip icon645.zip

File structure of the IconQA dataset:

IconQA
|   LICENSE.md
|   metadata.json
|   pid2skills.json
|   pid_splits.json
|   problems.json
|   skills.json
└───test
|   |
|   └───choose_img
|   |   |
|   |   └───question_id
|   |   |   |   image.png
|   |   |   |   data.json
|   |   |   |   choice_0.png
|   |   |   |   choice_1.png
|   |   |   |   ...
|   |   |
|   |   └───question_id
|   |   |   ...
|   |   
|   └───choose_txt
|   |   |  
|   |   └───question_id
|   |   |   |   image.png
|   |   |   |   data.json
|   |   | 
|   |   └───question_id
|   |   |   ...
|   |
|   └───fill_in_blank
|       |  
|       └───question_id
|       |   |   image.png
|       |   |   data.json
|       | 
|       └───question_id
|       |   ...
|   
└───train
|   |   same as test
|   
└───val
    |   same as test
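
The layout above can be walked with a few lines of Python. The following is a minimal sketch, not part of the official release: the helper name `iter_questions` and the root path in the usage note are hypothetical, and the fields inside `data.json` are not documented here, so the parsed content is returned as-is.

```python
import json
from pathlib import Path


def iter_questions(root, split, task):
    """Yield (question_id, data, image_path, choice_paths) for one split and sub-task.

    root  : path to the unzipped IconQA folder (assumed location, adjust as needed)
    split : "train", "val", or "test"
    task  : "choose_img", "choose_txt", or "fill_in_blank"
    """
    task_dir = Path(root) / split / task
    for question_dir in sorted(p for p in task_dir.iterdir() if p.is_dir()):
        data_file = question_dir / "data.json"
        if not data_file.is_file():
            continue  # skip folders that do not follow the layout shown above
        with data_file.open(encoding="utf-8") as f:
            data = json.load(f)  # schema not assumed; returned unchanged
        # choice_*.png files appear only under choose_img; the list is empty otherwise
        choices = sorted(question_dir.glob("choice_*.png"))
        yield question_dir.name, data, question_dir / "image.png", choices
```

For example, `for qid, data, image, choices in iter_questions("data/iconqa", "test", "choose_img"): ...` would visit every image-choice test question, assuming the archive was unzipped to `data/iconqa`.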

File structures for the Icon645 dataset:

Icon645
|   LICENCE.md
|   metadata.json
└───colored_icons_final
    |
    └───acorn
    |   |   image_id1.png
    |   |   image_id2.png
    |   |   ...
    |   
    └───airplane
    |   |   image_id3.png
    |   |   ...
    |      
    |   ...
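
Because each icon class is a folder of PNG files, a quick sanity check after unzipping is to count the files per class. This is a rough sketch under that assumption; `count_icons_per_class` is a hypothetical helper, and the path in the usage note depends on where the archive was extracted.

```python
from collections import Counter
from pathlib import Path


def count_icons_per_class(icon_root):
    """Return a Counter mapping each class folder name to its number of PNG files."""
    counts = Counter()
    for class_dir in Path(icon_root).iterdir():
        if class_dir.is_dir():
            # count only .png files directly inside the class folder
            counts[class_dir.name] = sum(1 for _ in class_dir.glob("*.png"))
    return counts
```

Calling `counts = count_icons_per_class("data/icon645/colored_icons_final")` and comparing `len(counts)` and `sum(counts.values())` with the 377 classes and 645,687 icons stated above is one way to confirm the download is complete.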

Citation

If the paper or the dataset inspires you, please cite us:

@inproceedings{lu2021iconqa,
  title = {IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning},
  author = {Lu, Pan and Qiu, Liang and Chen, Jiaqi and Xia, Tony and Zhao, Yizhou and Zhang, Wei and Yu, Zhou and Liang, Xiaodan and Zhu, Song-Chun},
  booktitle = {Submitted to the 35th Conference on Neural Information Processing Systems (NeurIPS 2021) Track on Datasets and Benchmarks},
  year = {2021}
}

License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Owner

Pan Lu
