Code for SIMMC 2.0: A Task-oriented Dialog Dataset for Immersive Multimodal Conversations

Overview

The Second Situated Interactive MultiModal Conversations (SIMMC 2.0) Challenge 2021

Welcome to the Second Situated Interactive Multimodal Conversations (SIMMC 2.0) Track for DSTC10 2021.

The SIMMC challenge aims to lay the foundations for real-world assistant agents that can handle multimodal inputs and perform multimodal actions. Similar to the First SIMMC challenge (as part of DSTC9), we focus on task-oriented dialogs that encompass a situated multimodal user context in the form of a co-observed and immersive virtual reality (VR) environment. The conversational context is dynamically updated on each turn based on user actions (e.g., verbal interactions or navigation within the scene). For this challenge, we release a new Immersive SIMMC 2.0 dataset in two shopping domains: furniture and fashion.

Organizers: Seungwhan Moon, Satwik Kottur, Paul A. Crook, Ahmad Beirami, Babak Damavandi, Alborz Geramifard

Example from SIMMC-Furniture Dataset

Latest News

  • [June 14, 2021] Challenge announcement. Training / development datasets (SIMMC v2.0) are released.

Timeline

| Date | Milestone |
|---|---|
| June 14, 2021 | Training & development data released |
| Sept 24, 2021 | Test-Std data released, End of Challenge Phase 1 |
| Oct 1, 2021 | Entry submission deadline, End of Challenge Phase 2 |
| Oct 8, 2021 | Final results announced |

Track Description

Tasks and Metrics

We present four sub-tasks primarily aimed at replicating human-assistant actions in order to enable rich and interactive shopping scenarios.

| Sub-Task #1 | Multimodal Disambiguation |
|---|---|
| Goal | To classify if the assistant should disambiguate in the next turn |
| Input | Current user utterance, Dialog context, Multimodal context |
| Output | Binary label |
| Metrics | Binary classification accuracy |

| Sub-Task #2 | Multimodal Coreference Resolution |
|---|---|
| Goal | To resolve referent objects to their canonical ID(s) as defined by the catalog |
| Input | Current user utterance with object mentions, Dialog context, Multimodal context |
| Output | Canonical object IDs |
| Metrics | Coref F1 / Precision / Recall |

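
As a rough illustration of the Coref F1 / Precision / Recall metrics for Sub-Task #2, the sketch below scores predicted object IDs against ground-truth IDs by pooling counts over all turns (micro-averaging). The function name `coref_scores`, the list-of-lists input layout, and the micro-averaging choice are assumptions made for illustration. The evaluation script shipped with the repository remains the authority on how challenge scores are computed.

```python
from typing import Dict, List


def coref_scores(predicted: List[List[int]], gold: List[List[int]]) -> Dict[str, float]:
    """Micro-averaged precision / recall / F1 over per-turn object ID sets.

    predicted[i] and gold[i] hold the object IDs for turn i
    (a hypothetical layout, not the official file format).
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must cover the same turns")

    n_correct = n_pred = n_gold = 0
    for pred_ids, gold_ids in zip(predicted, gold):
        pred_set, gold_set = set(pred_ids), set(gold_ids)
        n_correct += len(pred_set & gold_set)  # IDs that are both predicted and gold
        n_pred += len(pred_set)
        n_gold += len(gold_set)

    # Guard against division by zero when nothing is predicted or annotated.
    precision = n_correct / n_pred if n_pred else 0.0
    recall = n_correct / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Two turns: the first predicts one extra object, the second misses one.
    print(coref_scores(predicted=[[1, 2, 3], [7]], gold=[[1, 2], [7, 8]]))
```

Duplicate IDs within a turn are collapsed by the set conversion, so each object counts at most once per turn.
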
| Sub-Task #3 | Multimodal Dialog State Tracking (MM-DST) |
|---|---|
| Goal | To track user belief states across multiple turns |
| Input | Current user utterance, Dialog context, Multimodal context |
| Output | Belief state for current user utterance |
| Metrics | Slot F1, Intent F1 |

| Sub-Task #4 | Multimodal Dialog Response Generation & Retrieval |
|---|---|
| Goal | To generate Assistant responses or retrieve from a candidate pool |
| Input | Current user utterance, Dialog context, Multimodal context, (Ground-truth API Calls) |
| Output | Assistant response utterance |
| Metrics | Generation: BLEU-4, Retrieval: MRR, R@1, R@5, R@10, Mean Rank |
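
The retrieval metrics for Sub-Task #4 (mean reciprocal rank, recall at k, and mean rank) all derive from the rank a model assigns to the ground-truth response within the candidate pool. A minimal sketch, assuming 1-based ranks and a hypothetical helper named `retrieval_metrics`; neither the name nor the input layout is taken from the official evaluation script:

```python
from typing import Dict, Sequence


def retrieval_metrics(gt_ranks: Sequence[int], ks: Sequence[int] = (1, 5, 10)) -> Dict[str, float]:
    """Compute MRR, mean rank, and recall@k from ground-truth ranks.

    gt_ranks[i] is the 1-based position of the ground-truth response
    among the candidates for turn i (assumed layout).
    """
    if not gt_ranks:
        raise ValueError("gt_ranks must not be empty")
    if any(rank < 1 for rank in gt_ranks):
        raise ValueError("ranks are expected to be 1-based")

    n = len(gt_ranks)
    metrics = {
        "mrr": sum(1.0 / rank for rank in gt_ranks) / n,  # mean reciprocal rank
        "mean_rank": sum(gt_ranks) / n,
    }
    for k in ks:
        # Fraction of turns whose ground-truth response lands in the top k.
        metrics[f"r@{k}"] = sum(rank <= k for rank in gt_ranks) / n
    return metrics


if __name__ == "__main__":
    print(retrieval_metrics([1, 2, 4, 20]))
```

Higher is better for MRR and recall at k, while lower is better for mean rank.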

Please check the task input file for a full description of inputs for each subtask.

Evaluation

For the DSTC10 SIMMC Track, we will run a two-phase evaluation as follows.

Challenge Period 1: Participants will evaluate the model performance on the provided devtest set. At the end of Challenge Period 1 (Sept 24), we ask participants to submit their model prediction results and a link to their code repository.

Challenge Period 2: A test-std set will be released on Sept 28 for the participants who submitted the results for the Challenge Period 1. We ask participants to submit their model predictions on the test-std set by Oct 1. We will announce the final results and the winners on Oct 8.

Challenge Instructions

(1) Challenge Registration

  • Fill out this form to register at DSTC10. Check “Track 3: SIMMC 2.0: Situated Interactive Multimodal Conversational AI” along with other tracks you are participating in.

(2) Download Datasets and Code

  • Irrespective of participation in the challenge, we'd like to encourage those interested in this dataset to complete this optional survey. This will also help us communicate any future updates on the codebase, the datasets, and the challenge track.

  • Git clone our repository to download the datasets and the code. You may use the provided baselines as a starting point to develop your models.

$ git lfs install
$ git clone https://github.com/facebookresearch/simmc2.git

(3) Reporting Results for Challenge Phase 1

  • Submit your model prediction results on the devtest set, following the submission instructions.
  • We will release the test-std set (with ground-truth labels hidden) on Sept 24.

(4) Reporting Results for Challenge Phase 2

  • Submit your model prediction results on the test-std set, following the submission instructions.
  • We will evaluate the participants’ model predictions using the same evaluation script as in Phase 1, and announce the results.

Contact

Questions related to SIMMC Track, Data, and Baselines

Please contact [email protected], or leave comments in the GitHub repository.

DSTC Mailing List

If you want to get the latest updates about DSTC10, join the DSTC mailing list.

Citations

If you want to publish experimental results with our datasets or use the baseline models, please cite the following article:

@article{kottur2021simmc,
  title={SIMMC 2.0: A Task-oriented Dialog Dataset for Immersive Multimodal Conversations},
  author={Kottur, Satwik and Moon, Seungwhan and Geramifard, Alborz and Damavandi, Babak},
  journal={arXiv preprint arXiv:2104.08667},
  year={2021}
}

NOTE: The paper above describes in detail the datasets, the collection process, and some of the baselines we provide in this challenge. The paper reports the results from an earlier version of the dataset and with different train-dev-test splits, hence the baseline performances on the challenge resources will be slightly different.

License

SIMMC 2.0 is released under CC-BY-NC-SA-4.0, see LICENSE for details.

Owner
Facebook Research