The Third Situated Interactive MultiModal Conversations (SIMMC 2.1) Challenge 2022

Welcome to the Third Situated Interactive Multimodal Conversations (SIMMC 2.1) Track page for DSTC11 2022.

The SIMMC challenge aims to lay the foundations for real-world assistant agents that can handle multimodal inputs and perform multimodal actions. Specifically, we focus on task-oriented dialogs that encompass a situated multimodal user context in the form of a co-observed and immersive virtual reality (VR) environment. The conversational context is dynamically updated on each turn based on the user's actions (e.g. verbal interactions, navigation within the scene).

The Second SIMMC challenge ended successfully, receiving a number of new state-of-the-art models in the novel multimodal dialog task. Building upon the success of the previous editions of the SIMMC challenges, we propose a third edition of the SIMMC challenge for the community to tackle and continue the effort towards building a successful multimodal assistant agent.

In this edition of the challenge, we specifically focus on the key challenge of fine-grained visual disambiguation, which adds an important skill to the assistant agents studied in the previous SIMMC challenge. To support this challenge, we provide an improved version of the dataset, SIMMC 2.1, in which we augment the SIMMC 2.0 dataset with additional annotations (i.e. identification of all possible referent candidates given ambiguous mentions) and corresponding re-paraphrases to support the study and modeling of visual disambiguation.

The SIMMC challenge ended successfully on Nov 4, 2022, receiving 15 model submissions from 5 teams across universities and industry. We thank everyone for their participation in the challenge. The detailed evaluation metrics and the links to the repositories can be found here.

Organizers: Seungwhan Moon, Satwik Kottur, Babak Damavandi, Alborz Geramifard

Illustration of the SIMMC 2.1 Dataset

Latest News

[Nov 4, 2022] Results for DSTC11-SIMMC2.1 challenge entries have been announced.
[Oct 21, 2022] Test-std dataset (SIMMC v2.1) is released. Start of Challenge Phase 2.
[June 28, 2022] DSTC11-SIMMC2.1 Challenge announcement. Training / development datasets (SIMMC v2.1) are released.

Important Links

Timeline

NOTE: All deadlines are 11:59PM UTC-12:00 ("anywhere on Earth"), unless otherwise noted.

Date	Milestone
June 28, 2022	Training & development data released
Oct 21, 2022	Test-Std data released, End of Challenge Phase 1
Oct 28, 2022	Entry submission deadline, End of Challenge Phase 2
Nov 4, 2022	Final results are announced (15 model submissions).
Nov 18, 2022	Deadline for DSTC11 Workshop Paper Submission
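
As a convenience, the "anywhere on Earth" convention noted above can be converted to UTC with a few lines of Python. This is only a sketch: the helper name `aoe_deadline_in_utc` is hypothetical and not part of the challenge code, and it assumes the default 11:59PM cutoff stated in the note.

```python
from datetime import datetime, timedelta, timezone

# "Anywhere on Earth" (AoE) is the fixed offset UTC-12:00.
AOE = timezone(timedelta(hours=-12), name="AoE")


def aoe_deadline_in_utc(year: int, month: int, day: int) -> datetime:
    """Return the 11:59PM AoE deadline on the given date, expressed in UTC.

    Hypothetical helper for illustration; not part of the SIMMC repository.
    """
    deadline_aoe = datetime(year, month, day, 23, 59, tzinfo=AOE)
    return deadline_aoe.astimezone(timezone.utc)


# Entry submission deadline from the timeline (Oct 28, 2022).
submission_deadline_utc = aoe_deadline_in_utc(2022, 10, 28)
print(submission_deadline_utc.isoformat())  # 2022-10-29T11:59:00+00:00
```

In other words, a deadline listed as Oct 28 remains open until 11:59 UTC on Oct 29.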

Track Description

Tasks and Metrics

For this edition of the challenge, we focus on four sub-tasks primarily aimed at replicating human-assistant actions in order to enable rich and interactive shopping scenarios.

For more detailed information on the new SIMMC 2.1 dataset and the instructions, please refer to the DSTC11 challenge proposal document.

Sub-Task #1	Ambiguous Candidate Identification (New)
Goal	Given ambiguous object mentions, to resolve referent objects to their canonical ID(s).
Input	Current user utterance, Dialog context, Multimodal context
Output	Canonical object IDs
Metrics	Object Identification F1 / Precision / Recall

Sub-Task #2	Multimodal Coreference Resolution
Goal	To resolve referent objects to their canonical ID(s) as defined by the catalog.
Input	Current user utterance, Dialog context, Multimodal context
Output	Canonical object IDs
Metrics	Coref F1 / Precision / Recall
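
Sub-Tasks #1 and #2 both score a set of predicted object IDs against a set of ground-truth IDs. The sketch below shows one plausible way to compute precision, recall and F1 for such predictions. It assumes micro-averaging (counts pooled over all turns) and integer object IDs; the function name `object_prf1` is hypothetical. The evaluation scripts in the repository remain the authoritative definition of the metrics.

```python
from typing import Iterable, List, Tuple


def object_prf1(
    predictions: List[Iterable[int]],
    references: List[Iterable[int]],
) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall and F1 over per-turn sets of object IDs.

    Hypothetical helper for illustration; pooling counts across turns is an
    assumption, not a statement about the official evaluation script.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")

    n_pred = n_true = n_correct = 0
    for pred_ids, true_ids in zip(predictions, references):
        pred_set, true_set = set(pred_ids), set(true_ids)
        n_pred += len(pred_set)
        n_true += len(true_set)
        n_correct += len(pred_set & true_set)

    precision = n_correct / n_pred if n_pred else 0.0
    recall = n_correct / n_true if n_true else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Three turns: predicted object IDs vs. ground-truth object IDs (toy values).
predicted = [[1, 2], [5], []]
gold = [[1], [5, 6], [7]]
precision, recall, f1 = object_prf1(predicted, gold)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")  # P=0.667 R=0.500 F1=0.571
```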

Sub-Task #3	Multimodal Dialog State Tracking (MM-DST)
Goal	To track user belief states across multiple turns
Input	Current user utterance, Dialog context, Multimodal context
Output	Belief state for current user utterance
Metrics	Slot F1, Intent F1

Sub-Task #4	Multimodal Dialog Response Generation
Goal	To generate Assistant responses
Input	Current user utterance, Dialog context, Multimodal context, (Ground-truth API Calls)
Output	Assistant response utterance
Metrics	BLEU-4
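
For readers unfamiliar with BLEU-4, the sketch below computes an unsmoothed, sentence-level score against a single reference: the geometric mean of the clipped 1- to 4-gram precisions, multiplied by a brevity penalty. It assumes whitespace tokenization and no smoothing, and the function names are hypothetical. The official evaluation script may tokenize, smooth, or aggregate differently, so treat this as an illustration of the metric rather than a replacement for that script.

```python
import math
from collections import Counter
from typing import List


def ngram_counts(tokens: List[str], n: int) -> Counter:
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu4(reference: List[str], hypothesis: List[str]) -> float:
    """Unsmoothed sentence-level BLEU-4 against a single reference.

    Hypothetical helper for illustration. Returns 0.0 whenever any of the
    four n-gram precisions is zero, since no smoothing is applied.
    """
    if not hypothesis:
        return 0.0

    log_precision_sum = 0.0
    for n in range(1, 5):
        hyp_counts = ngram_counts(hypothesis, n)
        ref_counts = ngram_counts(reference, n)
        total = sum(hyp_counts.values())
        # Clip each hypothesis n-gram count by its count in the reference.
        clipped = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
        if total == 0 or clipped == 0:
            return 0.0
        log_precision_sum += math.log(clipped / total) / 4  # uniform weights

    # Brevity penalty: penalize hypotheses shorter than the reference.
    ref_len, hyp_len = len(reference), len(hypothesis)
    brevity_penalty = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision_sum)


# Toy example with made-up utterances.
reference = "the red jacket is on the left".split()
exact_score = sentence_bleu4(reference, reference)
truncated_score = sentence_bleu4(reference, reference[:5])
print(exact_score, round(truncated_score, 4))
```

An exact match scores 1.0, while the truncated hypothesis keeps perfect n-gram precision but is reduced by the brevity penalty.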

Please check the task input file for a full description of inputs for each subtask.

Baseline Results

We provide baselines for all four tasks so that participants can benchmark their models. Feel free to use the code to bootstrap your model.

Subtask	Name	Baseline Results
#1	Ambiguous Candidate Identification	Link
#2	Multimodal Coreference Resolution	Link
#3	Multimodal Dialog State Tracking (MM-DST)	Link
#4	Multimodal Dialog Response Generation	Link

How to Download Datasets and Code

Git clone our repository to download the datasets and the code. You may use the provided baselines as a starting point to develop your models.

$ git lfs install
$ git clone https://github.com/facebookresearch/simmc2.git

Also please feel free to check out other open-sourced repositories from the previous SIMMC 2.0 challenge here.

Challenge Instructions

(1) Reporting Results for Challenge Phase 1 (by Oct 21)

Submit your model prediction results on the devtest set, following the submission instructions.
We will release the test-std set (with ground-truth labels hidden).

(2) Reporting Results for Challenge Phase 2 (by Oct 28)

Submit your model prediction results on the test-std set, following the submission instructions.
We will evaluate the participants' model predictions using the same evaluation script as in Phase 1, and announce the results.

Contact

Questions related to SIMMC Track, Data, and Baselines

Please contact simmc@fb.com, or leave comments in the GitHub repository.

DSTC Mailing List

If you want to get the latest updates about DSTC11, join the DSTC mailing list.

Citations

If you want to publish experimental results with our datasets or use the baseline models, please cite the following article:

@inproceedings{kottur-etal-2021-simmc,
    title = "{SIMMC} 2.0: A Task-oriented Dialog Dataset for Immersive Multimodal Conversations",
    author = "Kottur, Satwik  and
      Moon, Seungwhan  and
      Geramifard, Alborz  and
      Damavandi, Babak",
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2021",
    address = "Online and Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.emnlp-main.401",
    doi = "10.18653/v1/2021.emnlp-main.401",
    pages = "4903--4912",
}

NOTE: The paper (EMNLP 2021) above describes in detail the datasets, the collection process, and some of the baselines we provide in this challenge.

License

SIMMC 2 is released under CC-BY-NC-SA-4.0; see LICENSE for details.
