The InterScript dataset contains interactive user feedback on scripts generated by a T5-XXL model.

Last update: Dec 01, 2022

Overview

Interscript

The Interscript dataset contains interactive user feedback on scripts generated by a T5-11B model.

Dataset

data.json contains the data in an easy-to-read JSON format, and data.jsonl contains the same data in JSONL format. data.jsonl holds 8466 samples, one sample per line. Every sample is a JSON object with the following fields:

    {
        "input_script": "push chair in -> pull chair in; pull chair in -> push chair against wall; push chair against wall -> straighten chair legs; straighten chair legs -> Push all chairs in; line up the chairs -> push chair in",
        "input_feedback": "One would not pull chair in if they had initially pushed it in.",
        "output_script": "push chair against wall -> straighten chair legs;straighten chair legs -> Push all chairs in;line up the chairs -> push chair in;push chair in -> push chair against wall",
        "metadata": {
            "id": "301KG0KX9BKTC0HB7Z9SV1Y5HAFH2Y.2_implicit.gp",
            "goal": "push all chairs in",
            "is_distractor": false,
            "feedback_type": "implicit.gp",
            "edit": "Remove node 'pull chair in'",
            "input_script_formatted": [
                "1. line up the chairs",
                "2. push chair in",
                "3. pull chair in",
                "4. push chair against wall",
                "5. straighten chair legs",
                "6. Push all chairs in"
            ],
            "output_script_formatted": [
                "1. line up the chairs",
                "2. push chair in",
                "3. push chair against wall",
                "4. straighten chair legs",
                "5. Push all chairs in"
            ]
        }
    }
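
A minimal sketch of reading the JSONL file with the Python standard library. The helper name `load_samples` is hypothetical, and the temporary file below is only a stand-in holding a trimmed copy of the sample shown above; in practice the path would point at the real data.jsonl.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path


def load_samples(path):
    """Read one JSON object per non-empty line (hypothetical helper)."""
    samples = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                samples.append(json.loads(line))
    return samples


# Trimmed copy of the sample above, used so the sketch runs without the dataset.
example = {
    "input_feedback": "One would not pull chair in if they had initially pushed it in.",
    "metadata": {
        "goal": "push all chairs in",
        "is_distractor": False,
        "feedback_type": "implicit.gp",
        "edit": "Remove node 'pull chair in'",
    },
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "data.jsonl"  # stand-in for the real file
    path.write_text(json.dumps(example) + "\n", encoding="utf-8")
    samples = load_samples(path)

# Keep only samples whose feedback is not marked as a distractor.
kept = [s for s in samples if not s["metadata"]["is_distractor"]]
# Count samples per feedback type.
type_counts = Counter(s["metadata"]["feedback_type"] for s in samples)
print(len(samples), len(kept), dict(type_counts))
```

Reading line by line keeps memory use proportional to the parsed records rather than to the raw file plus the records, which is usually sufficient for a file of this size.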

The description of the fields is as follows:

input_script: Model-generated script $y_{bad}$.
input_feedback: User feedback $f$ on the input script.
output_script: Fixed output script $y_{good}$.
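
In the sample above, both scripts appear to be serialized as edge lists of the form `a -> b`, separated by semicolons. Assuming that format holds across the dataset (it is inferred from the sample, not documented here), the following sketch parses a script into edges and compares the steps of $y_{bad}$ and $y_{good}$. The function names `parse_script` and `nodes` are hypothetical.

```python
def parse_script(script: str) -> list[tuple[str, str]]:
    """Split 'a -> b; c -> d' into (source, target) pairs."""
    edges = []
    for chunk in script.split(";"):
        source, sep, target = chunk.partition("->")
        if not sep:
            continue  # skip empty chunks or chunks without an arrow
        edges.append((source.strip(), target.strip()))
    return edges


def nodes(edges):
    """Collect every step that appears as a source or a target."""
    return {step for edge in edges for step in edge}


# Script strings copied from the sample shown above.
input_script = (
    "push chair in -> pull chair in; pull chair in -> push chair against wall; "
    "push chair against wall -> straighten chair legs; "
    "straighten chair legs -> Push all chairs in; line up the chairs -> push chair in"
)
output_script = (
    "push chair against wall -> straighten chair legs;"
    "straighten chair legs -> Push all chairs in;"
    "line up the chairs -> push chair in;push chair in -> push chair against wall"
)

input_edges = parse_script(input_script)
output_edges = parse_script(output_script)

# Steps present in the model output but absent from the fixed script.
removed = nodes(input_edges) - nodes(output_edges)
print(removed)  # {'pull chair in'}
```

For this sample the difference matches the `edit` field in the metadata, "Remove node 'pull chair in'".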

Metadata contains additional information about the sample. Some important fields are:

id: Unique identifier of the sample.
goal: Goal of the script.
is_distractor: Whether the feedback is a distractor (please see Section 4 for more details).
feedback_type: Type of feedback (please see Section 4 "Annotation" for more details).
edit: The input_feedback presented as an edit operation on the input script, that is, the edit operation that transforms the input script into the output script.
input_script_formatted: The input script presented as a list of sentences.
output_script_formatted: The output script presented as a list of sentences.
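
For the sample above, the formatted list coincides with a topological ordering of the script's edges. The sketch below rebuilds it under that assumption, which is not stated in this README and may not hold when a script has several valid orderings. The function name `ordered_steps` is hypothetical; `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def ordered_steps(script: str) -> list[str]:
    """Return the steps of an 'a -> b; c -> d' script in dependency order."""
    sorter = TopologicalSorter()
    for chunk in script.split(";"):
        source, sep, target = chunk.partition("->")
        if not sep:
            continue  # skip empty chunks or chunks without an arrow
        # Register `source` as a predecessor of `target`.
        sorter.add(target.strip(), source.strip())
    return list(sorter.static_order())


# input_script copied from the sample shown above.
input_script = (
    "push chair in -> pull chair in; pull chair in -> push chair against wall; "
    "push chair against wall -> straighten chair legs; "
    "straighten chair legs -> Push all chairs in; line up the chairs -> push chair in"
)

formatted = [
    f"{index}. {step}"
    for index, step in enumerate(ordered_steps(input_script), start=1)
]
for line in formatted:
    print(line)
```

Because the sample script is a single chain, its ordering is unique and the result equals the sample's `input_script_formatted` list.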

Data collection process

We use Amazon Mechanical Turk to collect feedback on erroneous scripts from users.
An overview of the process is captured in the following figure:

Amazon Mechanical Turk Template

turk_template.html contains the template for Amazon Mechanical Turk HITs.

Owner

AI2
