Release of SPLASH: Dataset for semantic parse correction with natural language feedback in the context of text-to-SQL parsing

Overview

SPLASH: Semantic Parsing with Language Assistance from Humans

SPLASH is dataset for the task of semantic parse correction with natural language feedback in the context of text-to-SQL parsing.

Example

The task, dataset along with baseline results are presented in
Speak to your Parser: Interactive Text-to-SQL with Natural Language Feedback.
Ahmed Elgohary, Saghar Hosseini and Ahmed Hassan Awadallah.
ACL 2020.

Release

The train.json, dev.json and test.json contain the training, development and testing examples of SPLASH. In addition to that, we also release the 179 examples that are based on the EditSQL parser (Please, see section 6.3 in the paper for more details). The EditSQL examples are in editsql.json. SPLASH is distributed under the CC BY-SA 4.0 license.

Format

Each example contains the following fields:

db_id: Name of Spider database.

question: Question (Utterance) as provided in Spider.

predicted_parse: The predicted SQL parse by the relevant model.

predicted_parse_with_values: The predicted SQL with the values (annonomized in predicted_parse) inferred by a rule-based post-processor. Note that we still use Spider's evaluation measure which ignores the values, but inferring values for the predicted parse is essential for generating meaningful explanations.

predicted_parse_explanation: The generated natural language explanation of the predicted SQL.

feedback: Collected natural language feedback.

gold_parse: The gold parse of the given question as provided in Spider.

beam: The top 20 predictions with corresponding scores produced by Seq2Struct beam search.

Please, refer to the paper for more details.

Example

    {
        "db_id": "csu_1", 
        "question": "Which university is in Los Angeles county and opened after 1950?", 
        "predicted_parse": "SELECT T1.Campus FROM Campuses AS T1 JOIN faculty AS T2 ON T1.Id = T2.Campus WHERE T1.County = value AND T1.Year > value AND T2.Year > value", 
        "predicted_parse_with_values": "SELECT T1.Campus FROM Campuses AS T1 JOIN faculty AS T2 ON T1.Id = T2.Campus WHERE T1.County = \"Los Angeles\" AND T1.Year > 1950 AND T2.Year > 2002",
        "predicted_parse_explanation": [
            "Step 1: For each row in Campuses table, find the corresponding rows in faculty     
            table", 
            "Step 2: find Campuses's Campus of the results of step 1 whose County equals Los 
             Angeles and Campuses's Year greater than 1950 and faculty's Year greater than 2002"
        ],
        "feedback": "In step 2 Remove faculty 's year greater than 2002\".", 
        "gold_parse": "SELECT campus FROM campuses WHERE county  =  \"Los Angeles\" AND YEAR  >  
        1950", 
        "beam": [
            [
                "SELECT T1.Campus FROM Campuses AS T1 JOIN faculty AS T2 ON T1.Id = T2.Campus WHERE T1.County = value AND T2.Year > value AND T2.Year > value", 
                -1.5820374488830566
            ], 
            [
                "SELECT T1.County FROM Campuses AS T1 JOIN faculty AS T2 ON T1.Id = T2.Campus WHERE T1.Campus = value AND T2.Year > value AND T2.Year > value", 
                -2.0078020095825195
            ], 
            ..
  }          

Please, contact Ahmed Elgohary < [email protected] > for any questions/feedback.

Citation

@inproceedings{Elgohary20Speak,
Title = {Speak to your Parser: Interactive Text-to-SQL with Natural Language Feedback},
Author = {Ahmed Elgohary and Saghar Hosseini and Ahmed Hassan Awadallah},
Year = {2020},
Booktitle = {Association for Computational Linguistics},
}
Owner
Microsoft Research - Language and Information Technologies (MSR LIT)
Microsoft Research - Language and Information Technologies (MSR LIT)
STMTrack: Template-free Visual Tracking with Space-time Memory Networks

STMTrack This is the official implementation of the paper: STMTrack: Template-free Visual Tracking with Space-time Memory Networks. Setup Prepare Anac

Zhihong Fu 62 Dec 21, 2022
PyDEns is a framework for solving Ordinary and Partial Differential Equations (ODEs & PDEs) using neural networks

PyDEns PyDEns is a framework for solving Ordinary and Partial Differential Equations (ODEs & PDEs) using neural networks. With PyDEns one can solve PD

Data Analysis Center 220 Dec 26, 2022
DaReCzech is a dataset for text relevance ranking in Czech

Dataset DaReCzech is a dataset for text relevance ranking in Czech. The dataset consists of more than 1.6M annotated query-documents pairs,

Seznam.cz a.s. 8 Jul 26, 2022
ECAENet (TensorFlow and Keras)

ECAENet: EfficientNet with Efficient Channel Attention for Plant Species Recognition (SCI:Q3) (Journal of Intelligent & Fuzzy Systems)

4 Dec 22, 2022
Vpw analyzer - A visual J1850 VPW analyzer written in Python

VPW Analyzer A visual J1850 VPW analyzer written in Python Requires Tkinter, Pan

7 May 01, 2022
WaveFake: A Data Set to Facilitate Audio DeepFake Detection

WaveFake: A Data Set to Facilitate Audio DeepFake Detection This is the code repository for our NeurIPS 2021 (Track on Datasets and Benchmarks) paper

Chair for Sys­tems Se­cu­ri­ty 27 Dec 22, 2022
Generate indoor scenes with Transformers

SceneFormer: Indoor Scene Generation with Transformers Initial code release for the Sceneformer paper, contains models, train and test scripts for the

Chandan Yeshwanth 110 Dec 06, 2022
PyGAD, a Python 3 library for building the genetic algorithm and training machine learning algorithms (Keras & PyTorch).

PyGAD: Genetic Algorithm in Python PyGAD is an open-source easy-to-use Python 3 library for building the genetic algorithm and optimizing machine lear

Ahmed Gad 1.1k Dec 26, 2022
[ICRA 2022] An opensource framework for cooperative detection. Official implementation for OPV2V.

OpenCOOD OpenCOOD is an Open COOperative Detection framework for autonomous driving. It is also the official implementation of the ICRA 2022 paper OPV

Runsheng Xu 322 Dec 23, 2022
Predicts an answer in yes or no.

Oui-ou-non-prediction Predicts an answer in 'yes' or 'no'. It is based on the game 'effeuiller la marguerite' in which the person plucks flower petals

Ananya Gupta 1 Jan 15, 2022
FedMM: Saddle Point Optimization for Federated Adversarial Domain Adaptation

This repository contains the code accompanying the paper " FedMM: Saddle Point Optimization for Federated Adversarial Domain Adaptation" Paper link: R

20 Jun 29, 2022
Motion planning environment for Sampling-based Planners

Sampling-Based Motion Planners' Testing Environment Sampling-based motion planners' testing environment (sbp-env) is a full feature framework to quick

Soraxas 23 Aug 23, 2022
Reviatalizing Optimization for 3D Human Pose and Shape Estimation: A Sparse Constrained Formulation

Reviatalizing Optimization for 3D Human Pose and Shape Estimation: A Sparse Constrained Formulation This is the implementation of the approach describ

Taosha Fan 47 Nov 15, 2022
Using NumPy to solve the equations of fluid mechanics together with Finite Differences, explicit time stepping and Chorin's Projection methods

Computational Fluid Dynamics in Python Using NumPy to solve the equations of fluid mechanics 🌊 🌊 🌊 together with Finite Differences, explicit time

Felix Köhler 4 Nov 12, 2022
[CVPR 2021] Unsupervised Degradation Representation Learning for Blind Super-Resolution

DASR Pytorch implementation of "Unsupervised Degradation Representation Learning for Blind Super-Resolution", CVPR 2021 [arXiv] Overview Requirements

Longguang Wang 318 Dec 24, 2022
Trading Gym is an open source project for the development of reinforcement learning algorithms in the context of trading.

Trading Gym Trading Gym is an open-source project for the development of reinforcement learning algorithms in the context of trading. It is currently

Dimitry Foures 535 Nov 15, 2022
A Python library for common tasks on 3D point clouds

Point Cloud Utils (pcu) - A Python library for common tasks on 3D point clouds Point Cloud Utils (pcu) is a utility library providing the following fu

Francis Williams 622 Dec 27, 2022
Bu repo SAHI uygulamasını mantığını öğreniyoruz.

SAHI-Learn: SAHI'den Beraber Kodlamak İster Misiniz Herkese merhabalar ben Kadir Nar. SAHI kütüphanesine gönüllü geliştiriciyim. Bu repo SAHI kütüphan

Kadir Nar 11 Aug 22, 2022
Sparse-dense operators implementation for Paddle

Sparse-dense operators implementation for Paddle This module implements coo, csc and csr matrix formats and their inter-ops with dense matrices. Feel

北海若 3 Dec 17, 2022
RMNA: A Neighbor Aggregation-Based Knowledge Graph Representation Learning Model Using Rule Mining

RMNA: A Neighbor Aggregation-Based Knowledge Graph Representation Learning Model Using Rule Mining Our code is based on Learning Attention-based Embed

宋朝都 4 Aug 07, 2022