Automatic Idiomatic Expression Detection

Related tags

Deep LearningDISC
Overview

IDentifier of Idiomatic Expressions via Semantic Compatibility (DISC)

An Idiomatic identifier that detects the presence and span of idiomatic expression in a given sentence.

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. License
  5. Contact
  6. Acknowledgements

About The Project

This project is a supervised idiomatic expression identification method. Given a sentence that contains a potentially idiomatic expression (PIE), the model identifies the span of the PIE if it is indeed used in an idiomatic sense, otherwise, the model does not identify the PIE. The identification is done via checking the smemantic compatibility. More details will be updated here (Detail description, figures, etc.).

The paper will appear in TACL.

Built With

This model is heavily relying the resources/libraries list as following:

Getting Started

The implementation here includes processed data created for MAGPIE random-split dataset. The model checkpoint that trained with MAGPIE random-split is also provided.

Prerequisites

All the dependencies for this project is listed in requirements.txt. You can install them via a standard command:

pip install -r requirements.txt

It is highly recommanded to start a conda environment with PyTorch properly installed based on your hardward before install the other requirements.

Checkpoint

To run the model with a pre-trained checkpoint, please first create a ./checkpoints folder at root. Then, please download the checkpoint from Google Drive via this Link. Please put the checkpoint in the ./checkpoints folder.

Usage

Configuration

Before running the demo or experiments (training or testing), please see the config.py which sets the configuration of the model. Some parameters there, such as MODE needs to be set appropriately for the model to run correctly. Please see comments for more details.

Demo

To start, please go through the examples provided in demo.ipynb. In there, we process a given input sentence into the model input data and then run model inference to extract the idiomatic expression (if present) from the input sentence (visualized).

Data processing

To process a dataset (such as MAGPIE) for model training and testing, please refer to ./data_processing/MAGPIE/read_comp_data_processing.ipynb. It takes a dataset with sententences and their PIE lcoations as input and generate all the necessary files for model training and inference.

Training and Testing

For training and testing, please refer to train.py and test.py. Note that test.py is used to produce evaluation scores as shown in the paper. inference.py is used to produce prediction for sentences.

License

Distributed under the MIT License. See LICENSE for more information.

Contact

Ziheng Zeng - [email protected]

Project Link: https://github.com/your_username/repo_name

Acknowledgements

[TODO]:

Add the following in README:

  • Method detail descrption
  • Method figure
  • Demo walkthrough
  • Data processing tips and instructions Add requirements.txt
Model search is a framework that implements AutoML algorithms for model architecture search at scale

Model search (MS) is a framework that implements AutoML algorithms for model architecture search at scale. It aims to help researchers speed up their exploration process for finding the right model a

Google 3.2k Dec 31, 2022
GraphRNN: Generating Realistic Graphs with Deep Auto-regressive Models

GraphRNN: Generating Realistic Graphs with Deep Auto-regressive Model This repository is the official PyTorch implementation of GraphRNN, a graph gene

Jiaxuan 568 Dec 29, 2022
particle tracking model, works with the ROMS output file(qck.nc, his.nc)

particle-tracking-model-for-ROMS particle tracking model, works with the ROMS output file(qck.nc, his.nc) description this is a 2-dimensional particle

xusheng 1 Jan 11, 2022
Source code for the ACL-IJCNLP 2021 paper entitled "T-DNA: Taming Pre-trained Language Models with N-gram Representations for Low-Resource Domain Adaptation" by Shizhe Diao et al.

T-DNA Source code for the ACL-IJCNLP 2021 paper entitled Taming Pre-trained Language Models with N-gram Representations for Low-Resource Domain Adapta

shizhediao 17 Dec 22, 2022
[NeurIPS 2021] Better Safe Than Sorry: Preventing Delusive Adversaries with Adversarial Training

Better Safe Than Sorry: Preventing Delusive Adversaries with Adversarial Training Code for NeurIPS 2021 paper "Better Safe Than Sorry: Preventing Delu

Lue Tao 29 Sep 20, 2022
Sequential Model-based Algorithm Configuration

SMAC v3 Project Copyright (C) 2016-2018 AutoML Group Attention: This package is a reimplementation of the original SMAC tool (see reference below). Ho

AutoML-Freiburg-Hannover 778 Jan 05, 2023
An end-to-end image translation model with weight-map for color constancy

CCUnet An end-to-end image translation model with weight-map for color constancy 1. Download the dataset (take Colorchecker_recommended dataset as an

Jianhui Qiu 1 Dec 21, 2021
Official PyTorch Implementation for "Recurrent Video Deblurring with Blur-Invariant Motion Estimation and Pixel Volumes"

PVDNet: Recurrent Video Deblurring with Blur-Invariant Motion Estimation and Pixel Volumes This repository contains the official PyTorch implementatio

Junyong Lee 98 Nov 06, 2022
This respository includes implementations on Manifoldron: Direct Space Partition via Manifold Discovery

Manifoldron: Direct Space Partition via Manifold Discovery This respository includes implementations on Manifoldron: Direct Space Partition via Manifo

dayang_wang 4 Apr 28, 2022
Implementation of "Deep Implicit Templates for 3D Shape Representation"

Deep Implicit Templates for 3D Shape Representation Zerong Zheng, Tao Yu, Qionghai Dai, Yebin Liu. arXiv 2020. This repository is an implementation fo

Zerong Zheng 144 Dec 07, 2022
Virtual hand gesture mouse using a webcam

NonMouse 日本語のREADMEはこちら This is an application that allows you to use your hand itself as a mouse. The program uses a web camera to recognize your han

Yuki Takeyama 55 Jan 01, 2023
RoboDesk A Multi-Task Reinforcement Learning Benchmark

RoboDesk A Multi-Task Reinforcement Learning Benchmark If you find this open source release useful, please reference in your paper: @misc{kannan2021ro

Google Research 66 Oct 07, 2022
Implementation of: "Exploring Randomly Wired Neural Networks for Image Recognition"

RandWireNN Unofficial PyTorch Implementation of: Exploring Randomly Wired Neural Networks for Image Recognition. Results Validation result on Imagenet

Seung-won Park 684 Nov 02, 2022
50-days-of-Statistics-for-Data-Science - This repository consist of a 50-day program

50-days-of-Statistics-for-Data-Science - This repository consist of a 50-day program. All the statistics required for the complete understanding of data science will be uploaded in this repository.

komal_lamba 22 Dec 09, 2022
A Large-Scale Dataset for Spinal Vertebrae Segmentation in Computed Tomography

A Large-Scale Dataset for Spinal Vertebrae Segmentation in Computed Tomography

ICT.MIRACLE lab 75 Dec 26, 2022
CRNN With PyTorch

CRNN-PyTorch Implementation of https://arxiv.org/abs/1507.05717

Vadim 4 Sep 01, 2022
Official repository for the paper "Going Beyond Linear Transformers with Recurrent Fast Weight Programmers"

Recurrent Fast Weight Programmers This is the official repository containing the code we used to produce the experimental results reported in the pape

IDSIA 36 Nov 15, 2022
A pure PyTorch batched computation implementation of "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition"

A pure PyTorch batched computation implementation of "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition"

張致強 14 Dec 02, 2022
Personalized Federated Learning using Pytorch (pFedMe)

Personalized Federated Learning with Moreau Envelopes (NeurIPS 2020) This repository implements all experiments in the paper Personalized Federated Le

Charlie Dinh 226 Dec 30, 2022
This is the repository for The Machine Learning Workshops, published by AI DOJO

This is the repository for The Machine Learning Workshops, published by AI DOJO. It contains all the workshop's code with supporting project files necessary to work through the code.

AI Dojo 12 May 06, 2022