Code for Generating Disentangled Arguments with Prompts: A Simple Event Extraction Framework that Works

Overview

GDAP

Code for Generating Disentangled Arguments with Prompts: A Simple Event Extraction Framework that Works

Environment

  • Python (verified: v3.8)
  • CUDA (verified: v11.1)
  • Packages (see requirements.txt)

Usage

Preprocessing

We follow dygiepp for data preprocessing.

  • text2et: Event Type Detection
  • ettext2tri: Trigger Extraction
  • etrttext2role: Argument Extraction
# data processed by dyieapp
data/text2target/dyiepp_ace1005_ettext2tri_subtype
├── event.schema 
├── test.json
├── train.json
└── val.json

# data processed by  data_convert.convert_text_to_target
data/text2target/dyiepp_ace1005_ettext2tri_subtype
├── event.schema
├── test.json
├── train.json
└── val.json

Useful commands:

python -m data_convert.convert_text_to_target # data/raw_data -> data/text2target
python convert_dyiepp_to_sentence.py data/raw_data/dyiepp_ace2005 # doc -> sentence, used in evaluation

Training

Relevant scripts:

  • run_seq2seq.py: Python code entry, modified from the transformers/examples/seq2seq/run_seq2seq.py
  • run_seq2seq_span.bash: Model training script logging to the log file.

Example (see the above two files for more details):

# ace05 event type detection t5-base, the metric_format use eval_trigger-F1 
bash run_seq2seq_span.bash --data=dyiepp_ace2005_text2et_subtype --model=t5-base --format=et --metric_format=eval_trigger-F1

# ace05 tri extraction t5-base
bash run_seq2seq_span.bash --data=dyiepp_ace2005_ettext2tri_subtype --model=t5-base --format=tri --metric_format=eval_trigger-F1

# ace05 argument extraction t5-base
bash run_seq2seq_span.bash --data=dyiepp_ace2005_etrttext2role_subtype --model=t5-base --format=role --metric_format=eval_role-F1

Trained models are saved in the models/ folder.

Evaluation

  • run_tri_predict.bash: trigger extraction evaluation and inference script.
  • run_arg_predict.bash: argument extraction evaluation and inference script.

Todo

We aim to expand the codebase for a wider range of tasks, including

  • Name Entity Recognition
  • Keyword Generation
  • Event Relation Identification

If you find this repo helpful...

Please give us a and cite our paper as

@misc{si2021-GDAP,
      title={Generating Disentangled Arguments with Prompts: A Simple Event Extraction Framework that Works}, 
      author={Jinghui Si and Xutan Peng and Chen Li and Haotian Xu and Jianxin Li},
      year={2021},
      eprint={2110.04525},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

This project borrows code from Text2Event

Machine learning notebooks in different subjects optimized to run in google collaboratory

Notebooks Name Description Category Link Training pix2pix This notebook shows a simple pipeline for training pix2pix on a simple dataset. Most of the

Zaid Alyafeai 363 Dec 06, 2022
Google Landmark Recogntion and Retrieval 2021 Solutions

Google Landmark Recogntion and Retrieval 2021 Solutions In this repository you can find solution and code for Google Landmark Recognition 2021 and Goo

Vadim Timakin 5 Nov 25, 2022
Car Parking Tracker Using OpenCv

Car Parking Vacancy Tracker Using OpenCv I used basic image processing methods i

Adwait Kelkar 30 Dec 03, 2022
Official repository for Natural Image Matting via Guided Contextual Attention

GCA-Matting: Natural Image Matting via Guided Contextual Attention The source codes and models of Natural Image Matting via Guided Contextual Attentio

Li Yaoyi 349 Dec 26, 2022
Generative Flow Networks for Discrete Probabilistic Modeling

Energy-based GFlowNets Code for Generative Flow Networks for Discrete Probabilistic Modeling by Dinghuai Zhang, Nikolay Malkin, Zhen Liu, Alexandra Vo

Narsil-Dinghuai Zhang 51 Dec 20, 2022
Python suite to construct benchmark machine learning datasets from the MIMIC-III clinical database.

MIMIC-III Benchmarks Python suite to construct benchmark machine learning datasets from the MIMIC-III clinical database. Currently, the benchmark data

Chengxi Zang 6 Jan 02, 2023
Code repo for "RBSRICNN: Raw Burst Super-Resolution through Iterative Convolutional Neural Network" (Machine Learning and the Physical Sciences workshop in NeurIPS 2021).

RBSRICNN: Raw Burst Super-Resolution through Iterative Convolutional Neural Network An official PyTorch implementation of the RBSRICNN network as desc

Rao Muhammad Umer 6 Nov 14, 2022
Efficient Deep Learning Systems course

Efficient Deep Learning Systems This repository contains materials for the Efficient Deep Learning Systems course taught at the Faculty of Computer Sc

Max Ryabinin 173 Dec 29, 2022
This repository contains answers of the Shopify Summer 2022 Data Science Intern Challenge.

Data-Science-Intern-Challenge This repository contains answers of the Shopify Summer 2022 Data Science Intern Challenge. Summer 2022 Data Science Inte

1 Jan 11, 2022
Image restoration with neural networks but without learning.

Warning! The optimization may not converge on some GPUs. We've personally experienced issues on Tesla V100 and P40 GPUs. When running the code, make s

Dmitry Ulyanov 7.4k Jan 01, 2023
Co-GAIL: Learning Diverse Strategies for Human-Robot Collaboration

CoGAIL Table of Content Overview Installation Dataset Training Evaluation Trained Checkpoints Acknowledgement Citations License Overview This reposito

Jeremy Wang 29 Dec 24, 2022
Source code for "UniRE: A Unified Label Space for Entity Relation Extraction.", ACL2021.

UniRE Source code for "UniRE: A Unified Label Space for Entity Relation Extraction.", ACL2021. Requirements python: 3.7.6 pytorch: 1.8.1 transformers:

Wang Yijun 109 Nov 29, 2022
Codebase for "Revisiting spatio-temporal layouts for compositional action recognition" (Oral at BMVC 2021).

Revisiting spatio-temporal layouts for compositional action recognition Codebase for "Revisiting spatio-temporal layouts for compositional action reco

Gorjan 20 Dec 15, 2022
The official code for PRIMER: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization

PRIMER The official code for PRIMER: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization. PRIMER is a pre-trained model for mu

AI2 111 Dec 18, 2022
An implementation of IMLE-Net: An Interpretable Multi-level Multi-channel Model for ECG Classification

IMLE-Net: An Interpretable Multi-level Multi-channel Model for ECG Classification The repostiory consists of the code, results and data set links for

12 Dec 26, 2022
[CVPR 2022] Official Pytorch code for OW-DETR: Open-world Detection Transformer

OW-DETR: Open-world Detection Transformer (CVPR 2022) [Paper] Akshita Gupta*, Sanath Narayan*, K J Joseph, Salman Khan, Fahad Shahbaz Khan, Mubarak Sh

Akshita Gupta 127 Dec 27, 2022
Repository for "Toward Practical Monocular Indoor Depth Estimation" (CVPR 2022)

Toward Practical Monocular Indoor Depth Estimation Cho-Ying Wu, Jialiang Wang, Michael Hall, Ulrich Neumann, Shuochen Su [arXiv] [project site] DistDe

Meta Research 122 Dec 13, 2022
Code repository for the paper "Tracking People with 3D Representations"

Tracking People with 3D Representations Code repository for the paper "Tracking People with 3D Representations" (paper link) (project site). Jathushan

Jathushan Rajasegaran 77 Dec 03, 2022
NAVER BoostCamp Final Project

CV 14조 final project Super Resolution and Deblur module Inference code & Pretrained weight Repo SwinIR Deblur 실행 방법 streamlit run WebServer/Server_SRD

JiSeong Kim 5 Sep 06, 2022
Code and data for "Broaden the Vision: Geo-Diverse Visual Commonsense Reasoning" (EMNLP 2021).

GD-VCR Code for Broaden the Vision: Geo-Diverse Visual Commonsense Reasoning (EMNLP 2021). Research Questions and Aims: How well can a model perform o

Da Yin 24 Oct 13, 2022