ATLOP

Code for the AAAI 2021 paper Document-Level Relation Extraction with Adaptive Thresholding and Localized Context Pooling.

If you make use of this code in your work, please kindly cite the following paper:

@inproceedings{zhou2021atlop,
	title={Document-Level Relation Extraction with Adaptive Thresholding and Localized Context Pooling},
	author={Zhou, Wenxuan and Huang, Kevin and Ma, Tengyu and Huang, Jing},
	booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
	year={2021}
}

Requirements

Python (tested on 3.7.4)
CUDA (tested on 10.2)
PyTorch (tested on 1.7.0)
Transformers (tested on 3.4.0)
numpy (tested on 1.19.4)
apex (tested on 0.1)
opt-einsum (tested on 3.3.0)
wandb
ujson
tqdm
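
Before training, it can help to confirm that the packages above are importable. The following is a minimal sketch, not part of the repository: the helper name `find_missing` is hypothetical, and the import names (for example `torch` for PyTorch and `opt_einsum` for opt-einsum) are assumed from the usual conventions of those packages. It checks presence only, not the tested versions.

```python
import importlib.util

# Assumed import names for the packages listed under Requirements.
REQUIRED_MODULES = [
    "torch", "transformers", "numpy", "apex",
    "opt_einsum", "wandb", "ujson", "tqdm",
]


def find_missing(modules):
    """Return the top-level module names that cannot be located in this environment."""
    missing = []
    for name in modules:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```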

Dataset

The DocRED dataset can be downloaded following the instructions at link. The CDR and GDA datasets can be obtained following the instructions in edge-oriented graph. The expected structure of files is:

ATLOP
 |-- dataset
 |    |-- docred
 |    |    |-- train_annotated.json        
 |    |    |-- train_distant.json
 |    |    |-- dev.json
 |    |    |-- test.json
 |    |-- cdr
 |    |    |-- train_filter.data
 |    |    |-- dev_filter.data
 |    |    |-- test_filter.data
 |    |-- gda
 |    |    |-- train.data
 |    |    |-- dev.data
 |    |    |-- test.data
 |-- meta
 |    |-- rel2id.json
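
The layout above can be verified before launching a run. This is a small sketch under the assumption that it is executed with the repository root as `root`; the function name `missing_files` is hypothetical and does not exist in the repository. The file list is copied from the tree shown above.

```python
from pathlib import Path

# Relative paths taken from the expected structure of files.
EXPECTED_FILES = [
    "dataset/docred/train_annotated.json",
    "dataset/docred/train_distant.json",
    "dataset/docred/dev.json",
    "dataset/docred/test.json",
    "dataset/cdr/train_filter.data",
    "dataset/cdr/dev_filter.data",
    "dataset/cdr/test_filter.data",
    "dataset/gda/train.data",
    "dataset/gda/dev.data",
    "dataset/gda/test.data",
    "meta/rel2id.json",
]


def missing_files(root, expected=EXPECTED_FILES):
    """Return the expected relative paths that are not regular files under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    for rel in missing_files("."):
        print("missing:", rel)
```

If only one benchmark is needed, pass a shorter list as `expected` so that the other datasets are not reported as missing.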

Training and Evaluation

DocRED

Train the BERT or RoBERTa model on DocRED with one of the following commands:

>> sh scripts/run_bert.sh  # for BERT
>> sh scripts/run_roberta.sh  # for RoBERTa

The training loss and evaluation results on the dev set are synced to the wandb dashboard.

The program will generate a test file result.json in the official evaluation format. You can compress and submit it to CodaLab for the official test score.
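
The compression step can be done with the Python standard library. The sketch below makes two assumptions that are not stated above: that the archive should be a zip file named result.zip, and that result.json should sit at the root of the archive. Check the submission instructions of the evaluation server before relying on either. The function name `compress_result` is hypothetical.

```python
import zipfile
from pathlib import Path


def compress_result(result_path="result.json", archive_path="result.zip"):
    """Write `result_path` into a new zip archive and return the archive path."""
    result_path = Path(result_path)
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # Store the file under its bare name so it sits at the archive root.
        zf.write(result_path, arcname=result_path.name)
    return Path(archive_path)


if __name__ == "__main__":
    if Path("result.json").is_file():
        print("wrote", compress_result())
    else:
        print("result.json not found; run the test prediction first")
```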

CDR and GDA

Train the CDR and GDA models with the following commands:

>> sh scripts/run_cdr.sh  # for CDR
>> sh scripts/run_gda.sh  # for GDA

The training loss and evaluation results on the dev and test sets are synced to the wandb dashboard.

Saving and Evaluating Models

You can save the model by setting the --save_path argument before training. The model that corresponds to the best dev results will be saved. After that, you can evaluate the saved model by setting the --load_path argument; the code will then skip training and evaluate the saved model on the benchmarks. The trained atlop-bert-base and atlop-roberta models have also been released.
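
The "keep only the best dev checkpoint" behavior can be illustrated with a short sketch. This is not the repository's code: the class `BestCheckpointSaver` and the function `load_state` are hypothetical, and `pickle` stands in for whatever serialization the training script actually uses, so that the example runs without PyTorch.

```python
import pickle


class BestCheckpointSaver:
    """Hypothetical helper: writes the model state only when the dev score improves."""

    def __init__(self, save_path):
        self.save_path = save_path
        self.best_score = float("-inf")

    def update(self, dev_score, state):
        """Save `state` if `dev_score` beats the best so far; return True if saved."""
        if self.save_path and dev_score > self.best_score:
            self.best_score = dev_score
            with open(self.save_path, "wb") as f:
                pickle.dump(state, f)
            return True
        return False


def load_state(load_path):
    """Read back a state written by BestCheckpointSaver, e.g. for evaluation only."""
    with open(load_path, "rb") as f:
        return pickle.load(f)
```

In this sketch, calling `update` once per evaluation round leaves the file at `save_path` holding the state from the round with the highest dev score, which is what a later evaluation-only run would load.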
