An Open-Source Package for Neural Relation Extraction (NRE)

Last update: Jan 03, 2023

Related tags

Overview

OpenNRE

We have a DEMO website (http://opennre.thunlp.ai/). Try it out!

OpenNRE is an open-source and extensible toolkit that provides a unified framework to implement relation extraction models. This package is designed for the following groups:

New to relation extraction: We have hand-by-hand tutorials and detailed documents that can not only enable you to use relation extraction tools, but also help you better understand the research progress in this field.
Developers: Our easy-to-use interface and high-performance implementation can acclerate your deployment in the real-world applications. Besides, we provide several pretrained models which can be put into production without any training.
Researchers: With our modular design, various task settings and metric tools, you can easily carry out experiments on your own models with only minor modification. We have also provided several most-used benchmarks for different settings of relation extraction.
Anyone who need to submit an NLP homework to impress their professors: With state-of-the-art models, our package can definitely help you stand out among your classmates!

This package is mainly contributed by Tianyu Gao, Xu Han, Shulian Cao, Lumin Tang, Yankai Lin, Zhiyuan Liu

What is Relation Extraction

Relation extraction is a natural language processing (NLP) task aiming at extracting relations (e.g., founder of) between entities (e.g., Bill Gates and Microsoft). For example, from the sentence Bill Gates founded Microsoft, we can extract the relation triple (Bill Gates, founder of, Microsoft).

Relation extraction is a crucial technique in automatic knowledge graph construction. By using relation extraction, we can accumulatively extract new relation facts and expand the knowledge graph, which, as a way for machines to understand the human world, has many downstream applications like question answering, recommender system and search engine.

How to Cite

A good research work is always accompanied by a thorough and faithful reference. If you use or extend our work, please cite the following paper:

@inproceedings{han-etal-2019-opennre,
    title = "{O}pen{NRE}: An Open and Extensible Toolkit for Neural Relation Extraction",
    author = "Han, Xu and Gao, Tianyu and Yao, Yuan and Ye, Deming and Liu, Zhiyuan and Sun, Maosong",
    booktitle = "Proceedings of EMNLP-IJCNLP: System Demonstrations",
    year = "2019",
    url = "https://www.aclweb.org/anthology/D19-3029",
    doi = "10.18653/v1/D19-3029",
    pages = "169--174"
}

It's our honor to help you better explore relation extraction with our OpenNRE toolkit!

Papers and Document

If you want to learn more about neural relation extraction, visit another project of ours (NREPapers).

You can refer to our document for more details about this project.

Install

Install as A Python Package

We are now working on deploy OpenNRE as a Python package. Coming soon!

Using Git Repository

Clone the repository from our github page (don't forget to star us!)

git clone https://github.com/thunlp/OpenNRE.git

If it is too slow, you can try

git clone https://github.com/thunlp/OpenNRE.git --depth 1

Then install all the requirements:

pip install -r requirements.txt

Note: Please choose appropriate PyTorch version based on your machine (related to your CUDA version). For details, refer to https://pytorch.org/.

Then install the package with

python setup.py install

If you also want to modify the code, run this:

python setup.py develop

Note that we have excluded all data and pretrain files for fast deployment. You can manually download them by running scripts in the benchmark and pretrain folders. For example, if you want to download FewRel dataset, you can run

bash benchmark/download_fewrel.sh

Easy Start

Make sure you have installed OpenNRE as instructed above. Then import our package and load pre-trained models.

>>> import opennre
>>> model = opennre.get_model('wiki80_cnn_softmax')

Note that it may take a few minutes to download checkpoint and data for the first time. Then use infer to do sentence-level relation extraction

>>> model.infer({'text': 'He was the son of Máel Dúin mac Máele Fithrich, and grandson of the high king Áed Uaridnach (died 612).', 'h': {'pos': (18, 46)}, 't': {'pos': (78, 91)}})
('father', 0.5108704566955566)

You will get the relation result and its confidence score.

For now, we have the following available models:

wiki80_cnn_softmax: trained on wiki80 dataset with a CNN encoder.
wiki80_bert_softmax: trained on wiki80 dataset with a BERT encoder.
wiki80_bertentity_softmax: trained on wiki80 dataset with a BERT encoder (using entity representation concatenation).
tacred_bert_softmax: trained on TACRED dataset with a BERT encoder.
tacred_bertentity_softmax: trained on TACRED dataset with a BERT encoder (using entity representation concatenation).

Training

You can train your own models on your own data with OpenNRE. In example folder we give example training codes for supervised RE models and bag-level RE models. You can either use our provided datasets or your own datasets.

Google Group

If you want to receive our update news or take part in discussions, please join our Google Group

An Open-Source Package for Neural Relation Extraction (NRE)

Related tags

Overview

OpenNRE

What is Relation Extraction

How to Cite

Papers and Document

Install

Install as A Python Package

Using Git Repository

Easy Start

Training

Google Group

Owner

THUNLP

Blender addon - Scrub timeline from viewport with a shortcut

Universal Adversarial Triggers for Attacking and Analyzing NLP (EMNLP 2019)

Poetry PEP 517 Build Backend & Core Utilities

This repository implements a brute-force spellchecker utilizing the Damerau-Levenshtein edit distance.

Research code for "What to Pre-Train on? Efficient Intermediate Task Selection", EMNLP 2021

Some embedding layer implementation using ivy library

SEJE is a prototype for the paper Learning Text-Image Joint Embedding for Efficient Cross-Modal Retrieval with Deep Feature Engineering.

a CTF web challenge about making screenshots

中文問句產生器；使用台達電閱讀理解資料集(DRCD)

TalkNet: Audio-visual active speaker detection Model

SentAugment is a data augmentation technique for semi-supervised learning in NLP.

Code for producing Japanese GPT-2 provided by rinna Co., Ltd.

Implementation of COCO-LM, Correcting and Contrasting Text Sequences for Language Model Pretraining, in Pytorch

STT for TorchScript is a port of Coqui STT based on DeepSpeech to PyTorch.

This program do translate english words to portuguese

Easy-to-use CPM for Chinese text generation

NLP-Project - Used an API to scrape 2000 reddit posts, then used NLP analysis and created a classification model to mixed succcess

BPEmb is a collection of pre-trained subword embeddings in 275 languages, based on Byte-Pair Encoding (BPE) and trained on Wikipedia.

PyTorch implementation of the NIPS-17 paper "Poincaré Embeddings for Learning Hierarchical Representations"

使用Mask LM预训练任务来预训练Bert模型。训练垂直领域语料的模型表征，提升下游任务的表现。