Source code for 'Noisy-Labeled NER with Confidence Estimation', accepted at NAACL 2021

Last update: Nov 12, 2022

Overview

Kun Liu*, Yao Fu*, Chuanqi Tan, Mosha Chen, Ningyu Zhang, Songfang Huang, Sheng Gao. Noisy-Labeled NER with Confidence Estimation. NAACL 2021. [arxiv]

Requirements

pip install -r requirements.txt

Data

Each dataset file has three columns: the first is the word, the second is the noisy label, and the third is the gold label. For datasets without gold labels, set the third column to the same value as the second. We provide CoNLL 2003 English with recall 0.5 and precision 0.9 in './data/eng_r0.5p0.9'.
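As an illustration of the three-column layout described above, here is a minimal reader sketch. It is not the repository's own loader. It assumes whitespace-separated columns and blank lines between sentences (the usual CoNLL convention), and the function name read_three_column is hypothetical. As a convenience that goes beyond what the format requires, it also accepts two-column lines and copies the noisy label into the gold column.

```python
from typing import Iterable, List, Tuple

Token = Tuple[str, str, str]  # (word, noisy_label, gold_label)


def read_three_column(lines: Iterable[str]) -> List[List[Token]]:
    """Parse 'word noisy_label [gold_label]' lines into a list of sentences."""
    sentences: List[List[Token]] = []
    current: List[Token] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            # Assumption: a blank line ends the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        parts = line.split()
        if len(parts) not in (2, 3):
            raise ValueError(f"expected 2 or 3 columns, got: {raw!r}")
        word, noisy = parts[0], parts[1]
        # Assumption: when the gold column is missing, reuse the noisy label.
        gold = parts[2] if len(parts) == 3 else noisy
        current.append((word, noisy, gold))
    if current:
        sentences.append(current)
    return sentences


# Hypothetical sample: the first sentence has gold labels, the second does not.
sample = """EU O B-ORG
rejects O O
German B-MISC B-MISC

Peter B-PER
Blackburn I-PER
"""
sentences = read_three_column(sample.splitlines())
print(len(sentences), sentences[0][0], sentences[1][0])
```

To read a real file, pass an open file handle instead of sample.splitlines(); the file name inside './data/eng_r0.5p0.9' depends on the repository layout and is not assumed here.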

Confidence Estimation Strategies

Local Strategy

python confidence_estimation_local.py --dataset eng_r0.5p0.9 --embedding_file ${PATH_TO_EMBEDDING} --embedding_dim ${DIM_OF_EMBEDDING} --neg_noise_rate ${NOISE_RATE_OF_NEGATIVES} --pos_noise_rate ${NOISE_RATE_OF_POSITIVES}

For '--neg_noise_rate' and '--pos_noise_rate', set both to -1.0 to use the gold noise rate (experiment 12 in Table 1, En), or set them to other values (e.g., --neg_noise_rate 0.09 --pos_noise_rate 0.14 for experiment 10, En).

Global Strategy

python confidence_estimation_global.py --dataset eng_r0.5p0.9 --embedding_file ${PATH_TO_EMBEDDING} --embedding_dim ${DIM_OF_EMBEDDING} --neg_noise_rate ${NOISE_RATE_OF_NEGATIVES} --pos_noise_rate ${NOISE_RATE_OF_POSITIVES}

For '--neg_noise_rate' and '--pos_noise_rate', set both to -1.0 to use the gold noise rate (experiment 13 in Table 1, En), or set them to other values (e.g., --neg_noise_rate 0.1 --pos_noise_rate 0.13 for experiment 11, En).
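To give a feel for what a gold noise rate could look like, the sketch below compares noisy labels against gold labels at the token level. The definition here is an assumption, not taken from the repository: the negative noise rate is the fraction of tokens with noisy label 'O' whose gold label differs, and the positive noise rate is the fraction of tokens with a non-'O' noisy label whose gold label differs. The scripts may compute these rates differently when -1.0 is passed. The function name token_noise_rates is hypothetical.

```python
from typing import Iterable, Tuple


def token_noise_rates(
    pairs: Iterable[Tuple[str, str]], negative_label: str = "O"
) -> Tuple[float, float]:
    """Return (neg_noise_rate, pos_noise_rate) from (noisy, gold) label pairs."""
    neg_total = neg_wrong = pos_total = pos_wrong = 0
    for noisy, gold in pairs:
        if noisy == negative_label:
            # Token is a negative according to the noisy annotation.
            neg_total += 1
            neg_wrong += int(noisy != gold)
        else:
            # Token is a positive (entity) according to the noisy annotation.
            pos_total += 1
            pos_wrong += int(noisy != gold)
    neg_rate = neg_wrong / neg_total if neg_total else 0.0
    pos_rate = pos_wrong / pos_total if pos_total else 0.0
    return neg_rate, pos_rate


# Hypothetical (noisy, gold) pairs: one missed entity and one spurious entity.
pairs = [
    ("O", "B-ORG"), ("O", "O"), ("O", "O"), ("O", "O"),
    ("B-PER", "B-PER"), ("I-PER", "O"),
]
neg_rate, pos_rate = token_noise_rates(pairs)
print(neg_rate, pos_rate)
```

Under this assumed definition, a low-recall annotation mostly raises the negative rate (missed entities end up labeled 'O'), while a low-precision annotation raises the positive rate.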

Key Implementation

Equation (3) is implemented in ./model/linear_partial_crf_inferencer.py, lines 79-85.

Equation (4) is implemented in ./model/neuralcrf_small_loss_constrain_local.py, line 139.

Equation (5) is implemented in ./confidence_estimation_local.py, lines 74-87 (local strategy), or ./confidence_estimation_global.py, lines 75-85 (global strategy).

Equations (6) and (7) are implemented in ./model/neuralcrf_small_loss_constrain_global.py, lines 188-194 (global strategy), or ./model/neuralcrf_small_loss_constrain_local.py, lines 188-197 (local strategy).

For the global strategy, equation (8) is implemented in ./model/neuralcrf_small_loss_constrain_global.py, lines 195-214, and ./model/linear_partial_crf_inferencer.py, lines 36-48. For the local strategy, equation (8) is implemented in ./model/neuralcrf_small_loss_constrain_local.py, lines 198-215, and ./model/linear_crf_inferencer.py, lines 36-48.
