[EMNLP 2021] MuVER: Improving First-Stage Entity Retrieval with Multi-View Entity Representations

Overview

MuVER

This repo contains the code and pre-trained model for our EMNLP 2021 paper:
MuVER: Improving First-Stage Entity Retrieval with Multi-View Entity Representations. Xinyin Ma, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Weiming Lu

Quick Start

1. Requirements

The requirements for our code are listed in requirements.txt. Install the packages with the following command:

pip install -r requirements.txt

For huggingface/transformers, we tested our code under versions 4.1.X and 4.2.X.
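
If your environment already ships a different transformers release, one way to stay inside the tested range is to pin it explicitly. This is only a minimal sketch; apply it only if it does not conflict with the pins in requirements.txt:

pip install "transformers>=4.1.0,<4.3.0"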

2. Download data and model

3. Use the released model to reproduce our results

  • Without View Merging:
export PYTHONPATH='.'
CUDA_VISIBLE_DEVICES=YOUR_GPU_DEVICES python muver/multi_view/train.py \
    --pretrained_model path_to_model/bert-base \
    --bi_ckpt_path path_to_model/best_zeshel.bin \
    --max_cand_len 40 \
    --max_seq_len 128 \
    --do_test \
    --test_mode test \
    --data_parallel \
    --eval_batch_size 16 \
    --accumulate_score

Expected Result:

World              R@1     R@2     R@4     R@8     R@16    R@32    R@50    R@64
forgotten_realms   0.6208  0.7783  0.8592  0.8983  0.9342  0.9533  0.9633  0.9700
lego               0.4904  0.6714  0.7690  0.8357  0.8791  0.9091  0.9208  0.9249
star_trek          0.4743  0.6130  0.6967  0.7606  0.8159  0.8581  0.8805  0.8919
yugioh             0.3432  0.4861  0.6040  0.7004  0.7596  0.8201  0.8512  0.8672
total              0.4496  0.5970  0.6936  0.7658  0.8187  0.8628  0.8854  0.8969
  • With View Merging:
export PYTHONPATH='.'
CUDA_VISIBLE_DEVICES=YOUR_GPU_DEVICES python muver/multi_view/train.py \
    --pretrained_model path_to_model/bert-base \
    --bi_ckpt_path path_to_model/best_zeshel.bin \
    --max_cand_len 40 \
    --max_seq_len 128 \
    --do_test \
    --test_mode test \
    --data_parallel \
    --eval_batch_size 16 \
    --accumulate_score \
    --view_expansion \
    --merge_layers 4 \
    --top_k 0.4

Expected Result:

World              R@1     R@2     R@4     R@8     R@16    R@32    R@50    R@64
forgotten_realms   0.6175  0.7867  0.8733  0.9150  0.9375  0.9600  0.9675  0.9708
lego               0.5046  0.6889  0.7882  0.8449  0.8882  0.9183  0.9324  0.9374
star_trek          0.4810  0.6253  0.7121  0.7783  0.8271  0.8706  0.8935  0.9030
yugioh             0.3444  0.5027  0.6322  0.7300  0.7902  0.8429  0.8690  0.8826
total              0.4541  0.6109  0.7136  0.7864  0.8352  0.8777  0.8988  0.9084

Optional Arguments:

  • --data_parallel: use multiple GPUs.
  • --accumulate_score: accumulate the score for each entity. This yields a higher score but takes much longer at inference time.
  • --view_expansion: merge and expand views.
  • --top_k: the top_k view pairs are merged in each layer.
  • --merge_layers: the number of layers for merging.
  • --test_mode: to generate candidates for the train or dev set, set test_mode to train or dev; the candidate outputs will be generated and saved under the directory that contains the test model (see the example below).
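
For example, a sketch of generating candidates for the training set with the released checkpoint (same paths as in the commands above; adjust them to your setup):

export PYTHONPATH='.'
CUDA_VISIBLE_DEVICES=YOUR_GPU_DEVICES python muver/multi_view/train.py \
    --pretrained_model path_to_model/bert-base \
    --bi_ckpt_path path_to_model/best_zeshel.bin \
    --max_cand_len 40 \
    --max_seq_len 128 \
    --do_test \
    --test_mode train \
    --data_parallel \
    --eval_batch_size 16 \
    --accumulate_score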

4. How to train your MuVER

We provide the code to train your own MuVER. Train the model with the following command:

export PYTHONPATH='.'
CUDA_VISIBLE_DEVICES=YOUR_GPU_DEVICES python muver/multi_view/train.py \
    --pretrained_model path_to_model/bert-base \
    --epoch 30 \
    --train_batch_size 128 \
    --learning_rate 1e-5 \
    --do_train --do_eval \
    --data_parallel \
    --name distributed_multi_view

Important: Since contrastive learning relies heavily on a large batch size, we trained our model on eight V100 (16GB) GPUs, as reported in our paper. The hyperparameters for our best model are listed in logs/zeshel_hyper_param.txt.

The code creates a directory runtime_log to save the log, the model, and the hyperparameters you used. Every time you train your model (with or without grid search), it creates a directory under runtime_log/name_in_your_args/start_time, e.g., runtime_log/distributed_multi_view/2021-09-07-15-12-21, to store all the checkpoints, the curves for visualization, and the training log.
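
Once training finishes, the saved checkpoint can be evaluated with the test command from Step 3 by pointing --bi_ckpt_path at it. A minimal sketch (the checkpoint file name below is hypothetical; use the actual name written to your run directory):

export PYTHONPATH='.'
# NOTE: best_model.bin is a placeholder; substitute the checkpoint name saved in your run directory.
CUDA_VISIBLE_DEVICES=YOUR_GPU_DEVICES python muver/multi_view/train.py \
    --pretrained_model path_to_model/bert-base \
    --bi_ckpt_path runtime_log/distributed_multi_view/2021-09-07-15-12-21/best_model.bin \
    --max_cand_len 40 \
    --max_seq_len 128 \
    --do_test \
    --test_mode test \
    --data_parallel \
    --eval_batch_size 16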
