Bert Axioms

This repository contains the code for the paper "Diagnosing BERT with Retrieval Heuristics".

Required Data

To run this code, first download the dataset described in the TREC 2019 Deep Learning Track guidelines. Specify the paths to the downloaded files in the config file.

You also need a working installation of the Indri Toolkit for indexing and retrieval.

Parameters

Several parameters need to be set (such as the Indri path, the number of candidates to retrieve, and the random seed). Set them in the YAML config file at scripts/config-defaults.yaml. The parameters are handled by wandb, but the code can easily be adapted to any YAML reader (for example, PyYAML).
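If you prefer to read the config without wandb, a minimal sketch with PyYAML could look like the one below. It assumes the file follows the usual wandb config-defaults.yaml layout, where each key maps to a block with a value field and an optional desc field. The helper names (flatten_wandb_config, load_config) and the example keys (indri_path, number_of_candidates, random_seed) are hypothetical and not taken from this repository, so check the actual file for the real key names.

```python
from pathlib import Path
from typing import Any, Dict


def flatten_wandb_config(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Turn wandb-style entries ({"value": ..., "desc": ...}) into plain key -> value pairs."""
    flat: Dict[str, Any] = {}
    for key, entry in raw.items():
        if isinstance(entry, dict) and "value" in entry:
            # wandb layout: keep only the value, drop the description
            flat[key] = entry["value"]
        else:
            # already a plain key: value pair
            flat[key] = entry
    return flat


def load_config(path: str = "scripts/config-defaults.yaml") -> Dict[str, Any]:
    """Read the YAML config with PyYAML and return a flat dictionary."""
    # Imported here so the flattening helper above has no third-party dependency.
    import yaml  # provided by the PyYAML package

    with Path(path).open(encoding="utf-8") as handle:
        raw = yaml.safe_load(handle) or {}
    return flatten_wandb_config(raw)


if __name__ == "__main__":
    # Hypothetical keys, for illustration only.
    example = {
        "indri_path": {"desc": "Path to the Indri installation", "value": "/opt/indri"},
        "number_of_candidates": {"value": 100},
        "random_seed": 42,
    }
    print(flatten_wandb_config(example))
```

Calling load_config() from the repository root would then return a plain dictionary that can be passed around in place of the wandb config object.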

Observations

Note that, for LNC2, we use external C++ code to interact with Indri. This lets us add the duplicated documents to the index without compromising scores. This code should be compiled with Indri's Makefile.app, which should be as easy as editing Makefile.app from Indri and running make -f Makefile.app (see https://lemur.sourceforge.io/indri/ for more details).

Removing documents from the Indri index does not guarantee that the index statistics change immediately. This can cause slight differences from the more "correct" approach of re-creating the index from scratch for every duplicated document.

Expected Results

The results from this repository may not exactly replicate the ones reported in the paper. This is due to a few performance improvements made after the paper submission. These, however, do not change the final scores and conclusions. Mostly, you may see an increase in alpha-nDCG for all methods, and an increase in QL performance across the board.

	`nDCG_cut`	`TFCI`	`TFCII`	`MTDC`	`LNC1`	`LNC2`	`TP`	`STMC1`	`STMC2`	`STMC3`
QL	0.3633	0.9936	0.7008	0.8759	0.5021	1.000	0.3852	0.4855	0.7047	0.7011
DistilBERT	0.4537	0.6109	0.3945	0.5130	0.5006	0.0003	0.4105	0.5040	0.5120	0.5099

Code for ECIR'20 paper Diagnosing BERT with Retrieval Heuristics


Owner

Arthur Câmara
