Code To Tune or Not To Tune? Zero-shot Models for Legal Case Entailment.

Related tags

Deep Learningcoliee
Overview

COLIEE 2021 - task 2: Legal Case Entailment

This repository contains the code to reproduce NeuralMind's submissions to COLIEE 2021 presented in the paper To Tune or Not To Tune? Zero-shot Models for Legal Case Entailment. There has been mounting evidence that pretrained language models fine-tuned on large and diverse supervised datasets can transfer well to a variety of out-of-domain tasks. In this work, we investigate this transfer ability to the legal domain. For that, we participated in the legal case entailment task of COLIEE 2021, in which we use such models with no adaptations to the target domain. Our submissions achieved the highest scores, surpassing the second-best submission by more than six percentage points. Our experiments confirm a counter-intuitive result in the new paradigm of pretrained language models: that given limited labeled data, models with little or no adaption to the target task can be more robust to changes in the data distribution and perform better on held-out datasets than models fine-tuned on it.

Models

monoT5-zero-shot: We use a model T5 Large fine-tuned on MS MARCO, a dataset of approximately 530k query and relevant passage pairs. We use a checkpoint available at Huggingface’smodel hub that was trained with a learning rate of 10−3 using batches of 128 examples for 10k steps, or approximately one epoch of the MS MARCO dataset. In each batch, a roughly equal number of positive and negative examples are sampled.

monoT5: We further fine-tune monoT5-zero-shot on the COLIEE 2020 training set following a similar training procedure described for monoT5-zero-shot. The model is fine-tuned with a learning rate of 10−3 for 80 steps using batches of size 128, which corresponds to 20 epochs. Each batch has the same number of positive and negative examples.

DeBERTa: Decoding-enhanced BERT with disentangled attention(DeBERTa) improves on the original BERT and RoBERTa architectures by introducing two techniques: the disentangled attention mechanism and an enhanced mask decoder. Both improvements seek to introduce positional information to the pretraining procedure, both in terms of the absolute position of a token and the relative position between them. We fine-tune DeBERTa on the COLIEE 2020 training set following a similar training procedure described for monoT5.

DebertaT5 (Ensemble): We use the following method to combine the predictions of monoT5 and DeBERTa (both fine-tuned on COLIEE 2020 dataset): We concatenate the final set of paragraphs selected by each model and remove duplicates, preserving the highest score. It is important to note that our method does not combine scores between models. The final answer for each test example is composed of individual answers from one or both models. It ensures that only answers with a certain degree of confidence are maintained, which generally leads to an increase in precision.

Results

Model Train data Evaluation F1 Description
Median of submissions Coliee 58.60
Coliee 2nd best team Coliee 62.74
DeBERTa (ours) Coliee Coliee 63.39 Single model
monoT5 (ours) Coliee Coliee 66.10 Single model
monoT5-zero-shot (ours) MS Marco Coliee 68.72 Single model
DebertaT5 (ours) Coliee Coliee 69.12 Ensemble

In this table, we present the results. Our main finding is that our zero-shot model achieved the best result of a single model on 2021 test data, outperforming DeBERTa and monoT5, which were fine-tuned on the COLIEE dataset. As far as we know, this is the first time that a zero-shot model outperforms fine-tuned models in the task of legal case entailment. Given limited annotated data for fine-tuning and a held-out test data, such as the COLIEE dataset, our results suggest that a zero-shot model fine-tuned on a large out-of-domain dataset may be more robust to changes in data distribution and may generalize better on unseen data than models fine-tuned on a small domain-specific dataset. Moreover, our ensemble method effectively combines DeBERTa and monoT5 predictions,achieving the best score among all submissions (row 6). It is important to note that despite the performance of DebertaT5 being the best in the COLIEE competition, the ensemble method requires training time, computational resources and perhaps also data augmentation to perform well on the task, while monoT5-zero-shot does not need any adaptation. The model is available online and ready to use.

Conclusion

Based on those results, we question the common assumption that it is necessary to have labeled training data on the target domain to perform well on a task. Our results suggest that fine-tuning on a large labeled dataset may be enough.

How do I get the dataset?

Those who wish to use previous COLIEE data for a trial, please contact rabelo(at)ualberta.ca.

How do I evaluate?

As our best model is a zero-shot one, we provide only the evaluation script.

References

[1] Document Ranking with a Pretrained Sequence-to-Sequence Model

[2] DeBERTa: Decoding-enhanced BERT with Disentangled Attention

[3] ICAIL '21: Proceedings of the Eighteenth International Conference on Artificial Intelligence and Law

[4] Proceedings of the Eigth International Competition on Legal Information Extraction/Entailment

How do I cite this work?

 @article{to_tune,
    title={To Tune or Not To Tune? Zero-shot Models for Legal Case Entailment},
    author={Moraes, Guilherme and Rodrigues, Ruan and Lotufo, Roberto and Nogueira, Rodrigo},
    journal={ICAIL '21: Proceedings of the Eighteenth International Conference on Artificial Intelligence and Law June 2021 Pages 295–300},
    url={https://dl.acm.org/doi/10.1145/3462757.3466103},
    year={2021}
}
Owner
NeuralMind
Deep Learning for NLP and image processing
NeuralMind
AirLoop: Lifelong Loop Closure Detection

AirLoop This repo contains the source code for paper: Dasong Gao, Chen Wang, Sebastian Scherer. "AirLoop: Lifelong Loop Closure Detection." arXiv prep

Chen Wang 53 Jan 03, 2023
The GitHub repository for the paper: “Time Series is a Special Sequence: Forecasting with Sample Convolution and Interaction“.

SCINet This is the original PyTorch implementation of the following work: Time Series is a Special Sequence: Forecasting with Sample Convolution and I

386 Jan 01, 2023
Implementation of CVPR 2021 paper "Spatially-invariant Style-codes Controlled Makeup Transfer"

SCGAN Implementation of CVPR 2021 paper "Spatially-invariant Style-codes Controlled Makeup Transfer" Prepare The pre-trained model is avaiable at http

118 Dec 12, 2022
FFCV: Fast Forward Computer Vision (and other ML workloads!)

Fast Forward Computer Vision: train models at a fraction of the cost with accele

FFCV 2.3k Jan 03, 2023
Official Repository of NeurIPS2021 paper: PTR

PTR: A Benchmark for Part-based Conceptual, Relational, and Physical Reasoning Figure 1. Dataset Overview. Introduction A critical aspect of human vis

Yining Hong 32 Jun 02, 2022
MinkLoc++: Lidar and Monocular Image Fusion for Place Recognition

MinkLoc++: Lidar and Monocular Image Fusion for Place Recognition Paper: MinkLoc++: Lidar and Monocular Image Fusion for Place Recognition accepted fo

64 Dec 18, 2022
PyTorch - Python + Nim

Master Release Pytorch - Py + Nim A Nim frontend for pytorch, aiming to be mostly auto-generated and internally using ATen. Because Nim compiles to C+

Giovanni Petrantoni 425 Dec 22, 2022
This is the code of "Multi-view Contrastive Graph Clustering" in NeurlPS 2021.

MCGC Description This is the code of "Multi-view Contrastive Graph Clustering" in NeurlPS 2021. Datasets Results ACM DBLP IMDB Amazon photos Amazon co

31 Nov 14, 2022
BEGAN in PyTorch

BEGAN in PyTorch This project is still in progress. If you are looking for the working code, use BEGAN-tensorflow. Requirements Python 2.7 Pillow tqdm

Taehoon Kim 260 Dec 07, 2022
Companion code for the paper "An Infinite-Feature Extension for Bayesian ReLU Nets That Fixes Their Asymptotic Overconfidence" (NeurIPS 2021)

ReLU-GP Residual (RGPR) This repository contains code for reproducing the following NeurIPS 2021 paper: @inproceedings{kristiadi2021infinite, title=

Agustinus Kristiadi 4 Dec 26, 2021
A Streamlit demo demonstrating the Deep Dream technique. Adapted from the TensorFlow Deep Dream tutorial.

Streamlit Demo: Deep Dream A Streamlit demo demonstrating the Deep Dream technique. Adapted from the TensorFlow Deep Dream tutorial How to run this de

Streamlit 11 Dec 12, 2022
ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration

ROSITA News & Updates (24/08/2021) Release the demo to perform fine-grained semantic alignments using the pretrained ROSITA model. (15/08/2021) Releas

Vision and Language Group@ MIL 48 Dec 23, 2022
COVINS -- A Framework for Collaborative Visual-Inertial SLAM and Multi-Agent 3D Mapping

COVINS -- A Framework for Collaborative Visual-Inertial SLAM and Multi-Agent 3D Mapping Version 1.0 COVINS is an accurate, scalable, and versatile vis

ETHZ V4RL 183 Dec 27, 2022
AOT-GAN for High-Resolution Image Inpainting (codebase for image inpainting)

AOT-GAN for High-Resolution Image Inpainting Arxiv Paper | AOT-GAN: Aggregated Contextual Transformations for High-Resolution Image Inpainting Yanhong

Multimedia Research 214 Jan 03, 2023
Automatically Build Multiple ML Models with a Single Line of Code. Created by Ram Seshadri. Collaborators Welcome. Permission Granted upon Request.

Auto-ViML Automatically Build Variant Interpretable ML models fast! Auto_ViML is pronounced "auto vimal" (autovimal logo created by Sanket Ghanmare) N

AutoViz and Auto_ViML 397 Dec 30, 2022
Code for BMVC2021 paper "Boundary Guided Context Aggregation for Semantic Segmentation"

Boundary-Guided-Context-Aggregation Boundary Guided Context Aggregation for Semantic Segmentation Haoxiang Ma, Hongyu Yang, Di Huang In BMVC'2021 Pape

Haoxiang Ma 31 Jan 08, 2023
Cmsc11 arcade - Final Project for CMSC11

cmsc11_arcade Final Project for CMSC11 Developers: Limson, Mark Vincent Peñafiel

Gregory 1 Jan 18, 2022
ICCV2021 Oral SA-ConvONet: Sign-Agnostic Optimization of Convolutional Occupancy Networks

Sign-Agnostic Convolutional Occupancy Networks Paper | Supplementary | Video | Teaser Video | Project Page This repository contains the implementation

64 Jan 05, 2023
PyVideoAI: Action Recognition Framework

This reposity contains official implementation of: Capturing Temporal Information in a Single Frame: Channel Sampling Strategies for Action Recognitio

Kiyoon Kim 22 Dec 29, 2022
This repository contains the implementation of the paper: Federated Distillation of Natural Language Understanding with Confident Sinkhorns

Federated Distillation of Natural Language Understanding with Confident Sinkhorns This repository provides an alternative method for ensembled distill

Deep Cognition and Language Research (DeCLaRe) Lab 11 Nov 16, 2022