RuleBERT: Teaching Soft Rules to Pre-Trained Language Models

Related tags

Deep LearningRuleBert
Overview

RuleBERT: Teaching Soft Rules to Pre-Trained Language Models

(Paper) (Slides) (Video)

RuleBERT reasons over Natural Language

RuleBERT is a pre-trained language model that has been fine-tuned on soft logical results. This repo contains the required code for running the experiments of the associated paper.

Installation

0. Clone Repo

git clone https://github.com/MhmdSaiid/RuleBert
cd RuleBERT

1. Create virtual env and install reqs

(optional) virtualenv -m python RuleBERT
pip install -r requirements.txt

2. Download Data

The datasets can be found here. (DISCLAIMER: ~25 GB on disk)

You can also run:

bash download_datasets.sh

Run Experiments

When an experiemnt is complete, the model, the tokenizer, and the results are stored in models/**timestamp**.

i) Single Rules

bash experiments/single_rules/SR.sh data/single_rules 

ii) Rule Union Experiment

bash experiments/union_rules/UR.sh data/union_rules 

iii) Rule Chain Experiment

bash experiments/chain_rules/CR.sh data/chain_rules 

iv) External Datasets

Generate Your Own Data

You can generate your own data for a single rule, a union of rules sharing the same rule head, or a chain of rules.

First, make sure you are in the correct directory.

cd data_generation

1) Single Rule

There are two ways to data for a single rule:

i) Pass Data through Arguments

python DataGeneration.py 
       --rule 'spouse(A,B) :- child(A,B).' 
       --pool_list "[['Anne', 'Bob', 'Charlie'],
                    ['Frank', 'Gary', 'Paul']]" 
       --rule_support 0.67
  • --rule : The rule in string format. Consult here to see how to write a rule.
  • --pool_list : For every variable in the rule, we include a list of possible instantiations.
  • --rule_support : A float representing the rule support. If not specified, rule defaults to a hard rule.
  • --max_num_facts : Maximum number of facts in a generated theory.
  • --num : Total number of theories per generated (rule,facts).
  • --TWL : When called, we use three-way-logic instead of negation as failure. Unsatisifed predicates are no longer considered False.
  • --complementary_rules : A string of complementary rules to add.
  • --p_bar : Boolean to show a progress bar. Deafults to True.

ii) Pass a JSON file

This is more convenient for when rules are long or when there are multiple rules. The JSON file specifies the rule(s), pool list(s), and rule support(s). It is passed as an argument.

python DataGeneration.py --rule_json r1.jsonl

2) Union of Rules

For a union of rules sharing the same rule-head predicate, we pass a JSON file to the command that contaains rules with overlapping rule-head predicates.

python DataGeneration.py --rule_json Multi_rule.json 
                         --type union

--type is used to indicate which type of data generation method should be set to. For a union of rules, we use --type union. If --type single is used, we do single-rule data generation for each rule in the file.

3) Chained Rules

For a chain of rules, the json file should include rules that could be chained together.

python DataGeneration.py --rule_json chain_rules.json 
                         --type chain

The chain depth defaults to 5 --chain_depth 5.

Train your Own Model

To fine-tune the model, run:

# train
python trainer.py --data-dir data/R1/
                  --epochs 3
                  --verbose

When complete, the model and tokenizer are saved in models/**timestamp**.

To test the model, run:

# test
python tester.py --test_data_dir data/test_R1/
                 --model_dir models/**timestamp**
                 --verbose

A JSON file will be saved in model_dir containing the results.

Contact Us

For any inquiries, feel free to contact us, or raise an issue on Github.

Reference

You can cite our work:

@inproceedings{saeed-etal-2021-rulebert,
    title = "{R}ule{BERT}: Teaching Soft Rules to Pre-Trained Language Models",
    author = "Saeed, Mohammed  and
      Ahmadi, Naser  and
      Nakov, Preslav  and
      Papotti, Paolo",
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2021",
    address = "Online and Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.emnlp-main.110",
    pages = "1460--1476",
    abstract = "While pre-trained language models (PLMs) are the go-to solution to tackle many natural language processing problems, they are still very limited in their ability to capture and to use common-sense knowledge. In fact, even if information is available in the form of approximate (soft) logical rules, it is not clear how to transfer it to a PLM in order to improve its performance for deductive reasoning tasks. Here, we aim to bridge this gap by teaching PLMs how to reason with soft Horn rules. We introduce a classification task where, given facts and soft rules, the PLM should return a prediction with a probability for a given hypothesis. We release the first dataset for this task, and we propose a revised loss function that enables the PLM to learn how to predict precise probabilities for the task. Our evaluation results show that the resulting fine-tuned models achieve very high performance, even on logical rules that were unseen at training. Moreover, we demonstrate that logical notions expressed by the rules are transferred to the fine-tuned model, yielding state-of-the-art results on external datasets.",
}

License

MIT

Owner
“If a machine is expected to be infallible, it cannot also be intelligent.” ― Alan Turing
Language-Agnostic Website Embedding and Classification

Homepage2Vec Language-Agnostic Website Embedding and Classification based on Curlie labels https://arxiv.org/pdf/2201.03677.pdf Homepage2Vec is a pre-

25 Dec 27, 2022
This repository contains the code and models for the following paper.

DC-ShadowNet Introduction This is an implementation of the following paper DC-ShadowNet: Single-Image Hard and Soft Shadow Removal Using Unsupervised

AuAgCu 65 Dec 27, 2022
RETRO-pytorch - Implementation of RETRO, Deepmind's Retrieval based Attention net, in Pytorch

RETRO - Pytorch (wip) Implementation of RETRO, Deepmind's Retrieval based Attent

Phil Wang 556 Jan 04, 2023
Learning cell communication from spatial graphs of cells

ncem Features Repository for the manuscript Fischer, D. S., Schaar, A. C. and Theis, F. Learning cell communication from spatial graphs of cells. 2021

Theis Lab 77 Dec 30, 2022
This repository provides a PyTorch implementation and model weights for HCSC (Hierarchical Contrastive Selective Coding)

HCSC: Hierarchical Contrastive Selective Coding This repository provides a PyTorch implementation and model weights for HCSC (Hierarchical Contrastive

YUANFAN GUO 111 Dec 20, 2022
Simple cross-platform application for DaVinci surgical video frame annotation

About DaVid is a simple cross-platform GUI for annotating robotic and endoscopic surgical actions for use in deep-learning research. Features Simple a

Cyril Zakka 4 Oct 09, 2021
EMNLP 2021 Findings' paper, SCICAP: Generating Captions for Scientific Figures

SCICAP: Scientific Figures Dataset This is the Github repo of the EMNLP 2021 Findings' paper, SCICAP: Generating Captions for Scientific Figures (Hsu

Edward 26 Nov 21, 2022
A paper using optimal transport to solve the graph matching problem.

GOAT A paper using optimal transport to solve the graph matching problem. https://arxiv.org/abs/2111.05366 Repo structure .github: Files specifying ho

neurodata 8 Jan 04, 2023
PyTorch implementation of a Real-ESRGAN model trained on custom dataset

Real-ESRGAN PyTorch implementation of a Real-ESRGAN model trained on custom dataset. This model shows better results on faces compared to the original

Sber AI 160 Jan 04, 2023
DyNet: The Dynamic Neural Network Toolkit

The Dynamic Neural Network Toolkit General Installation C++ Python Getting Started Citing Releases and Contributing General DyNet is a neural network

Chris Dyer's lab @ LTI/CMU 3.3k Jan 06, 2023
library for nonlinear optimization, wrapping many algorithms for global and local, constrained or unconstrained, optimization

NLopt is a library for nonlinear local and global optimization, for functions with and without gradient information. It is designed as a simple, unifi

Steven G. Johnson 1.4k Dec 25, 2022
This repo is a C++ version of yolov5_deepsort_tensorrt. Packing all C++ programs into .so files, using Python script to call C++ programs further.

yolov5_deepsort_tensorrt_cpp Introduction This repo is a C++ version of yolov5_deepsort_tensorrt. And packing all C++ programs into .so files, using P

41 Dec 27, 2022
(NeurIPS 2020) Wasserstein Distances for Stereo Disparity Estimation

Wasserstein Distances for Stereo Disparity Estimation Accepted in NeurIPS 2020 as Spotlight. [Project Page] Wasserstein Distances for Stereo Disparity

Divyansh Garg 92 Dec 12, 2022
The code used for the free [email protected] Webinar series on Reinforcement Learning in Finance

Reinforcement Learning in Finance [email protected] Webinar This repository provides the code f

Yves Hilpisch 62 Dec 22, 2022
NVIDIA Deep Learning Examples for Tensor Cores

NVIDIA Deep Learning Examples for Tensor Cores Introduction This repository provides State-of-the-Art Deep Learning examples that are easy to train an

NVIDIA Corporation 10k Dec 31, 2022
Bunch of different tools which helps visualizing and annotating images for semantic/instance segmentation tasks

Data Framework for Semantic/Instance Segmentation Bunch of different tools which helps visualizing, transforming and annotating images for semantic/in

Bruno Fernandes Carvalho 5 Dec 21, 2022
“Data Augmentation for Cross-Domain Named Entity Recognition” (EMNLP 2021)

Data Augmentation for Cross-Domain Named Entity Recognition Authors: Shuguang Chen, Gustavo Aguilar, Leonardo Neves and Thamar Solorio This repository

<a href=[email protected]"> 18 Sep 10, 2022
EGNN - Implementation of E(n)-Equivariant Graph Neural Networks, in Pytorch

EGNN - Pytorch Implementation of E(n)-Equivariant Graph Neural Networks, in Pytorch. May be eventually used for Alphafold2 replication. This

Phil Wang 259 Jan 04, 2023
Optimized code based on M2 for faster image captioning training

Transformer Captioning This repository contains the code for Transformer-based image captioning. Based on meshed-memory-transformer, we further optimi

lyricpoem 16 Dec 16, 2022
Project looking into use of autoencoder for semi-supervised learning and comparing data requirements compared to supervised learning.

Project looking into use of autoencoder for semi-supervised learning and comparing data requirements compared to supervised learning.

Tom-R.T.Kvalvaag 2 Dec 17, 2021