git《Commonsense Knowledge Base Completion with Structural and Semantic Context》(AAAI 2020) GitHub: [fig1]

Overview

Commonsense Knowledge Base Completion with Structural and Semantic Context

Code for the paper Commonsense Knowledge Base Completion with Structural and Semantic Context.

Bibtex

@article{malaviya2020commonsense,
  title={Commonsense Knowledge Base Completion with Structural and Semantic Context},
  author={Malaviya, Chaitanya and Bhagavatula, Chandra and Bosselut, Antoine and Choi, Yejin},
  journal={Proceedings of the 34th AAAI Conference on Artificial Intelligence},
  year={2020}
}

Requirements

  • PyTorch
  • Run pip install -r requirements.txt to install the required packages.

Dataset

The ATOMIC dataset used in this paper is available here and the ConceptNet graph is available here. For convenience, both the pre-processed version of ATOMIC and ConceptNet used in the experiments are provided at this link.

Note: The ATOMIC dataset was pre-processed to canonicalize person references and remove punctuations (described in preprocess_atomic.py.

Note: The original evaluation sets provided in the ConceptNet dataset contain correct as well as incorrect tuples for evaluating binary classification accuracy. valid.txt in data/conceptnet is the concatenation of the correct tuples from the two development sets provided in the original dataset while test.txt is the set of correct tuples from the original test set.

Training

To train a model, run the following command:

python -u src/run_kbc_subgraph.py --dataset conceptnet --evaluate-every 10 --n-layers 2 --graph-batch-size 60000 --sim_relations --bert_concat

This trains the model and saves the model under the saved_models directory.

Language Model Fine-tuning

In this work, we use representations from a BERT model fine-tuned to the language of the nodes in the knowledge graph.

The script to fine-tune BERT as a language model on the two knowledge graphs is present in the lm_finetuning/ directory. For example, here is a command to fine-tune BERT as a language model on ConceptNet:

python lm_finetuning/simple_lm_finetuning.py --train_corpus {CONCEPTNET_TRAIN_CORPUS} --bert_model bert-large-uncased --output_dir {OUTPUT_DIR}

Pre-Trained Models

We provide the fine-tuned BERT models and pre-computed BERT embeddings for both ConceptNet and ATOMIC at this link. If you unzip the downloaded file in the root directory of the repository, the training script will load the embeddings.

We also provide the pre-trained KB completion models for both datasets for ease of use. Link to Conceptnet model and ATOMIC model.

Evaluation

To evaluate a trained model, and get predictions, provide the model path to the --load_model argument and use the --eval_only argument. For example, to evaluate the pre-trained ConceptNet model provided above, use the following command:

CUDA_VISIBLE_DEVICES={GPU_ID} python src/run_kbc_subgraph.py --dataset conceptnet --sim_relations --bert_concat --use_bias --load_model {PATH_TO_PRETRAINED_MODEL} --eval_only --write_results

This will load the pre-trained model, and evaluate it on the validation and test set. The predictions are saved to ./topk_results.json.

Similarly, to evaluate the trained model on ATOMIC, use the following command:

CUDA_VISIBLE_DEVICES={GPU_ID} python src/run_kbc_subgraph.py --dataset atomic --sim_relations --use_bias --load_model {PATH_TO_PRETRAINED_MODEL} --eval_only --write_results

Please email me at [email protected] for any questions or comments.

Implementation of Neural Distance Embeddings for Biological Sequences (NeuroSEED) in PyTorch

Neural Distance Embeddings for Biological Sequences Official implementation of Neural Distance Embeddings for Biological Sequences (NeuroSEED) in PyTo

Gabriele Corso 56 Dec 23, 2022
Code for the paper "Balancing Training for Multilingual Neural Machine Translation, ACL 2020"

Balancing Training for Multilingual Neural Machine Translation Implementation of the paper Balancing Training for Multilingual Neural Machine Translat

Xinyi Wang 21 May 18, 2022
Code for our ICASSP 2021 paper: SA-Net: Shuffle Attention for Deep Convolutional Neural Networks

SA-Net: Shuffle Attention for Deep Convolutional Neural Networks (paper) By Qing-Long Zhang and Yu-Bin Yang [State Key Laboratory for Novel Software T

Qing-Long Zhang 199 Jan 08, 2023
Data pipelines for both TensorFlow and PyTorch!

rapidnlp-datasets Data pipelines for both TensorFlow and PyTorch ! If you want to load public datasets, try: tensorflow/datasets huggingface/datasets

1 Dec 08, 2021
A font family with a great monospaced variant for programmers.

Fantasque Sans Mono A programming font, designed with functionality in mind, and with some wibbly-wobbly handwriting-like fuzziness that makes it unas

Jany Belluz 6.3k Jan 08, 2023
Speech recognition tool to convert audio to text transcripts, for Linux and Raspberry Pi.

Spchcat Speech recognition tool to convert audio to text transcripts, for Linux and Raspberry Pi. Description spchcat is a command-line tool that read

Pete Warden 279 Jan 03, 2023
This repository includes code of my study about Asynchronous in Frequency domain of GAN images.

Exploring the Asynchronous of the Frequency Spectra of GAN-generated Facial Images Binh M. Le & Simon S. Woo, "Exploring the Asynchronous of the Frequ

4 Aug 06, 2022
Simple data balancing baselines for worst-group-accuracy benchmarks.

BalancingGroups Code to replicate the experimental results from Simple data balancing baselines achieve competitive worst-group-accuracy. Replicating

Meta Research 29 Dec 02, 2022
Code implementing "Improving Deep Learning Interpretability by Saliency Guided Training"

Saliency Guided Training Code implementing "Improving Deep Learning Interpretability by Saliency Guided Training" by Aya Abdelsalam Ismail, Hector Cor

8 Sep 22, 2022
A pytorch-based real-time segmentation model for autonomous driving

CFPNet: Channel-Wise Feature Pyramid for Real-Time Semantic Segmentation This project contains the Pytorch implementation for the proposed CFPNet: pap

342 Dec 22, 2022
Label Hallucination for Few-Shot Classification

Label Hallucination for Few-Shot Classification This repo covers the implementation of the following paper: Label Hallucination for Few-Shot Classific

Yiren Jian 13 Nov 13, 2022
Very Deep Convolutional Networks for Large-Scale Image Recognition

pytorch-vgg Some scripts to convert the VGG-16 and VGG-19 models [1] from Caffe to PyTorch. The converted models can be used with the PyTorch model zo

Justin Johnson 217 Dec 05, 2022
HSC4D: Human-centered 4D Scene Capture in Large-scale Indoor-outdoor Space Using Wearable IMUs and LiDAR. CVPR 2022

HSC4D: Human-centered 4D Scene Capture in Large-scale Indoor-outdoor Space Using Wearable IMUs and LiDAR. CVPR 2022 [Project page | Video] Getting sta

51 Nov 29, 2022
👨‍💻 run nanosaur in simulation with Gazebo/Ingnition

🦕 👨‍💻 nanosaur_gazebo nanosaur The smallest NVIDIA Jetson dinosaur robot, open-source, fully 3D printable, based on ROS2 & Isaac ROS. Designed & ma

nanosaur 9 Jul 19, 2022
PRTR: Pose Recognition with Cascade Transformers

PRTR: Pose Recognition with Cascade Transformers Introduction This repository is the official implementation for Pose Recognition with Cascade Transfo

mlpc-ucsd 133 Dec 30, 2022
A different spin on dataclasses.

dataklasses Dataklasses is a library that allows you to quickly define data classes using Python type hints. Here's an example of how you use it: from

David Beazley 752 Nov 18, 2022
🙄 Difficult algorithm, Simple code.

🎉TensorFlow2.0-Examples🎉! "Talk is cheap, show me the code." ----- Linus Torvalds Created by YunYang1994 This tutorial was designed for easily divin

1.7k Dec 25, 2022
Differentiable Wavetable Synthesis

Differentiable Wavetable Synthesis

4 Feb 11, 2022
Node Editor Plug for Blender

NodeEditor Blender的程序化建模插件 Show Current 基本框架:自定义的tree-node-socket、tree中的node与socket采用字典查询、基于socket入度的拓扑排序 数据传递和处理依靠Tree中的字典,socket传递字典key TODO 增加更多的节点

Cuimi 11 Dec 03, 2022
Implementation of the "PSTNet: Point Spatio-Temporal Convolution on Point Cloud Sequences" paper.

PSTNet: Point Spatio-Temporal Convolution on Point Cloud Sequences Introduction Point cloud sequences are irregular and unordered in the spatial dimen

Hehe Fan 63 Dec 09, 2022