Code for the paper "Query Embedding on Hyper-relational Knowledge Graphs"

Related tags

Deep LearningStarQE
Overview

Query Embedding on Hyper-Relational Knowledge Graphs

This repository contains the code used for the experiments in the paper

Query Embedding on Hyper-Relational Knowledge Graphs.
Dimitrios Alivanistos and Max Berrendorf and Michael Cochez and Mikhail Galkin

If you encounter any problems, or have suggestions on how to improve this code, open an issue.

Abstract: Multi-hop logical reasoning is an established problem in the field of representation learning on knowledge graphs (KGs). It subsumes both one-hop link prediction as well as other more complex types of logical queries. Existing algorithms operate only on classical, triple-based graphs, whereas modern KGs often employ a hyper-relational modeling paradigm. In this paradigm, typed edges may have several key-value pairs known as qualifiers that provide fine-grained context for facts. In queries, this context modifies the meaning of relations, and usually reduces the answer set. Hyper-relational queries are often observed in real-world KG applications, and existing approaches for approximate query answering cannot make use of qualifier pairs. In this work, we bridge this gap and extend the multi-hop reasoning problem to hyper-relational KGs allowing to tackle this new type of complex queries. Building upon recent advancements in Graph Neural Networks and query embedding techniques, we study how to embed and answer hyper-relational conjunctive queries. Besides that, we propose a method to answer such queries and demonstrate in our experiments that qualifiers improve query answering on a diverse set of query patterns.

Requirements

We developed our repository using Python 3.8.5. Other version may also work.

First, please ensure that you have properly installed

in your environment. Running experiments is possible on both CPU and GPU. On a GPU, the training should go noticeably faster. If you are using GPU, please make sure that the installed versions match your CUDA version.

We recommend the use of virtual environments, be it virtualenv or conda.

Now, clone the repository and install other dependencies using pip. After moving to the root of the repo (and with your virtual env activated) type:

pip install .

If you want to change code, we suggest to use the editable mode of the pip installation:

pip install -e .

To log results, we suggest using wandb. Instructions on installation and setting up can be found here: https://docs.wandb.ai/quickstart

Running test (optional)

You can run the tests by installing the test dependencies

pip install -e '.[test]'

and then executing them

pytest

Both from the root of the project.

It is normal that you see some skipped tests.

Running experiments

The easiest way to start experiments is via the command line interface. The command line also provides more information on the options available for each command. You can show the help it by typing

hqe --help

into a terminal within your active python environment. Some IDEs, e.g. PyCharm, require you to start from a file if you want to enable the debugger. To this end, we also provide a thin wrapper in executables, which you can start by

python executables/main.py

Downloading the data

To run experiments, we offer the preprocessed queries for download. It is also possible to run the preprocessing steps yourself, cf. the data preprocessing README, using the following command

hqe preprocess skip-and-download-binary

Training a model

There are many options are available for model training. For an overview of options, run

hqe train --help

Some examples:


Train with default settings, using 10000 reified 1hop queries with a qualifier and use 5000 reified triples from the validation set. Details on how to specify the amount of samples can be found in [src/mphrqe/data/loader.Sample](the Sample class). Note that the data loading is taking care of only using data from the correct data split.

hqe train \
    -tr /1hop/1qual:atmost10000:reify \
    -va /1hop/1qual:5000:reify

Train with the same data, but with custom parameters for the model. The example below uses target pooling to get the embedding of the query graph, uses a dropout of 0.5 in the layers, uses cosine similarity instead of the dot product to compute similarity when ranking answers to the query, and enables wandb for logging the metrics. Finally, the trained model is stored as a file training-example-model.pt which then be used in the evaluation.

hqe train \
    -tr /1hop/1qual:atmost10000:reify \
    -va /1hop/1qual:5000:reify \
    --graph-pooling TargetPooling \
    --dropout 0.5 \
    --similarity CosineSimilarity \
    --use-wandb --wandb-name "training-example" \
    --save \
    --model-path "training-example-model.pt"

By default, the model path is relative to the current working directory. Providing an absolute path to a different directory can change that.

Performing hyper parameter optimization

To find optimal parameters for a dataset, one can run a hyperparameter optimization. Under the hood this is using the optuna framework.

All options for the hyperparameter optimization can be seen with

hqe optimize --help

Some examples:


Run hyper-parameter optimization. This will result in a set of runs with different hyper-parameters from which the user can pick the best.

hqe optimize \
    -tr "/1hop/1qual-per-triple:*" \
    -tr "/2i/1qual-per-triple:atmost40000" \
    -tr "/2hop/1qual-per-triple:40000" \
    -tr "/3hop/1qual-per-triple:40000" \
    -tr "/3i/1qual-per-triple:40000" \
    -va "/1hop/1qual-per-triple:atmost3500" \
    -va "/2i/1qual-per-triple:atmost3500" \
    -va "/2hop/1qual-per-triple:atmost3500" \
    -va "/3hop/1qual-per-triple:atmost3500" \
    -va "/3i/1qual-per-triple:atmost3500" \
    --use-wandb \
    --wandb-name "hpo-query2box-style"

Evaluating model performance

To evaluate a model's performance on the test set, we provide an example below:

hqe evaluate \
    --test-data "/1hop/1qual:5000:reify" \
    --use-wandb \
    --wandb-name "test-example" \
    --model-path "training-example-model.pt"

Citation

If you find this work useful, please consider citing

@misc{alivanistos2021query,
      title={Query Embedding on Hyper-relational Knowledge Graphs}, 
      author={Dimitrios Alivanistos and Max Berrendorf and Michael Cochez and Mikhail Galkin},
      year={2021},
      eprint={2106.08166},
      archivePrefix={arXiv},
      primaryClass={cs.AI}
}
You might also like...
Code for the paper Learning the Predictability of the Future

Learning the Predictability of the Future Code from the paper Learning the Predictability of the Future. Website of the project in hyperfuture.cs.colu

PyTorch code for the paper: FeatMatch: Feature-Based Augmentation for Semi-Supervised Learning
PyTorch code for the paper: FeatMatch: Feature-Based Augmentation for Semi-Supervised Learning

FeatMatch: Feature-Based Augmentation for Semi-Supervised Learning This is the PyTorch implementation of our paper: FeatMatch: Feature-Based Augmentat

Code for the paper A Theoretical Analysis of the Repetition Problem in Text Generation
Code for the paper A Theoretical Analysis of the Repetition Problem in Text Generation

A Theoretical Analysis of the Repetition Problem in Text Generation This repository share the code for the paper "A Theoretical Analysis of the Repeti

Code for our ICASSP 2021 paper: SA-Net: Shuffle Attention for Deep Convolutional Neural Networks
Code for our ICASSP 2021 paper: SA-Net: Shuffle Attention for Deep Convolutional Neural Networks

SA-Net: Shuffle Attention for Deep Convolutional Neural Networks (paper) By Qing-Long Zhang and Yu-Bin Yang [State Key Laboratory for Novel Software T

Open source repository for the code accompanying the paper 'Non-Rigid Neural Radiance Fields Reconstruction and Novel View Synthesis of a Deforming Scene from Monocular Video'.
Open source repository for the code accompanying the paper 'Non-Rigid Neural Radiance Fields Reconstruction and Novel View Synthesis of a Deforming Scene from Monocular Video'.

Non-Rigid Neural Radiance Fields This is the official repository for the project "Non-Rigid Neural Radiance Fields: Reconstruction and Novel View Synt

Code for the Shortformer model, from the paper by Ofir Press, Noah A. Smith and Mike Lewis.

Shortformer This repository contains the code and the final checkpoint of the Shortformer model. This file explains how to run our experiments on the

PyTorch code for ICLR 2021 paper Unbiased Teacher for Semi-Supervised Object Detection
PyTorch code for ICLR 2021 paper Unbiased Teacher for Semi-Supervised Object Detection

Unbiased Teacher for Semi-Supervised Object Detection This is the PyTorch implementation of our paper: Unbiased Teacher for Semi-Supervised Object Detection

Official code for paper "Optimization for Oriented Object Detection via Representation Invariance Loss".

Optimization for Oriented Object Detection via Representation Invariance Loss By Qi Ming, Zhiqiang Zhou, Lingjuan Miao, Xue Yang, and Yunpeng Dong. Th

Code for our CVPR 2021 paper
Code for our CVPR 2021 paper "MetaCam+DSCE"

Joint Noise-Tolerant Learning and Meta Camera Shift Adaptation for Unsupervised Person Re-Identification (CVPR'21) Introduction Code for our CVPR 2021

Comments
  • bug in SPARQL for 1hop-2i/0qual

    bug in SPARQL for 1hop-2i/0qual

    It looks like the SPARQL is not executable. should line 37 in test/validation and line 22 in train: FILTER ((?s1 != ?o2_s0) || (?s1 = ?o2_s0 && str(?p0)< str(?1) )) be FILTER ((?s1 != ?o2_s0) || (?s1 = ?o2_s0 && str(?p0)< str(?p1) )) ?

    opened by Kelaproth 2
Releases(v1.0.0-iclr)
Owner
DimitrisAlivas
Researcher. Data scientist. Passionate about Tech & AI
DimitrisAlivas
PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech

PortaSpeech - PyTorch Implementation PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech. Model Size Module Nor

Keon Lee 279 Jan 04, 2023
Official Repository for the ICCV 2021 paper "PixelSynth: Generating a 3D-Consistent Experience from a Single Image"

PixelSynth: Generating a 3D-Consistent Experience from a Single Image (ICCV 2021) Chris Rockwell, David F. Fouhey, and Justin Johnson [Project Website

Chris Rockwell 95 Nov 22, 2022
Invasive Plant Species Identification

Invasive_Plant_Species_Identification Used LiDAR Odometry and Mapping (LOAM) to create a 3D point cloud map which can be used to identify invasive pla

2 May 12, 2022
Deep Learning Slide Captcha

滑动验证码深度学习识别 本项目使用深度学习 YOLOV3 模型来识别滑动验证码缺口,基于 https://github.com/eriklindernoren/PyTorch-YOLOv3 修改。 只需要几百张缺口标注图片即可训练出精度高的识别模型,识别效果样例: 克隆项目 运行命令: git cl

Python3WebSpider 55 Jan 02, 2023
PyTorch Implementation of NCSOFT's FastPitchFormant: Source-filter based Decomposed Modeling for Speech Synthesis

FastPitchFormant - PyTorch Implementation PyTorch Implementation of FastPitchFormant: Source-filter based Decomposed Modeling for Speech Synthesis. Qu

Keon Lee 63 Jan 02, 2023
PyTorch implementation of Soft-DTW: a Differentiable Loss Function for Time-Series in CUDA

Soft DTW Loss Function for PyTorch in CUDA This is a Pytorch Implementation of Soft-DTW: a Differentiable Loss Function for Time-Series which is batch

Keon Lee 76 Dec 20, 2022
Python parser for DTED data.

DTED Parser This is a package written in pure python (with help from numpy) to parse and investigate Digital Terrain Elevation Data (DTED) files. This

Ben Bonenfant 12 Dec 18, 2022
CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped

CSWin-Transformer This repo is the official implementation of "CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows". Th

Microsoft 409 Jan 06, 2023
EDCNN: Edge enhancement-based Densely Connected Network with Compound Loss for Low-Dose CT Denoising

EDCNN: Edge enhancement-based Densely Connected Network with Compound Loss for Low-Dose CT Denoising By Tengfei Liang, Yi Jin, Yidong Li, Tao Wang. Th

workingcoder 115 Jan 05, 2023
Towards Ultra-Resolution Neural Style Transfer via Thumbnail Instance Normalization

Towards Ultra-Resolution Neural Style Transfer via Thumbnail Instance Normalization Official PyTorch implementation for our URST (Ultra-Resolution Sty

czczup 148 Dec 27, 2022
code for "Self-supervised edge features for improved Graph Neural Network training",

Self-supervised edge features for improved Graph Neural Network training Data availability: Here is a link to the raw data for the organoids dataset.

Neal Ravindra 23 Dec 02, 2022
Code repo for "Cross-Scale Internal Graph Neural Network for Image Super-Resolution" (NeurIPS'20)

IGNN Code repo for "Cross-Scale Internal Graph Neural Network for Image Super-Resolution" [paper] [supp] Prepare datasets 1 Download training dataset

Shangchen Zhou 278 Jan 03, 2023
[NeurIPS 2021] The PyTorch implementation of paper "Self-Supervised Learning Disentangled Group Representation as Feature"

IP-IRM [NeurIPS 2021] The PyTorch implementation of paper "Self-Supervised Learning Disentangled Group Representation as Feature". Codes will be relea

Wang Tan 67 Dec 24, 2022
Package for extracting emotions from social media text. Tailored for financial data.

EmTract: Extracting Emotions from Social Media Text Tailored for Financial Contexts EmTract is a tool that extracts emotions from social media text. I

13 Nov 17, 2022
Reproducing Results from A Hybrid Approach to Targeting Social Assistance

title author date output Reproducing Results from A Hybrid Approach to Targeting Social Assistance Lendie Follett and Heath Henderson 12/28/2021 html_

Lendie Follett 0 Jan 06, 2022
The official implementation for "FQ-ViT: Fully Quantized Vision Transformer without Retraining".

FQ-ViT [arXiv] This repo contains the official implementation of "FQ-ViT: Fully Quantized Vision Transformer without Retraining". Table of Contents In

132 Jan 08, 2023
An official implementation of "Background-Aware Pooling and Noise-Aware Loss for Weakly-Supervised Semantic Segmentation" (CVPR 2021) in PyTorch.

BANA This is the implementation of the paper "Background-Aware Pooling and Noise-Aware Loss for Weakly-Supervised Semantic Segmentation". For more inf

CV Lab @ Yonsei University 59 Dec 12, 2022
Model of an AI powered sign language interpreter.

TEXT AND SPEECH TO SIGN LANGUAGE. A web application which takes in text or live audio speech recording as input, converts and displays the relevant Si

Mark Gatere 4 Mar 30, 2022
DiffQ performs differentiable quantization using pseudo quantization noise. It can automatically tune the number of bits used per weight or group of weights, in order to achieve a given trade-off between model size and accuracy.

Differentiable Model Compression via Pseudo Quantization Noise DiffQ performs differentiable quantization using pseudo quantization noise. It can auto

Facebook Research 145 Dec 30, 2022
This package proposes simplified exporting pytorch models to ONNX and TensorRT, and also gives some base interface for model inference.

PyTorch Infer Utils This package proposes simplified exporting pytorch models to ONNX and TensorRT, and also gives some base interface for model infer

Alex Gorodnitskiy 11 Mar 20, 2022