GNEE - GAT Neural Event Embeddings

Related tags

Deep LearningGNEE
Overview

GNEE - GAT Neural Event Embeddings

This repository contains source code for the GNEE (GAT Neural Event Embeddings) method introduced in the paper: "Semi-Supervised Graph Attention Networks for Event Representation Learning".

Abstract: Event analysis from news and social networks is very useful for a wide range of social studies and real-world applications. Recently, event graphs have been explored to represent event datasets and their complex relationships, where events are vertices connected to other vertices that represent locations, people's names, dates, and various other event metadata. Graph representation learning methods are promising for extracting latent features from event graphs to enable the use of different classification algorithms. However, existing methods fail to meet important requirements for event graphs, such as (i) dealing with semi-supervised graph embedding to take advantage of some labeled events, (ii) automatically determining the importance of the relationships between event vertices and their metadata vertices, as well as (iii) dealing with the graph heterogeneity. In this paper, we present GNEE (GAT Neural Event Embeddings), a method that combines Graph Attention Networks and Graph Regularization. First, an event graph regularization is proposed to ensure that all graph vertices receive event features, thereby mitigating the graph heterogeneity drawback. Second, semi-supervised graph embedding with self-attention mechanism considers existing labeled events, as well as learns the importance of relationships in the event graph during the representation learning process. A statistical analysis of experimental results with five real-world event graphs and six graph embedding methods shows that GNEE obtains state-of-the-art results.

File Structure

Our method consists of a BERT text encoding and a pre-processment procedure followed by modified version of GAT (Veličković et. al - 2017, https://arxiv.org/abs/1710.10903) to the event embedding task.

In our work, we adopt and modify the PyTorch implementation of GAT, pyGAT, developed by Diego999.

.
├── datasets_runs/ -> Datasets used
├── event_graph_utils.py -> Useful functions when working with event datasets
├── layers.py -> Implementation of Graph Attention layers
├── LICENSE
├── main.py -> Execute this script to reproduce our experiments (refer to our paper for more details)
├── models.py -> Implementation of the original GAT model
├── notebooks -> Run these notebooks to reproduce all our experiments.
├── README.md
├── requirements.txt
├── train.py -> Implementation of our preprocessing, traning and testing pipelines
└── utils.py -> Useful functions used in GAT original implementation.

Reproducibility Notebooks

./notebooks
├── DeepWalk_Event_Embeddings.ipynb -> DeepWalk Benchmark
├── GAT_Event_Embeddings_+_Without_Regularization.ipynb -> GAT w/o embeddings benchmark
├── GCN_Event_Embeddings_.ipynb -> GCN Benchmark
├── GNEE_Attention_Matrices_Example.ipynb -> GNEE Attention matrices visualization
├── GNEE_Embedding_Visualization_t_SNE.ipynb -> GNEE Embeddings visualization using t-SNE
├── GNEE.ipynb -> GNEE Benchmark
├── Label_Propagation_Event_Classification.ipynb -> LP Benchmark
├── LINE_Event_Embeddings.ipynb -> LINE Benchmark
├── Node2Vec_Event_Embeddings.ipynb -> Node2Vec Benchmark
├── SDNE_Event_Embeddings.ipynb -> SDNE Benchmark
└── Struct2Vec_Event_Embeddings.ipynb -> Struct2Vec Benchmark

Hardware requirements

When running on "dense" mode (no --sparse flag), our model uses about 18 GB on GRAM. On the other hand, the sparse mode (using --sparse) uses less than 1.5 GB on GRAM, which is an ideal setup to environments such as Google Colab.

Issues/Pull Requests/Feedbacks

Please, contact the authors in case of issues / pull requests / feedbacks :)

Owner
João Pedro Rodrigues Mattos
Undergraduate Research Assistant, sponsored by FAPESP - Machine Learning | Web Development | Human Computer Interface
João Pedro Rodrigues Mattos
This repository consists of Blender python scripts and corresponding assets to generate variants of the CANDLE dataset

candle-simulator This repository consists of Blender python scripts and corresponding assets to generate variants of the IITH-CANDLE dataset. The rend

1 Dec 15, 2021
Light-weight network, depth estimation, knowledge distillation, real-time depth estimation, auxiliary data.

light-weight-depth-estimation Boosting Light-Weight Depth Estimation Via Knowledge Distillation, https://arxiv.org/abs/2105.06143 Junjie Hu, Chenyou F

Junjie Hu 13 Dec 10, 2022
A modified version of DeepMind's Alphafold2 to divide CPU part (MSA and template searching) and GPU part (prediction model)

ParallelFold Author: Bozitao Zhong This is a modified version of DeepMind's Alphafold2 to divide CPU part (MSA and template searching) and GPU part (p

Bozitao Zhong 77 Dec 22, 2022
CS583: Deep Learning

CS583: Deep Learning

Shusen Wang 2.6k Dec 30, 2022
Style-based Point Generator with Adversarial Rendering for Point Cloud Completion (CVPR 2021)

Style-based Point Generator with Adversarial Rendering for Point Cloud Completion (CVPR 2021) An efficient PyTorch library for Point Cloud Completion.

Microsoft 119 Jan 02, 2023
Anonymize BLM Protest Images

Anonymize BLM Protest Images This repository automates @BLMPrivacyBot, a Twitter bot that shows the anonymized images to help keep protesters safe. Us

Stanford Machine Learning Group 40 Oct 13, 2022
Text mining project; Using distilBERT to predict authors in the classification task authorship attribution.

DistilBERT-Text-mining-authorship-attribution Dataset used: https://www.kaggle.com/azimulh/tweets-data-for-authorship-attribution-modelling/version/2

1 Jan 13, 2022
Learning from History: Modeling Temporal Knowledge Graphs with Sequential Copy-Generation Networks

CyGNet This repository reproduces the AAAI'21 paper “Learning from History: Modeling Temporal Knowledge Graphs with Sequential Copy-Generation Network

CunchaoZ 89 Jan 03, 2023
Implementation of DropLoss for Long-Tail Instance Segmentation in Pytorch

[AAAI 2021]DropLoss for Long-Tail Instance Segmentation [AAAI 2021] DropLoss for Long-Tail Instance Segmentation Ting-I Hsieh*, Esther Robb*, Hwann-Tz

Tim 37 Dec 02, 2022
MVP Benchmark for Multi-View Partial Point Cloud Completion and Registration

MVP Benchmark: Multi-View Partial Point Clouds for Completion and Registration [NEWS] 2021-07-12 [NEW 🎉 ] The submission on Codalab starts! 2021-07-1

PL 93 Dec 21, 2022
Collapse by Conditioning: Training Class-conditional GANs with Limited Data

Collapse by Conditioning: Training Class-conditional GANs with Limited Data Moha

Mohamad Shahbazi 33 Dec 06, 2022
Code for the paper Relation Prediction as an Auxiliary Training Objective for Improving Multi-Relational Graph Representations (AKBC 2021).

Relation Prediction as an Auxiliary Training Objective for Knowledge Base Completion This repo provides the code for the paper Relation Prediction as

Facebook Research 85 Jan 02, 2023
This repository includes different versions of the prescribed-time controller as Simulink blocks and MATLAB script codes for engineering applications.

Prescribed-time Control Prescribed-time control (PTC) blocks in Simulink environment, MATLAB R2020b. For more theoretical details, refer to the papers

Amir Shakouri 1 Mar 11, 2022
Deep Learning Tutorial for Kaggle Ultrasound Nerve Segmentation competition, using Keras

Deep Learning Tutorial for Kaggle Ultrasound Nerve Segmentation competition, using Keras This tutorial shows how to use Keras library to build deep ne

Marko Jocić 922 Dec 19, 2022
A3C LSTM Atari with Pytorch plus A3G design

NEWLY ADDED A3G A NEW GPU/CPU ARCHITECTURE OF A3C FOR SUBSTANTIALLY ACCELERATED TRAINING!! RL A3C Pytorch NEWLY ADDED A3G!! New implementation of A3C

David Griffis 532 Jan 02, 2023
An NVDA add-on to split screen reader and audio from other programs to different sound channels

An NVDA add-on to split screen reader and audio from other programs to different sound channels (add-on idea credit: Tony Malykh)

Joseph Lee 7 Dec 25, 2022
This project aims at providing a concise, easy-to-use, modifiable reference implementation for semantic segmentation models using PyTorch.

Semantic Segmentation on PyTorch (include FCN, PSPNet, Deeplabv3, Deeplabv3+, DANet, DenseASPP, BiSeNet, EncNet, DUNet, ICNet, ENet, OCNet, CCNet, PSANet, CGNet, ESPNet, LEDNet, DFANet)

2.4k Jan 08, 2023
wlad 2 Dec 19, 2022
Open source annotation tool for machine learning practitioners.

doccano doccano is an open source text annotation tool for humans. It provides annotation features for text classification, sequence labeling and sequ

7.1k Jan 01, 2023
bespoke tooling for offensive security's Windows Usermode Exploit Dev course (OSED)

osed-scripts bespoke tooling for offensive security's Windows Usermode Exploit Dev course (OSED) Table of Contents Standalone Scripts egghunter.py fin

epi 268 Jan 05, 2023