Boost learning for GNNs from the graph structure under challenging heterophily settings. (NeurIPS'20)

Overview

Beyond Homophily in Graph Neural Networks: Current Limitations and Effective Designs

Jiong Zhu, Yujun Yan, Lingxiao Zhao, Mark Heimann, Leman Akoglu, and Danai Koutra. 2020. Beyond Homophily in Graph Neural Networks: Current Limitations and Effective Designs. Advances in Neural Information Processing Systems 33 (2020).

[Paper] [Poster] [Slides]

Requirements

Basic Requirements

  • Python >= 3.7 (tested on 3.8)

  • signac: this package utilizes signac to manage experiment data and jobs. signac can be installed with the following command:

    pip install signac==1.1 signac-flow==0.7.1 signac-dashboard

    Note that the latest version of signac may cause incompatibility.

  • numpy (tested on 1.18.5)

  • scipy (tested on 1.5.0)

  • networkx >= 2.4 (tested on 2.4)

  • scikit-learn (tested on 0.23.2)

For H2GCN

  • TensorFlow >= 2.0 (tested on 2.2)

Note that it is possible to use H2GCN without signac and scikit-learn on your own data and experimental framework.

For baselines

We also include the code for the baseline methods in the repository. These code are mostly the same as the reference implementations provided by the authors, with our modifications to add JK-connections, interoperability with our experimental pipeline, etc. For the requirements to run these baselines, please refer to the instructions provided by the original authors of the corresponding code, which could be found in each folder under /baselines.

As a general note, TensorFlow 1.15 can be used for all code requiring TensorFlow 1.x; for PyTorch, it is usually fine to use PyTorch 1.6; all code should be able to run under Python >= 3.7. In addition, the basic requirements must also be met.

Usage

Download Datasets

The datasets can be downloaded using the bash scripts provided in /experiments/h2gcn/scripts, which also prepare the datasets for use in our experimental framework based on signac.

We make use of signac to index and manage the datasets: the datasets and experiments are stored in hierarchically organized signac jobs, with the 1st level storing different graphs, 2nd level storing different sets of features, and 3rd level storing different training-validation-test splits. Each level contains its own state points and job documents to differentiate with other jobs.

Use signac schema to list all available properties in graph state points; use signac find to filter graphs using properties in the state points:

cd experiments/h2gcn/

# List available properties in graph state points
signac schema

# Find graphs in syn-products with homophily level h=0.1
signac find numNode 10000 h 0.1

# Find real benchmark "Cora"
signac find benchmark true datasetName\.\$regex "cora"

/experiments/h2gcn/utils/signac_tools.py provides helpful functions to iterate through the data space in Python; more usages of signac can be found in these documents.

Replicate Experiments with signac

  • To replicate our experiments of each model on specific datasets, use Python scripts in /experiments/h2gcn, and the corresponding JSON config files in /experiments/h2gcn/configs. For example, to run H2GCN on our synthetic benchmarks syn-cora:

    cd experiments/h2gcn/
    python run_hgcn_experiments.py -c configs/syn-cora/h2gcn.json [-i] run [-p PARALLEL_NUM]
    • Files and results generated in experiments are also stored with signac on top of the hierarchical order introduced above: the 4th level separates different models, and the 5th level stores files and results generated in different runs with different parameters of the same model.

    • By default, stdout and stderr of each model are stored in terminal_output.log in the 4th level; use -i if you want to see them through your terminal.

    • Use -p if you want to run experiments in parallel on multiple graphs (1st level).

    • Baseline models can be run through the following scripts:

      • GCN, GCN-Cheby, GCN+JK and GCN-Cheby+JK: run_gcn_experiments.py
      • GraphSAGE, GraphSAGE+JK: run_graphsage_experiments.py
      • MixHop: run_mixhop_experiments.py
      • GAT: run_gat_experiments.py
      • MLP: run_hgcn_experiments.py
  • To summarize experiment results of each model on specific datasets to a CSV file, use Python script /experiments/h2gcn/run_experiments_summarization.py with the corresponding model name and config file. For example, to summarize H2GCN results on our synthetic benchmark syn-cora:

    cd experiments/h2gcn/
    python run_experiments_summarization.py h2gcn -f configs/syn-cora/h2gcn.json
  • To list all paths of the 3rd level datasets splits used in a experiment (in planetoid format) without running experiments, use the following command:

    cd experiments/h2gcn/
    python run_hgcn_experiments.py -c configs/syn-cora/h2gcn.json --check_paths run

Standalone H2GCN Package

Our implementation of H2GCN is stored in the h2gcn folder, which can be used as a standalone package on your own data and experimental framework.

Example usages:

  • H2GCN-2

    cd h2gcn
    python run_experiments.py H2GCN planetoid \
      --dataset ind.citeseer \
      --dataset_path ../baselines/gcn/gcn/data/
  • H2GCN-1

    cd h2gcn
    python run_experiments.py H2GCN planetoid \
      --network_setup M64-R-T1-G-V-C1-D0.5-MO \
      --dataset ind.citeseer \
      --dataset_path ../baselines/gcn/gcn/data/
  • Use --help for more advanced usages:

    python run_experiments.py H2GCN planetoid --help

We only support datasets stored in planetoid format. You could also add support to different data formats and models beyond H2GCN by adding your own modules to /h2gcn/datasets and /h2gcn/models, respectively; check out ou code for more details.

Contact

Please contact Jiong Zhu ([email protected]) in case you have any questions.

Citation

Please cite our paper if you make use of this code in your own work:

@article{zhu2020beyond,
  title={Beyond Homophily in Graph Neural Networks: Current Limitations and Effective Designs},
  author={Zhu, Jiong and Yan, Yujun and Zhao, Lingxiao and Heimann, Mark and Akoglu, Leman and Koutra, Danai},
  journal={Advances in Neural Information Processing Systems},
  volume={33},
  year={2020}
}
Owner
GEMS Lab: Graph Exploration & Mining at Scale, University of Michigan
Code repository for work by the GEMS Lab: https://gemslab.github.io/research/
GEMS Lab: Graph Exploration & Mining at Scale, University of Michigan
A Collection of Papers and Codes for ICCV2021 Low Level Vision and Image Generation

A Collection of Papers and Codes for ICCV2021 Low Level Vision and Image Generation

196 Jan 05, 2023
This is a template for the Non-autoregressive Deep Learning-Based TTS model (in PyTorch).

Non-autoregressive Deep Learning-Based TTS Template This is a template for the Non-autoregressive TTS model. It contains Data Preprocessing Pipeline D

Keon Lee 13 Dec 05, 2022
Training code and evaluation benchmarks for the "Self-Supervised Policy Adaptation during Deployment" paper.

Self-Supervised Policy Adaptation during Deployment PyTorch implementation of PAD and evaluation benchmarks from Self-Supervised Policy Adaptation dur

Nicklas Hansen 101 Nov 01, 2022
Algorithmic Trading using RNN

Deep-Trading This an implementation adapted from Rachnog Neural networks for algorithmic trading. Part One — Simple time series forecasting and this c

Hazem Nomer 29 Sep 04, 2022
Resources for our AAAI 2022 paper: "LOREN: Logic-Regularized Reasoning for Interpretable Fact Verification".

LOREN Resources for our AAAI 2022 paper (pre-print): "LOREN: Logic-Regularized Reasoning for Interpretable Fact Verification". DEMO System Check out o

Jiangjie Chen 37 Dec 27, 2022
A pure PyTorch batched computation implementation of "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition"

A pure PyTorch batched computation implementation of "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition"

張致強 14 Dec 02, 2022
Method for facial emotion recognition compitition of Xunfei and Datawhale .

人脸情绪识别挑战赛-第3名-W03KFgNOc-源代码、模型以及说明文档 队名:W03KFgNOc 排名:3 正确率: 0.75564 队员:yyMoming,xkwang,RichardoMu。 比赛链接:人脸情绪识别挑战赛 文章地址:link emotion 该项目分别训练八个模型并生成csv文

6 Oct 17, 2022
2D Time independent Schrodinger equation solver for arbitrary shape of well

Schrodinger Well Python Python solver for timeless Schrodinger equation for well with arbitrary shape https://imgur.com/a/jlhK7OZ Pictures of circular

WeightAn 24 Nov 18, 2022
Raindrop strategy for Irregular time series

Graph-Guided Network For Irregularly Sampled Multivariate Time Series Overview This repository contains processed datasets and implementation code for

Zitnik Lab @ Harvard 74 Jan 03, 2023
Dilated Convolution with Learnable Spacings PyTorch

Dilated-Convolution-with-Learnable-Spacings-PyTorch Ismail Khalfaoui Hassani Dilated Convolution with Learnable Spacings (abbreviated to DCLS) is a no

15 Dec 09, 2022
Deep Sketch-guided Cartoon Video Inbetweening

Cartoon Video Inbetweening Paper | DOI | Video The source code of Deep Sketch-guided Cartoon Video Inbetweening by Xiaoyu Li, Bo Zhang, Jing Liao, Ped

Xiaoyu Li 37 Dec 22, 2022
使用深度学习框架提取视频硬字幕;docker容器免安装深度学习库,使用本地api接口使得界面和后端识别分离;

extract-video-subtittle 使用深度学习框架提取视频硬字幕; 本地识别无需联网; CPU识别速度可观; 容器提供API接口; 运行环境 本项目运行环境非常好搭建,我做好了docker容器免安装各种深度学习包; 提供windows界面操作; 容器为CPU版本; 视频演示 https

歌者 16 Aug 06, 2022
This program can detect your face and add an Christams hat on the top of your head

Auto_Christmas This program can detect your face and add a Christmas hat to the top of your head. just run the Auto_Christmas.py, then you can see the

3 Dec 22, 2021
The InterScript dataset contains interactive user feedback on scripts generated by a T5-XXL model.

Interscript The Interscript dataset contains interactive user feedback on a T5-11B model generated scripts. Dataset data.json contains the data in an

AI2 8 Dec 01, 2022
An implementation of quantum convolutional neural network with MindQuantum. Huawei, classifying MNIST dataset

关于实现的一点说明 山东大学 2020级 苏博南 www.subonan.com 文件说明 tools.py 这里面主要有两个函数: resize(a, lenb) 这其实是我找同学写的一个小算法hhh。给出一个$28\times 28$的方阵a,返回一个$lenb\times lenb$的方阵。因

ぼっけなす 2 Aug 29, 2022
This repository introduces a short project about Transfer Learning for Classification of MRI Images.

Transfer Learning for MRI Images Classification This repository introduces a short project made during my stay at Neuromatch Summer School 2021. This

Oscar Guarnizo 3 Nov 15, 2022
MassiveSumm: a very large-scale, very multilingual, news summarisation dataset

MassiveSumm: a very large-scale, very multilingual, news summarisation dataset This repository contains links to data and code to fetch and reproduce

Daniel Varab 19 Dec 16, 2022
Official pytorch implementation of the IrwGAN for unaligned image-to-image translation

IrwGAN (ICCV2021) Unaligned Image-to-Image Translation by Learning to Reweight [Update] 12/15/2021 All dataset are released, trained models and genera

37 Nov 09, 2022
Keyhole Imaging: Non-Line-of-Sight Imaging and Tracking of Moving Objects Along a Single Optical Path

Keyhole Imaging Code & Dataset Code associated with the paper "Keyhole Imaging: Non-Line-of-Sight Imaging and Tracking of Moving Objects Along a Singl

Stanford Computational Imaging Lab 20 Feb 03, 2022
TimeSHAP explains Recurrent Neural Network predictions.

TimeSHAP TimeSHAP is a model-agnostic, recurrent explainer that builds upon KernelSHAP and extends it to the sequential domain. TimeSHAP computes even

Feedzai 90 Dec 18, 2022