Boost learning for GNNs from the graph structure under challenging heterophily settings. (NeurIPS'20)

Last update: Dec 18, 2022

Overview

Beyond Homophily in Graph Neural Networks: Current Limitations and Effective Designs

Jiong Zhu, Yujun Yan, Lingxiao Zhao, Mark Heimann, Leman Akoglu, and Danai Koutra. 2020. Beyond Homophily in Graph Neural Networks: Current Limitations and Effective Designs. Advances in Neural Information Processing Systems 33 (2020).

[Paper] [Poster] [Slides]

Requirements

Basic Requirements

- Python >= 3.7 (tested on 3.8)
- signac: this package uses signac to manage experiment data and jobs. signac can be installed with the following command:
  ```
  pip install signac==1.1 signac-flow==0.7.1 signac-dashboard
  ```
  Note that newer versions of signac may be incompatible with this code.
- numpy (tested on 1.18.5)
- scipy (tested on 1.5.0)
- networkx >= 2.4 (tested on 2.4)
- scikit-learn (tested on 0.23.2)
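
To compare an existing environment against the versions listed above, a small check like the following may help. It is a minimal sketch rather than part of the repository: the helper name `installed_versions` is hypothetical, and the `TESTED` table simply repeats the pinned or tested versions from this section.

```python
# Versions this README pins or reports as tested (PyPI distribution names).
TESTED = {
    "signac": "1.1",
    "signac-flow": "0.7.1",
    "numpy": "1.18.5",
    "scipy": "1.5.0",
    "networkx": "2.4",
    "scikit-learn": "0.23.2",
}

try:
    from importlib.metadata import PackageNotFoundError, version
except ImportError:  # Python 3.7 needs the importlib_metadata backport
    from importlib_metadata import PackageNotFoundError, version


def installed_versions(tested=TESTED):
    """Map each package to (installed version or None, tested version)."""
    report = {}
    for name, wanted in tested.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            found = None  # package is not installed in this environment
        report[name] = (found, wanted)
    return report


if __name__ == "__main__":
    for name, (found, wanted) in installed_versions().items():
        status = "missing" if found is None else found
        print(f"{name:14s} installed: {status:10s} tested: {wanted}")
```

The script only reports differences; it does not decide whether a newer version is acceptable.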

For `H2GCN`

TensorFlow >= 2.0 (tested on 2.2)

Note that it is possible to use H2GCN without signac and scikit-learn on your own data and experimental framework.

For baselines

We also include the code for the baseline methods in the repository. This code is mostly the same as the reference implementations provided by the authors, with our modifications to add JK-connections, interoperability with our experimental pipeline, etc. For the requirements to run these baselines, please refer to the instructions provided by the original authors of the corresponding code, which can be found in each folder under /baselines.

As a general note, TensorFlow 1.15 can be used for all code requiring TensorFlow 1.x; PyTorch 1.6 is usually fine for code requiring PyTorch; all code should run under Python >= 3.7. The basic requirements above must also be met.

Usage

Download Datasets

The datasets can be downloaded using the bash scripts provided in /experiments/h2gcn/scripts, which also prepare the datasets for use in our experimental framework based on signac.

We use signac to index and manage the datasets: the datasets and experiments are stored in hierarchically organized signac jobs, with the 1st level storing different graphs, the 2nd level storing different sets of features, and the 3rd level storing different training-validation-test splits. Each level has its own state points and job documents, which distinguish its jobs from one another.
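
The nesting described above can be pictured with a toy data structure. The sketch below is only an illustration in plain Python and does not use signac; every name in `DATA_SPACE` (such as `graph-A` or `split-0`) is made up, and `iter_splits` is a hypothetical helper, not a function from this repository.

```python
# Hypothetical miniature of the three dataset levels. All names are invented
# for illustration and are not read from the repository or from signac.
DATA_SPACE = {
    "graph-A": {                                   # 1st level: a graph
        "features-1": ["split-0", "split-1"],      # 2nd level -> 3rd level
        "features-2": ["split-0"],
    },
    "graph-B": {
        "features-1": ["split-0"],
    },
}


def iter_splits(space):
    """Yield (graph, feature set, split) triples, outermost level first."""
    for graph, feature_sets in space.items():
        for feature_set, splits in feature_sets.items():
            for split in splits:
                yield graph, feature_set, split


if __name__ == "__main__":
    for triple in iter_splits(DATA_SPACE):
        print(" / ".join(triple))
```

Each triple identifies one train-validation-test split of one feature set of one graph, which is the unit an experiment runs on.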

Use `signac schema` to list all available properties in graph state points; use `signac find` to filter graphs using properties in the state points:

```
cd experiments/h2gcn/

# List available properties in graph state points
signac schema

# Find graphs in syn-products with homophily level h=0.1
signac find numNode 10000 h 0.1

# Find real benchmark "Cora"
signac find benchmark true datasetName\.\$regex "cora"
```

/experiments/h2gcn/utils/signac_tools.py provides helpful functions to iterate through the data space in Python; see the signac documentation for more on using signac.
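
To make the filter semantics of the `signac find` commands concrete, here is a plain-Python model of the matching. It is a sketch under stated assumptions, not signac's implementation: it assumes that the key-value pairs are combined with a logical AND and that `datasetName.$regex` applies a regular-expression search to the value. The state points in `STATE_POINTS` are invented; only the key names come from the commands above.

```python
import re

# Hypothetical graph state points. Key names follow the commands above;
# the values are invented for illustration.
STATE_POINTS = [
    {"benchmark": True, "datasetName": "cora"},
    {"benchmark": True, "datasetName": "citeseer"},
    {"benchmark": False, "numNode": 10000, "h": 0.1},
    {"benchmark": False, "numNode": 10000, "h": 0.9},
]


def matches(state_point, query):
    """Return True if the state point satisfies every condition in query."""
    for key, expected in query.items():
        actual = state_point.get(key)
        if isinstance(expected, dict) and "$regex" in expected:
            # Regex conditions only apply to string values.
            if not isinstance(actual, str):
                return False
            if re.search(expected["$regex"], actual) is None:
                return False
        elif actual != expected:
            return False
    return True


def find(state_points, query):
    """Return the state points that match the query, in input order."""
    return [sp for sp in state_points if matches(sp, query)]


if __name__ == "__main__":
    print(find(STATE_POINTS, {"numNode": 10000, "h": 0.1}))
    print(find(STATE_POINTS, {"benchmark": True,
                              "datasetName": {"$regex": "cora"}}))
```

The two calls in the `__main__` block mirror the second and third shell commands: the first selects by exact values, the second combines an exact value with a regular expression.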

Replicate Experiments with `signac`

To replicate our experiments for each model on specific datasets, use the Python scripts in /experiments/h2gcn and the corresponding JSON config files in /experiments/h2gcn/configs. For example, to run H2GCN on our synthetic benchmark syn-cora:
```
cd experiments/h2gcn/
python run_hgcn_experiments.py -c configs/syn-cora/h2gcn.json [-i] run [-p PARALLEL_NUM]
```
- Files and results generated in experiments are also stored with signac on top of the hierarchical order introduced above: the 4th level separates different models, and the 5th level stores files and results generated in different runs with different parameters of the same model.
- By default, stdout and stderr of each model are stored in terminal_output.log in the 4th level; use -i if you want to see them through your terminal.
- Use -p if you want to run experiments in parallel on multiple graphs (1st level).
- Baseline models can be run through the following scripts:
  - GCN, GCN-Cheby, GCN+JK and GCN-Cheby+JK: run_gcn_experiments.py
  - GraphSAGE, GraphSAGE+JK: run_graphsage_experiments.py
  - MixHop: run_mixhop_experiments.py
  - GAT: run_gat_experiments.py
  - MLP: run_hgcn_experiments.py
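
The model-to-script mapping above can be collected into a small launcher helper. This is a minimal sketch, not part of the repository: `build_command` is a hypothetical function, and it assumes that the baseline scripts accept the same `-c`, `-i`, `run`, and `-p` arguments shown for `run_hgcn_experiments.py`, which this README does not state explicitly.

```python
# Model name -> experiment script, copied from the list above.
SCRIPTS = {
    "H2GCN": "run_hgcn_experiments.py",
    "MLP": "run_hgcn_experiments.py",
    "GCN": "run_gcn_experiments.py",
    "GCN-Cheby": "run_gcn_experiments.py",
    "GCN+JK": "run_gcn_experiments.py",
    "GCN-Cheby+JK": "run_gcn_experiments.py",
    "GraphSAGE": "run_graphsage_experiments.py",
    "GraphSAGE+JK": "run_graphsage_experiments.py",
    "MixHop": "run_mixhop_experiments.py",
    "GAT": "run_gat_experiments.py",
}


def build_command(model, config, interactive=False, parallel=None):
    """Assemble the argument list for one experiment run.

    Assumes every script follows the syntax
    `python SCRIPT -c CONFIG [-i] run [-p PARALLEL_NUM]`.
    """
    cmd = ["python", SCRIPTS[model], "-c", config]
    if interactive:
        cmd.append("-i")          # show stdout/stderr in the terminal
    cmd.append("run")
    if parallel is not None:
        cmd += ["-p", str(parallel)]  # number of graphs to run in parallel
    return cmd


if __name__ == "__main__":
    print(" ".join(build_command("H2GCN", "configs/syn-cora/h2gcn.json",
                                 parallel=4)))
```

The returned list can be passed to `subprocess.run` from within experiments/h2gcn/; check each script's own help output before relying on the shared-syntax assumption.
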
To summarize the experiment results of each model on specific datasets in a CSV file, use the Python script /experiments/h2gcn/run_experiments_summarization.py with the corresponding model name and config file. For example, to summarize H2GCN results on our synthetic benchmark syn-cora:
```
cd experiments/h2gcn/
python run_experiments_summarization.py h2gcn -f configs/syn-cora/h2gcn.json
```
To list the paths of all 3rd-level dataset splits used in an experiment (in planetoid format) without running the experiment, use the following command:
```
cd experiments/h2gcn/
python run_hgcn_experiments.py -c configs/syn-cora/h2gcn.json --check_paths run
```

Standalone H2GCN Package

Our implementation of H2GCN is stored in the h2gcn folder, which can be used as a standalone package on your own data and experimental framework.

Example usages:

H2GCN-2

```
cd h2gcn
python run_experiments.py H2GCN planetoid \
  --dataset ind.citeseer \
  --dataset_path ../baselines/gcn/gcn/data/
```

H2GCN-1

```
cd h2gcn
python run_experiments.py H2GCN planetoid \
  --network_setup M64-R-T1-G-V-C1-D0.5-MO \
  --dataset ind.citeseer \
  --dataset_path ../baselines/gcn/gcn/data/
```

Use `--help` for more advanced usages:

```
python run_experiments.py H2GCN planetoid --help
```

We only support datasets stored in planetoid format. You can also add support for other data formats and for models beyond H2GCN by adding your own modules to /h2gcn/datasets and /h2gcn/models, respectively; check out our code for more details.
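
Before pointing `--dataset_path` at your own data, it can be useful to check that the directory looks like a planetoid dataset. The sketch below lists the eight file suffixes used by the Planetoid/GCN reference data (for example `ind.citeseer.x` and `ind.citeseer.test.index`) and reports which are absent. The helper name `missing_planetoid_files` is hypothetical, and whether the H2GCN loader requires all eight files is an assumption that should be checked against the code in /h2gcn/datasets.

```python
from pathlib import Path

# File suffixes used by the Planetoid/GCN reference datasets.
PLANETOID_SUFFIXES = ("x", "y", "tx", "ty", "allx", "ally", "graph", "test.index")


def missing_planetoid_files(dataset_path, dataset):
    """Return the expected file names that are absent from dataset_path.

    `dataset` is the prefix passed to --dataset, e.g. "ind.citeseer".
    """
    root = Path(dataset_path)
    return [
        f"{dataset}.{suffix}"
        for suffix in PLANETOID_SUFFIXES
        if not (root / f"{dataset}.{suffix}").is_file()
    ]


if __name__ == "__main__":
    missing = missing_planetoid_files("../baselines/gcn/gcn/data/", "ind.citeseer")
    print("all files present" if not missing else f"missing: {missing}")
```

An empty result only means the files exist; it does not validate their contents.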

Contact

Please contact Jiong Zhu ([email protected]) in case you have any questions.

Citation

Please cite our paper if you make use of this code in your own work:

```
@article{zhu2020beyond,
  title={Beyond Homophily in Graph Neural Networks: Current Limitations and Effective Designs},
  author={Zhu, Jiong and Yan, Yujun and Zhao, Lingxiao and Heimann, Mark and Akoglu, Leman and Koutra, Danai},
  journal={Advances in Neural Information Processing Systems},
  volume={33},
  year={2020}
}
```

Owner

GEMS Lab: Graph Exploration & Mining at Scale, University of Michigan
