PyGCL: Graph Contrastive Learning Library for PyTorch

Last update: Jan 07, 2023

Overview

PyGCL: Graph Contrastive Learning for PyTorch

PyGCL is an open-source library for graph contrastive learning (GCL), which features modularized GCL components from published papers, standardized evaluation, and experiment management.

Prerequisites

PyGCL needs the following packages to be installed beforehand:

Python 3.8+
PyTorch 1.7+
PyTorch-Geometric 1.7
DGL 0.5+
Scikit-learn 0.24+
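
To see which of these packages are already present in your environment, a small check like the one below may help. It is only a sketch: the distribution names in REQUIRED are assumed to match the names published on PyPI, and the helper installed_versions is hypothetical, not part of PyGCL.

```python
from importlib import metadata

# Distribution names are an assumption (taken to match the PyPI package names).
REQUIRED = ["torch", "torch-geometric", "dgl", "scikit-learn"]


def installed_versions(packages=REQUIRED):
    """Return {package: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

The script only reports versions; comparing them against the minimum versions listed above is left to the reader.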

Getting Started

Take a look at the examples in the root directory. For example, try the following command to train a simple GCN for node classification on the WikiCS dataset using the local-local contrasting mode:

python train_node_l2l.py --dataset WikiCS --param_path params/GRACE/[email protected] --base_model GCNConv

For detailed parameter settings, please refer to [email protected]. These examples are mainly for reproducing experiments in our benchmarking study. You can find more details regarding general practices of graph contrastive learning in the paper.

Usage

Package Overview

PyGCL implements four main components of graph contrastive learning algorithms:

graph augmentation: transforms input graphs into congruent graph views.
contrasting modes: specifies positive and negative pairs.
contrastive objectives: computes the likelihood score for positive and negative pairs.
negative mining strategies: improves the negative sample set by considering the relative similarity (the hardness) of negative samples.

We also implement utilities for loading datasets, training models, and running experiments.

Building Your Own GCL Algorithms

Besides trying the above examples for node and graph classification tasks, you can also build your own graph contrastive learning algorithms straightforwardly.

Graph Augmentation

In GCL.augmentors, PyGCL provides the Augmentor base class, which offers a universal interface for graph augmentation functions. Specifically, PyGCL implements the following augmentation functions:

Augmentation	Class name
Edge Adding (EA)	`EdgeAdding`
Edge Removing (ER)	`EdgeRemoving`
Feature Masking (FM)	`FeatureMasking`
Feature Dropout (FD)	`FeatureDropout`
Personalized PageRank (PPR)	`PPRDiffusion`
Markov Diffusion Kernel (MDK)	`MarkovDiffusion`
Node Dropping (ND)	`NodeDropping`
Subgraphs induced by Random Walks (RWS)	`RWSampling`
Ego-net Sampling (ES)	`Identity`

Calling one of these augmentation functions on a graph, given as a tuple of node features, edge index, and edge features (x, edge_index, edge_weights), produces the corresponding augmented graph.
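
To make the idea concrete, here is a minimal pure-Python sketch of what edge removing and feature masking do to a graph. It does not use PyGCL or PyTorch: the functions remove_edges and mask_features and the parameters pe and pf are hypothetical names chosen for illustration, and the edge index is a plain pair of lists rather than a tensor.

```python
import random


def remove_edges(edge_index, pe, rng):
    """Drop each edge independently with probability pe.

    edge_index is a pair of lists (sources, targets), mirroring a 2 x E layout.
    """
    src, dst = edge_index
    keep = [i for i in range(len(src)) if rng.random() >= pe]
    return [src[i] for i in keep], [dst[i] for i in keep]


def mask_features(x, pf, rng):
    """Zero out each feature column with probability pf, for all nodes at once."""
    num_features = len(x[0])
    masked = [rng.random() < pf for _ in range(num_features)]
    return [[0.0 if masked[j] else row[j] for j in range(num_features)] for row in x]


rng = random.Random(0)  # seeded so the augmented view is reproducible
x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
edge_index = ([0, 1, 1], [1, 0, 1])

x_aug = mask_features(x, pf=0.5, rng=rng)
edge_index_aug = remove_edges(edge_index, pe=0.5, rng=rng)
```

Both functions leave their input untouched and return a new view, so the same graph can be augmented twice to obtain two different views.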

PyGCL also supports composing an arbitrary number of augmentations together. To compose a list of augmentation instances augmentors, you only need to use the right shift operator >>:

# Chain the augmentations left to right, starting from the first one
aug = augmentors[0]
for a in augmentors[1:]:
    aug = aug >> a

You can also write your own augmentation functions by defining the augment function.
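
The sketch below shows one way such an interface could work: a base class that exposes an augment method and overloads >> to chain augmentors. It is a toy stand-in under stated assumptions, not PyGCL's actual Augmentor. The classes ToyAugmentor, Compose, DropSelfLoops, and ScaleFeatures are all hypothetical, and graphs are plain Python lists instead of tensors.

```python
class ToyAugmentor:
    """Toy stand-in for an augmentor base class (not PyGCL's Augmentor)."""

    def augment(self, graph):
        raise NotImplementedError

    def __call__(self, x, edge_index, edge_weights=None):
        return self.augment((x, edge_index, edge_weights))

    def __rshift__(self, other):
        # a >> b applies a first, then b.
        return Compose([self, other])


class Compose(ToyAugmentor):
    """Applies a list of augmentors in order."""

    def __init__(self, augmentors):
        self.augmentors = list(augmentors)

    def augment(self, graph):
        for aug in self.augmentors:
            graph = aug.augment(graph)
        return graph


class DropSelfLoops(ToyAugmentor):
    """Custom augmentor: removes edges whose source equals their target."""

    def augment(self, graph):
        x, (src, dst), w = graph
        keep = [i for i in range(len(src)) if src[i] != dst[i]]
        new_w = None if w is None else [w[i] for i in keep]
        return x, ([src[i] for i in keep], [dst[i] for i in keep]), new_w


class ScaleFeatures(ToyAugmentor):
    """Custom augmentor: multiplies every node feature by a constant."""

    def __init__(self, factor):
        self.factor = factor

    def augment(self, graph):
        x, edge_index, w = graph
        return [[v * self.factor for v in row] for row in x], edge_index, w


augmentors = [DropSelfLoops(), ScaleFeatures(2.0)]
aug = augmentors[0]
for a in augmentors[1:]:
    aug = aug >> a

x_aug, edge_index_aug, w_aug = aug([[1.0], [2.0]], ([0, 1, 1], [1, 1, 0]))
```

In this sketch a new augmentation only has to implement augment; calling and composition come from the base class.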

Contrasting Modes

PyGCL implements three contrasting modes: (a) local-local, (b) global-local, and (c) global-global modes. You can refer to the models folder for details. Note that the bootstrapping latent loss involves some special model design (asymmetric online/offline encoders and momentum weight updates) and thus we implement contrasting modes involving this contrastive objective in a separate BGRL model.
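
As a rough illustration of how the three modes differ, the sketch below builds the positive-pair masks each mode implies. It assumes every pair that is not marked positive is treated as a negative, which is a simplification; the function names and the batch argument are hypothetical and are not taken from PyGCL.

```python
def local_local_pairs(num_nodes):
    """Mask over (node in view 1, node in view 2).

    Node i in one view is positive only with node i in the other view.
    """
    return [[1 if i == j else 0 for j in range(num_nodes)] for i in range(num_nodes)]


def global_local_pairs(batch, num_graphs):
    """Mask over (graph summary, node).

    batch[n] is the index of the graph that node n belongs to; a graph summary
    is positive with exactly the nodes of its own graph.
    """
    return [[1 if batch[n] == g else 0 for n in range(len(batch))]
            for g in range(num_graphs)]


def global_global_pairs(num_graphs):
    """Mask over (graph summary in view 1, graph summary in view 2)."""
    return [[1 if i == j else 0 for j in range(num_graphs)] for i in range(num_graphs)]


# Two graphs in a batch: nodes 0-1 belong to graph 0, node 2 to graph 1.
l2l_mask = local_local_pairs(3)
g2l_mask = global_local_pairs([0, 0, 1], num_graphs=2)
g2g_mask = global_global_pairs(2)
```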

Contrastive Objectives

In GCL.losses, PyGCL implements the following contrastive objectives:

Contrastive objectives	Class name
InfoNCE loss	`InfoNCELoss`
Jensen-Shannon Divergence (JSD) loss	`JSDLoss`
Triplet Margin (TM) loss	`TripletLoss`
Bootstrapping Latent (BL) loss	`BootstrapLoss`
Barlow Twins (BT) loss	`BTLoss`
VICReg loss	`VICRegLoss`

All these objectives are for contrasting positive and negative pairs at the same scale (i.e. local-local and global-global modes). For global-local modes, we offer G2L variants except for Barlow Twins and VICReg losses. Moreover, for InfoNCE, JSD, and Triplet losses, we further provide G2LEN variants, primarily for node-level tasks, which involve explicit construction of negative samples. You can find their examples in the root folder.
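
For readers unfamiliar with these objectives, the following is a minimal pure-Python sketch of an InfoNCE-style loss in the same-scale setting. It assumes cosine similarity, a temperature tau, non-zero embeddings, and that samples[i] is the positive for anchors[i] while all other samples act as negatives. The function names are hypothetical and the code is not PyGCL's implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(anchors, samples, tau=0.5):
    """Average of -log(exp(s_ii / tau) / sum_j exp(s_ij / tau)) over anchors."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, s) / tau for s in samples]
        log_denominator = math.log(sum(math.exp(l) for l in logits))
        total += -(logits[i] - log_denominator)
    return total / len(anchors)


view1 = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]     # each positive matches its anchor
mismatched = [[0.0, 1.0], [1.0, 0.0]]  # positives are orthogonal to anchors

loss_aligned = info_nce(view1, aligned)
loss_mismatched = info_nce(view1, mismatched)
```

The loss is small when each anchor is closer to its positive than to the other samples, and large otherwise.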

Negative Mining Strategies

In GCL.losses, PyGCL further implements four negative mining strategies that are built upon the InfoNCE contrastive objective:

Hard negative mining strategies	Class name
Hard negative mixing	`HardMixingLoss`
Conditional negative sampling	`RingLoss`
Debiased contrastive objective	`InfoNCELoss(debiased_nt_xent_loss)`
Hardness-biased negative sampling	`InfoNCELoss(hardness_nt_xent_loss)`
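
As an illustration of how such a strategy can modify InfoNCE, here is a sketch of a debiased objective for a single anchor. It follows the commonly used debiased estimator, in which tau_plus is an assumed prior probability that a sampled negative is actually a positive. The function names and default values are hypothetical, and PyGCL's debiased_nt_xent_loss may differ in its details.

```python
import math


def standard_info_nce(pos_sim, neg_sims, tau=0.5):
    """InfoNCE for one anchor, given its positive and negative similarities."""
    pos = math.exp(pos_sim / tau)
    neg = sum(math.exp(s / tau) for s in neg_sims)
    return -math.log(pos / (pos + neg))


def debiased_info_nce(pos_sim, neg_sims, tau=0.5, tau_plus=0.1):
    """InfoNCE with the negative term corrected for likely false negatives."""
    pos = math.exp(pos_sim / tau)
    neg = [math.exp(s / tau) for s in neg_sims]
    n = len(neg)
    # Subtract the expected contribution of false negatives, then rescale.
    ng = (sum(neg) - n * tau_plus * pos) / (1.0 - tau_plus)
    # Clamp at n * exp(-1 / tau) so the corrected term stays positive.
    ng = max(ng, n * math.exp(-1.0 / tau))
    return -math.log(pos / (pos + ng))


plain = standard_info_nce(0.9, [0.1, -0.2, 0.3])
debiased = debiased_info_nce(0.9, [0.1, -0.2, 0.3], tau_plus=0.1)
```

With tau_plus set to 0 and cosine similarities, the correction vanishes and the two functions agree.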

Utilities

PyGCL provides various utilities for data loading, model training, and experiment execution.

In GCL.util you can use the following utilities:

split_dataset: splits the dataset into train/test/validation sets according to public or random splits. Currently, four split modes are supported: [rand, ogb, wikics, preload].
seed_everything: manually sets the seed for the NumPy and PyTorch environments to ensure better reproducibility.
SimpleParam: provides a simple parameter configuration class to manage parameters from microsoft-nni, JSON, and YAML files.
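
To show what a seeded random split amounts to, here is a minimal sketch using only the standard library. The function random_split, its default ratios, and the returned dictionary keys are assumptions made for illustration; they are not the signature of split_dataset or seed_everything.

```python
import random


def random_split(num_samples, train_ratio=0.1, test_ratio=0.8, seed=0):
    """Return disjoint train/test/validation index lists from a seeded shuffle."""
    if train_ratio + test_ratio >= 1.0:
        raise ValueError("train_ratio + test_ratio must leave room for validation")
    indices = list(range(num_samples))
    # A dedicated generator keeps the split reproducible without touching global state.
    random.Random(seed).shuffle(indices)
    train_size = int(num_samples * train_ratio)
    test_size = int(num_samples * test_ratio)
    return {
        "train": indices[:train_size],
        "test": indices[train_size:train_size + test_size],
        "valid": indices[train_size + test_size:],
    }


split = random_split(8, train_ratio=0.25, test_ratio=0.5, seed=42)
```

Calling the function again with the same seed yields the same split, which is the property that seeding is meant to provide.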

We also implement two downstream classifiers, LR_classification and SVM_classification, in GCL.eval, based on PyTorch and Scikit-learn respectively.

Moreover, based on PyTorch Geometric, we provide functions for loading common node and graph datasets. You can use load_node_dataset and load_graph_dataset in utils.py.
