PyGCL: Graph Contrastive Learning Library for PyTorch

Overview

PyGCL is an open-source library for graph contrastive learning (GCL), which features modularized GCL components from published papers, standardized evaluation, and experiment management.


Prerequisites

PyGCL needs the following packages to be installed beforehand:

  • Python 3.8+
  • PyTorch 1.7+
  • PyTorch-Geometric 1.7
  • DGL 0.5+
  • Scikit-learn 0.24+

Getting Started

Take a look at the various examples located in the root directory. For example, try the following command to train a simple GCN for node classification on the WikiCS dataset using the local-local contrasting mode:

python train_node_l2l.py --dataset WikiCS --param_path params/GRACE/[email protected] --base_model GCNConv

For detailed parameter settings, please refer to [email protected]. These examples are mainly for reproducing experiments in our benchmarking study. You can find more details regarding general practices of graph contrastive learning in the paper.

Usage

Package Overview

PyGCL implements four main components of graph contrastive learning algorithms:

  • graph augmentation: transforms input graphs into congruent graph views.
  • contrasting modes: specifies positive and negative pairs.
  • contrastive objectives: computes the likelihood score for positive and negative pairs.
  • negative mining strategies: improves the negative sample set by considering the relative similarity (the hardness) of negative samples.

We also implement utilities for loading datasets, training models, and running experiments.

Building Your Own GCL Algorithms

Besides trying the above examples for node and graph classification tasks, you can also build your own graph contrastive learning algorithms straightforwardly.

Graph Augmentation

In GCL.augmentors, PyGCL provides the Augmentor base class, which offers a universal interface for graph augmentation functions. Specifically, PyGCL implements the following augmentation functions:

Augmentation                               Class name
Edge Adding (EA)                           EdgeAdding
Edge Removing (ER)                         EdgeRemoving
Feature Masking (FM)                       FeatureMasking
Feature Dropout (FD)                       FeatureDropout
Personalized PageRank (PPR)                PPRDiffusion
Markov Diffusion Kernel (MDK)              MarkovDiffusion
Node Dropping (ND)                         NodeDropping
Subgraphs induced by Random Walks (RWS)    RWSampling
Ego-net Sampling (ES)                      Identity

Calling an augmentation function on a graph given as a tuple of node features, edge index, and edge weights (x, edge_index, edge_weights) produces the corresponding augmented graph.
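
For example, here is a minimal sketch; the keyword names pe and pf for the edge-drop and feature-mask probabilities are assumptions, so check GCL.augmentors for the exact signatures:

import torch
import GCL.augmentors as A

# A toy graph: 4 nodes with 8-dimensional features and 4 edges.
x = torch.randn(4, 8)
edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 0]])
edge_weights = torch.ones(edge_index.size(1))

# Two augmentation functions; pe / pf are assumed drop probabilities.
aug1 = A.EdgeRemoving(pe=0.3)
aug2 = A.FeatureMasking(pf=0.3)

# Each call returns the augmented (features, edge index, edge weights) tuple.
x1, edge_index1, edge_weights1 = aug1(x, edge_index, edge_weights)
x2, edge_index2, edge_weights2 = aug2(x, edge_index, edge_weights)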

PyGCL also supports composing an arbitrary number of augmentations together. To compose a list of augmentation instances augmentors, you only need to use the right-shift operator >>:

# Compose every augmentor in the list with the right-shift operator.
aug = augmentors[0]
for a in augmentors[1:]:
    aug = aug >> a

You can also write your own augmentation function by subclassing Augmentor and defining its augment function.
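
As a rough sketch of a custom augmentor (this assumes augment receives and returns a graph object carrying x, edge_index, and edge_weights, mirroring the tuple interface above; the exact graph type lives in GCL.augmentors):

import torch
from GCL.augmentors import Augmentor

class FeatureNoise(Augmentor):
    # Hypothetical augmentor: perturbs node features with Gaussian noise.
    def __init__(self, std=0.1):
        super().__init__()
        self.std = std

    def augment(self, g):
        # Only the node features change; the graph structure is kept intact.
        x = g.x + self.std * torch.randn_like(g.x)
        return type(g)(x=x, edge_index=g.edge_index, edge_weights=g.edge_weights)

Since FeatureNoise subclasses Augmentor, it composes with the built-in augmentations via >> like any other augmentor.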

Contrasting Modes

PyGCL implements three contrasting modes: (a) local-local, (b) global-local, and (c) global-global modes. You can refer to the models folder for details. Note that the bootstrapping latent loss involves some special model design (asymmetric online/offline encoders and momentum weight updates) and thus we implement contrasting modes involving this contrastive objective in a separate BGRL model.
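
Conceptually, the three modes differ only in which embeddings form the positive pairs. The sketch below is purely illustrative (encoder and readout are stand-ins, not library API):

import torch

encoder = torch.nn.Linear(8, 16)                 # stand-in for a GNN encoder
readout = lambda z: z.mean(dim=0, keepdim=True)  # stand-in for a readout function

x1 = torch.randn(4, 8)  # node features of view 1
x2 = torch.randn(4, 8)  # node features of view 2

z1, z2 = encoder(x1), encoder(x2)  # local (per-node) embeddings
g1, g2 = readout(z1), readout(z2)  # global (graph-level) embeddings

# local-local:   positives are (z1[i], z2[i]); other nodes act as negatives.
# global-local:  positives are (g1, z2[i]) for nodes i of the same graph.
# global-global: positives are (g1, g2); other graphs in a batch are negatives.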

Contrastive Objectives

In GCL.losses, PyGCL implements the following contrastive objectives:

Contrastive objective                      Class name
InfoNCE loss                               InfoNCELoss
Jensen-Shannon Divergence (JSD) loss       JSDLoss
Triplet Margin (TM) loss                   TripletLoss
Bootstrapping Latent (BL) loss             BootstrapLoss
Barlow Twins (BT) loss                     BTLoss
VICReg loss                                VICRegLoss

All these objectives contrast positive and negative pairs at the same scale (i.e., the local-local and global-global modes). For the global-local mode, we offer G2L variants of all objectives except the Barlow Twins and VICReg losses. Moreover, for the InfoNCE, JSD, and Triplet losses, we further provide G2LEN variants, primarily for node-level tasks, which involve explicit construction of negative samples. You can find examples of their use in the root folder.
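
As a sketch of applying an objective in the local-local mode (the constructor argument tau and the call convention below are assumptions; check GCL.losses for the exact signatures):

import torch
import GCL.losses as L

# InfoNCE objective with an assumed temperature argument tau.
loss_fn = L.InfoNCELoss(tau=0.2)

# Node embeddings of two congruent views; aligned rows form positive pairs.
z1 = torch.randn(32, 16, requires_grad=True)
z2 = torch.randn(32, 16, requires_grad=True)

loss = loss_fn(z1, z2)  # assumed call convention
loss.backward()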

Negative Mining Strategies

In GCL.losses, PyGCL further implements four negative mining strategies that are built upon the InfoNCE contrastive objective (see the usage sketch after the table):

Hard negative mining strategy              Class name
Hard negative mixing                       HardMixingLoss
Conditional negative sampling              RingLoss
Debiased contrastive objective             InfoNCELoss(debiased_nt_xent_loss)
Hardness-biased negative sampling          InfoNCELoss(hardness_nt_xent_loss)
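
For example, the table above suggests that the last two strategies are selected by passing the variant function to InfoNCELoss; a hedged sketch follows (the exact import path of debiased_nt_xent_loss is an assumption):

import GCL.losses as L

# Swap the plain NT-Xent term inside InfoNCE for its debiased counterpart.
loss_fn = L.InfoNCELoss(L.debiased_nt_xent_loss)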

Utilities

PyGCL provides various utilities for data loading, model training, and experiment execution.

In GCL.util you can use the following utilities:

  • split_dataset: splits the dataset into train/test/validation sets according to public or random splits. Currently, four split modes are supported: [rand, ogb, wikics, preload]; see the sketch after this list.
  • seed_everything: manually sets the seed for NumPy and PyTorch to ensure better reproducibility.
  • SimpleParam: provides a simple parameter configuration class to manage parameters from microsoft-nni, JSON, and YAML files.
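
A minimal usage sketch (the GCL.util import path is taken from above; the split_dataset keyword names and return order are assumptions):

from torch_geometric.datasets import WikiCS
from GCL.util import seed_everything, split_dataset

seed_everything(39)  # fix the NumPy and PyTorch seeds for reproducibility

dataset = WikiCS(root='datasets/WikiCS')

# 'rand' is one of the documented split modes; the ratios are illustrative.
train_set, test_set, valid_set = split_dataset(
    dataset, split_mode='rand', train_ratio=0.1, test_ratio=0.8)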

We also implement two downstream classifiers, LR_classification and SVM_classification, in GCL.eval, based on PyTorch and Scikit-learn respectively.

Moreover, based on PyTorch Geometric, we provide functions for loading common node and graph datasets. You can use load_node_dataset and load_graph_dataset in utils.py.
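
Putting the pieces together, here is a rough end-to-end evaluation sketch (the argument lists of load_node_dataset and LR_classification below are assumptions, not the exact API):

import torch
from utils import load_node_dataset
from GCL.eval import LR_classification

dataset = load_node_dataset('datasets', 'WikiCS')  # hypothetical arguments

# Stand-in embeddings; in practice these come from a trained GCL encoder.
z = torch.randn(dataset[0].num_nodes, 256)

result = LR_classification(z, dataset)  # assumed call convention
print(result)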
