Graph Robustness Benchmark: A scalable, unified, modular, and reproducible benchmark for evaluating the adversarial robustness of Graph Machine Learning.

Overview

Graph Robustness Benchmark (GRB) provides scalable, unified, modular, and reproducible evaluation of the adversarial robustness of graph machine learning (GML) models. GRB offers carefully prepared datasets, a unified evaluation pipeline, a modular coding framework, and reproducible leaderboards. Together these facilitate the development of graph adversarial learning by summarizing existing progress and generating insights for future research.

Get Started

Installation

Install grb via pip:

pip install grb

Install grb via git:

git clone git@github.com:THUDM/grb.git
cd grb
pip install -e .

Preparation

GRB provides all necessary components to ensure the reproducibility of evaluation results. Get datasets from link or download them by running the following script:

cd ./scripts
sh download_dataset.sh

Get attack results (adversarial adjacency matrix and features) from link or download them by running the following script:

sh download_attack_results.sh

Get saved models (model weights) from link or download them by running the following script:

sh download_saved_models.sh

Usage of GRB Modules

Training a GML model

An example of training a Graph Convolutional Network (GCN) on the grb-cora dataset:

import torch  # pytorch backend
from grb.dataset import Dataset
from grb.model.torch import GCN
from grb.trainer.trainer import Trainer

# Load data
dataset = Dataset(name='grb-cora', mode='easy',
                  feat_norm='arctan')
# Build model
model = GCN(in_features=dataset.num_features,
            out_features=dataset.num_classes,
            hidden_features=[64, 64])
# Training
adam = torch.optim.Adam(model.parameters(), lr=0.01)
trainer = Trainer(dataset=dataset, optimizer=adam,
                  loss=torch.nn.functional.nll_loss)
trainer.train(model=model, n_epoch=200, dropout=0.5,
              train_mode='inductive')
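
The feat_norm='arctan' option above suggests that features are squashed into a bounded range before training. The following is a minimal sketch of such a transform, assuming the mapping x -> 2 * arctan(x) / pi; the function name arctan_normalize and the exact formula are assumptions for illustration, not confirmed details of GRB's implementation.

```python
import math


def arctan_normalize(features):
    """Map every feature value into the open interval (-1, 1).

    Assumed transform: 2 * arctan(x) / pi (hypothetical, for illustration).
    """
    return [[2.0 * math.atan(x) / math.pi for x in row] for row in features]


# Toy feature matrix: two nodes, three features each.
raw = [[0.0, 1.0, -1.0], [10.0, -100.0, 0.5]]
normalized = arctan_normalize(raw)
```

A bounded feature range of this kind is what makes limits such as feat_lim_min and feat_lim_max in the attack configuration meaningful: injected features can be constrained to stay inside the range that normal data occupies.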

Adversarial attack

An example of applying the Topological Defective Graph Injection Attack (TDGIA) to the trained GCN model:

from grb.attack.injection.tdgia import TDGIA

# Attack configuration
tdgia = TDGIA(lr=0.01, 
              n_epoch=10,
              n_inject_max=20, 
              n_edge_max=20,
              feat_lim_min=-0.9, 
              feat_lim_max=0.9,
              sequential_step=0.2)
# Apply attack
rst = tdgia.attack(model=model,
                   adj=dataset.adj,
                   features=dataset.features,
                   target_mask=dataset.test_mask)
# Get modified adj and features
adj_attack, features_attack = rst
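
After an attack, it can be useful to confirm that the result respects the configured budget. The sketch below works on plain Python lists rather than GRB's own data structures. It assumes that injected nodes are appended after the original ones, that n_edge_max is a per-injected-node edge limit, and that feat_lim_min and feat_lim_max bound every injected feature value. The helper check_injection_budget is hypothetical and not part of GRB.

```python
def check_injection_budget(adj, features, n_original,
                           n_inject_max, n_edge_max,
                           feat_lim_min, feat_lim_max):
    """Return a list of violated constraints (empty if none).

    adj is a dense symmetric 0/1 matrix (list of lists). Its first
    n_original rows are the original nodes; the rest are injected.
    """
    problems = []
    n_total = len(adj)
    if n_total - n_original > n_inject_max:
        problems.append("too many injected nodes")
    for i in range(n_original, n_total):
        # Row sum of a 0/1 adjacency matrix is the node degree.
        if sum(adj[i]) > n_edge_max:
            problems.append(f"node {i} exceeds edge budget")
        if any(not (feat_lim_min <= x <= feat_lim_max) for x in features[i]):
            problems.append(f"node {i} has out-of-range features")
    return problems


# Toy graph: three original nodes (0-2) plus one injected node (3).
adj = [[0, 1, 0, 1],
       [1, 0, 1, 1],
       [0, 1, 0, 0],
       [1, 1, 0, 0]]
features = [[0.1, 0.2], [0.0, -0.3], [0.5, 0.5], [0.9, -0.9]]

within_budget = check_injection_budget(adj, features, 3, 20, 20, -0.9, 0.9)
over_budget = check_injection_budget(adj, features, 3, 1, 1, -0.9, 0.9)
```

Here within_budget is empty, while over_budget reports the injected node because its two edges exceed a limit of one.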

GRB Evaluation

Evaluation scenario (Injection Attack)

GRB provides a unified evaluation scenario for fair comparisons between attacks and defenses. The scenario is Black-box, Evasion, Inductive, Injection. Take a citation-graph classification system as an example. The platform collects labeled data from previous papers and trains a GML model. When a batch of new papers is submitted, it updates the graph and uses the trained model to predict their labels.

  • Black-box: Neither the attacker nor the defender knows which methods the other uses.
  • Evasion: Models are already trained on trusted data (e.g. authenticated users), which is untouched by the attackers but may contain natural noise. Thus, attacks only happen during the inference phase.
  • Inductive: Models are used to classify unseen data (e.g. new users), i.e. validation and test data are unseen during training, which requires models to generalize to out-of-distribution data.
  • Injection: The attackers can only inject new nodes and cannot modify the target nodes directly. It is usually hard to hack into users' accounts and modify their profiles, but it is easier to create fake accounts and connect them to existing users.
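
The inductive and injection conditions above can be illustrated on a toy edge list. This is a minimal sketch, not GRB code: the helpers induced_edges and inject_nodes are hypothetical names. At training time only edges among training nodes are visible; at inference time the attacker appends fake nodes with new edges and leaves every existing edge untouched.

```python
def induced_edges(edges, visible_nodes):
    """Keep only edges whose two endpoints are both visible."""
    visible = set(visible_nodes)
    return [(u, v) for (u, v) in edges if u in visible and v in visible]


def inject_nodes(edges, n_nodes, links_per_fake_node):
    """Append fake nodes and their edges; existing edges are left untouched."""
    attacked = list(edges)
    for offset, targets in enumerate(links_per_fake_node):
        fake_id = n_nodes + offset
        attacked.extend((fake_id, t) for t in targets)
    return attacked, n_nodes + len(links_per_fake_node)


# Toy citation graph: papers 0-2 are training data, papers 3-4 arrive later.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]

# Inductive: the model only ever sees the training subgraph while training.
train_edges = induced_edges(edges, [0, 1, 2])

# Injection: two fake papers (ids 5 and 6) cite the unseen test papers.
attacked_edges, n_total = inject_nodes(edges, 5, [[3, 4], [4]])
```

The first four entries of attacked_edges are identical to the original edge list, which reflects the rule that target nodes and their existing connections are never modified directly.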

GRB Leaderboard

GRB maintains leaderboards that permit a fair comparison across various attacks and defenses. To ensure reproducibility, we provide all necessary information, including datasets, attack results, and saved models. All results on the leaderboards can be reproduced by running the following script (e.g. the leaderboard for the grb-cora dataset):

sh run_leaderboard_pipeline.sh -d grb-cora -g 0 -s ./leaderboard -n 0

Usage: run_leaderboard_pipeline.sh [-d <string>] [-g <int>] [-s <string>] [-n <int>]
Pipeline for reproducing leaderboard on the chosen dataset.
    -h      Display help message.
    -d      Choose a dataset.
    -s      Set a directory to save leaderboard files.
    -n      Choose the number of an attack from 0 to 9.
    -g      Choose a GPU device. -1 for CPU.
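
To run the pipeline for every attack number, the command line can be generated programmatically. The sketch below only builds the argument lists from the flags documented above; the helper leaderboard_command is a hypothetical name, and whether the script should be launched once per attack number is an assumption.

```python
import shlex


def leaderboard_command(dataset, gpu=0, save_dir="./leaderboard", attack=0):
    """Build the argument list for one pipeline run (flags as documented above)."""
    if not 0 <= attack <= 9:
        raise ValueError("attack number must be between 0 and 9")
    return ["sh", "run_leaderboard_pipeline.sh",
            "-d", dataset,
            "-g", str(gpu),
            "-s", save_dir,
            "-n", str(attack)]


# One command per attack number, 0 through 9.
commands = [leaderboard_command("grb-cora", attack=n) for n in range(10)]
first_command = shlex.join(commands[0])
```

Each list can be passed to subprocess.run from within the scripts directory; pass gpu=-1 to run on CPU.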

Submission

We welcome submissions of new methods, including attacks, defenses, and new GML models, to enrich the GRB leaderboard. Submissions should follow the GRB Evaluation Rules and be fully reproducible.

Please submit your methods via the Google Form GRB submission. Our team will verify the results within a week.

Requirements

  • scipy==1.5.2
  • numpy==1.19.1
  • torch==1.8.0
  • networkx==2.5
  • pandas~=1.2.3
  • cogdl~=0.3.0.post1
  • scikit-learn~=0.24.1

Citing GRB

Please cite our paper if you find GRB useful for your research:

@article{zheng2021grb,
  title={Graph Robustness Benchmark: Benchmarking the Adversarial Robustness of Graph Machine Learning},
  author={Zheng, Qinkai and Zou, Xu and Dong, Yuxiao and Cen, Yukuo and Yin, Da and Xu, Jiarong and Yang, Yang and Tang, Jie},
  journal={Neural Information Processing Systems Track on Datasets and Benchmarks 2021},
  year={2021}
}

Contact

In case of any problem, please contact us via email: [email protected]. We also welcome researchers to join our Google Group for further discussion on the adversarial robustness of graph machine learning.

Releases (v0.1.0)
  • v0.1.0 (Aug 5, 2021)

    The first release of Graph Robustness Benchmark (GRB).

    • API based on pure PyTorch, CogDL, and DGL.
    • Include five graph datasets of different scales.
    • Support graph injection attacks (e.g., RND, FGSM, PGD, SPEIT, TDGIA).
    • Support adversarial defenses (e.g., layer normalization, adversarial training, GNNSVD, GNNGuard).
    • Provide homepage.
    • Provide leaderboards of all datasets.
    • Provide basic documentation.
    • Provide scripts for reproducing results.
Owner
THUDM
Data Mining Research Group at Tsinghua University