Awesome Deep Graph Clustering is a collection of SOTA, novel deep graph clustering methods

Overview

ADGC: Awesome Deep Graph Clustering

ADGC is a collection of state-of-the-art (SOTA), novel deep graph clustering methods (papers, codes and datasets). Any other interesting papers and codes are welcome. Any problems, please contact [email protected].

Made with Python GitHub stars GitHub forks visitors


What's Deep Graph Clustering?

Deep graph clustering, which aims to reveal the underlying graph structure and divide the nodes into different groups, has attracted intensive attention in recent years.

Important Survey Papers

Papers

  1. K-Means: "Algorithm AS 136: A k-means clustering algorithm" [pdf|code]
  2. DCN (ICML17): "Towards k-means-friendly spaces: Simultaneous deep learning and clustering" [pdf|code]
  3. DEC (ICML16): "Unsupervised Deep Embedding for Clustering Analysis" [pdf|code]
  4. IDEC (IJCAI17): "Improved Deep Embedded Clustering with Local Structure Preservation" [pdf|code]
  5. GAE/VGAE : "Variational Graph Auto-Encoders" [pdf|code]
  6. DAEGC (IJCAI19): "Attributed Graph Clustering: A Deep Attentional Embedding Approach" [pdf|code]
  7. ARGA/ARVGA (TCYB19): "Learning Graph Embedding with Adversarial Training Methods" [pdf|code]
  8. SDCN/SDCN_Q (WWW20): "Structural Deep Clustering Network" [pdf|code]
  9. DFCN (AAAI21): "Deep Fusion Clustering Network" [pdf|code]
  10. MVGRL (ICML20): "Contrastive Multi-View Representation Learning on Graphs" [pdf|code]

Benchmark Datasets

We divide the datasets into two categories, i.e. graph datasets and non-graph datasets. Graph datasets are some graphs in real-world, such as citation networks, social networks and so on. Non-graph datasets are NOT graph type. However, if necessary, we could construct "adjacency matrices" by K-Nearest Neighbors (KNN) algorithm.

Quick Start

  • Step1: Download all datasets from [Google Drive|Baidu Netdisk]. Optionally, download some of them from URLs in the tables (Google Drive)

  • Step2: Unzip them to ./dataset/

  • Step3: Run the ./dataset/utils.py

    Two functions load_graph_data and load_data are provided in ./dataset/utils.py to load graph datasets and non-graph datasets, respectively.

Datasets Details

  1. Graph Datasets

    Dataset Samples Dimension Edges Classes URL
    DBLP 4057 334 3528 4 dblp.zip
    CITE 3327 3703 4552 6 cite.zip
    ACM 3025 1870 13128 3 acm.zip
    AMAP 7650 745 119081 8 amap.zip
    AMAC 13752 767 245861 10 amac.zip
    PUBMED 19717 500 44325 3 pubmed.zip
    CORAFULL 19793 8710 63421 70 corafull.zip
    CORA 2708 1433 6632 7 cora.zip
    CITESEER 3327 3703 6215 6 citeseer.zip
  2. Non-graph Datasets

    Dataset Samples Dimension Type Classes URL
    USPS 9298 256 Image 10 usps.zip
    HHAR 10299 561 Record 6 hhar.zip
    REUT 10000 2000 Text 4 reut.zip

If you find this repository useful to your research or work, it is really appreciate to star this repository.​ ❤️

Owner
yueliu1999
Yue Liu is pursuing his master degree in College of Computer, NUDT. His current research interests include GNN, deep clustering and self-supervised learning.
yueliu1999
DARTS-: Robustly Stepping out of Performance Collapse Without Indicators

[ICLR'21] DARTS-: Robustly Stepping out of Performance Collapse Without Indicators [openreview] Authors: Xiangxiang Chu, Xiaoxing Wang, Bo Zhang, Shun

55 Nov 01, 2022
Source code for our paper "Do Not Trust Prediction Scores for Membership Inference Attacks"

Do Not Trust Prediction Scores for Membership Inference Attacks Abstract: Membership inference attacks (MIAs) aim to determine whether a specific samp

<a href=[email protected]"> 3 Oct 25, 2022
Continual learning with sketched Jacobian approximations

Continual learning with sketched Jacobian approximations This repository contains the code for reproducing figures and results in the paper ``Provable

Machine Learning and Information Processing Laboratory 1 Jun 30, 2022
Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval (NeurIPS'21)

Baleen Baleen is a state-of-the-art model for multi-hop reasoning, enabling scalable multi-hop search over massive collections for knowledge-intensive

Stanford Future Data Systems 22 Dec 05, 2022
Code for Multimodal Neural SLAM for Interactive Instruction Following

Code for Multimodal Neural SLAM for Interactive Instruction Following Code structure The code is adapted from E.T. and most training as well as data p

7 Dec 07, 2022
Re-implememtation of MAE (Masked Autoencoders Are Scalable Vision Learners) using PyTorch.

mae-repo PyTorch re-implememtation of "masked autoencoders are scalable vision learners". In this repo, it heavily borrows codes from codebase https:/

Peng Qiao 1 Dec 14, 2021
Neural Scene Flow Fields using pytorch-lightning, with potential improvements

nsff_pl Neural Scene Flow Fields using pytorch-lightning. This repo reimplements the NSFF idea, but modifies several operations based on observation o

AI葵 178 Dec 21, 2022
Repo for Photon-Starved Scene Inference using Single Photon Cameras, ICCV 2021

Photon-Starved Scene Inference using Single Photon Cameras ICCV 2021 Arxiv Project Video Bhavya Goyal, Mohit Gupta University of Wisconsin-Madison Abs

Bhavya Goyal 5 Nov 15, 2022
Rank 3 : Source code for OPPO 6G Data Generation Challenge

OPPO 6G Data Generation with an E2E Framework Homepage of OPPO 6G Data Generation Challenge Datasets H1_32T4R.mat H2_32T4R.mat Please put the original

Sen Pei 97 Jan 07, 2023
A simple, fully convolutional model for real-time instance segmentation.

You Only Look At CoefficienTs ██╗ ██╗ ██████╗ ██╗ █████╗ ██████╗████████╗ ╚██╗ ██╔╝██╔═══██╗██║ ██╔══██╗██╔════╝╚══██╔══╝ ╚██

Daniel Bolya 4.6k Dec 30, 2022
MADE (Masked Autoencoder Density Estimation) implementation in PyTorch

pytorch-made This code is an implementation of "Masked AutoEncoder for Density Estimation" by Germain et al., 2015. The core idea is that you can turn

Andrej 498 Dec 30, 2022
Official Implementation of CVPR 2022 paper: "Mimicking the Oracle: An Initial Phase Decorrelation Approach for Class Incremental Learning"

(CVPR 2022) Mimicking the Oracle: An Initial Phase Decorrelation Approach for Class Incremental Learning ArXiv This repo contains Official Implementat

Yujun Shi 24 Nov 01, 2022
This project provides the proof of the uniqueness of the equilibrium and the global asymptotic stability.

Delayed-cellular-neural-network This project provides the proof of the uniqueness of the equilibrium and the global asymptotic stability. There is als

4 Apr 28, 2022
Ranger - a synergistic optimizer using RAdam (Rectified Adam), Gradient Centralization and LookAhead in one codebase

Ranger-Deep-Learning-Optimizer Ranger - a synergistic optimizer combining RAdam (Rectified Adam) and LookAhead, and now GC (gradient centralization) i

Less Wright 1.1k Dec 21, 2022
Goal of the project : Detecting Temporal Boundaries in Sign Language videos

MVA RecVis course final project : Goal of the project : Detecting Temporal Boundaries in Sign Language videos. Sign language automatic indexing is an

Loubna Ben Allal 6 Dec 21, 2022
Official PyTorch Implementation of paper "NeLF: Neural Light-transport Field for Single Portrait View Synthesis and Relighting", EGSR 2021.

NeLF: Neural Light-transport Field for Single Portrait View Synthesis and Relighting Official PyTorch Implementation of paper "NeLF: Neural Light-tran

Ken Lin 38 Dec 26, 2022
A TikTok-like recommender system for GitHub repositories based on Gorse

GitRec GitRec is the missing recommender system for GitHub repositories based on Gorse. Architecture The trending crawler crawls trending repositories

337 Jan 04, 2023
FMA: A Dataset For Music Analysis

FMA: A Dataset For Music Analysis Michaël Defferrard, Kirell Benzi, Pierre Vandergheynst, Xavier Bresson. International Society for Music Information

Michaël Defferrard 1.8k Dec 29, 2022
Streaming Anomaly Detection Framework in Python (Outlier Detection for Streaming Data)

Python Streaming Anomaly Detection (PySAD) PySAD is an open-source python framework for anomaly detection on streaming multivariate data. Documentatio

Selim Firat Yilmaz 181 Dec 18, 2022
Library for 8-bit optimizers and quantization routines.

bitsandbytes Bitsandbytes is a lightweight wrapper around CUDA custom functions, in particular 8-bit optimizers and quantization functions. Paper -- V

Facebook Research 687 Jan 04, 2023