constructing maps of intellectual influence from publication data

Overview

Influencemap Project @ ANU

Influence in the academic communities has been an area of interest for researchers. This can be seen in the popularity of applications like google scholar and the various metrics created for ranking papers, authors, conferences, etc.

We aim to provide a visualisation tool which allows users to easily search and visualise the flow of academic influence. Our visualisation maps influence in the form of an influence flower. We calculate influence as a function of the number of citations between two entities (look below for information on our definition of influence).

The node in the centre of the flower denotes the ego entity, the entitiy in which we are looking at influence with respect to. The leaf nodes are the most influential entities with respect to the ego. (We define the ego as a collection of papers. If it is an author, it is the collection of papers that the author has authored)

Each of the edges of the graph signifies the flow of influence to and from the ego node, the strength of this relation is reflected in the thickness of the edge. The red edges denote the influence the ego has towards the outer entities (an outer entity citing a paper by the ego). The blue edges denote the influence the outer entities have towards the ego (the ego cites a paper by one of the outer entities).

The colour of the outer nodes signifies the ratio of influence in and out. A blue node indicates that the associated entity has influenced the ego more than the ego has influenced itself. Likewise, a red node indicates the ego has influenced the node's entity more than it has influenced the ego.

We define two entities to be coauthors if the entities have contributed to the same paper. Coauthors of the ego are signified by nodes with greyed out names.

Data

We use the microsoft academic graph (MAG) dataset for our visualisation. The dataset is a large curation of publication indexed by Bing. From MAG, we use the following fields of the paper entries in the dataset,

  • Citation links
  • Authors
  • Conferences
  • Journals
  • Author Affiliations

Influence

To quantify academic influence, we define influence as a function of paper citations. Each citation which the ego is apart of contributes to the overall influence map of an ego. To prevent papers with a large number of entities contributing from creating an overwhelming amount of influence, we normalise the influence contribution by the number of entities in the cited paper.

For example, consider the following four paper database where we only consider entities which are authors.

Name Paper no. authors cites papers
John Smith Algorithms 2 [Linear Algebra]
John Smith Machine Learning 3 [Linear Algebra, Computation]
Maria Garcia Linear Algebra 2 None
Maria Garcia Computation 4 [Algorithms]

In this case John's influence on Maria is 0.5 (John's paper Algorithm's has a weight of 0.5 and was cited once by Maria).

On the other hand Maria's influence on John is 1.25 (Linear Algebra has a weight of 0.5 and it was cited twice by John, Computation has a weight of 0.25 and was cited once by John).

We aggregate the pairwise influence of entities associated with the papers of the ego to generate the nodes of a flower. Each flowers' outer nodes can be a collection of several types of entities. In our influence flower application, we present 4 different flower types:

  1. Author outer nodes
  2. Venue (conferences or journals) outer nodes
  3. Author Affiliation outer nodes
  4. Paper topic outer nodes

Filtering self-citations

We define a self-citation between papers and a cited paper as a relation dependent on the ego. A paper citation is a self-citation if both papers have the ego as an author (a venue, an institution, or a topic).

Filtering co-contributors

The Influence Flower is able to capture less obvious influence outside of one’s co-author networks with the filtering. We define two entities to be co-contributors if the entities have contributed to the same paper. For the venue type entity, co-contribution indicates if the ego has published a paper to the venue. For the topic type entity, it means that the ego has written a paper of the topic. Co-contributors of the ego are indicated by nodes with greyed out names.

Other candidate definitions of influence

We have described influence as the sum of citations from one person (or venue or affiliation) to another, weighted by the number of authors in the cited paper. Similar methods were considered early on in the project which included combinations of different weighting schemes. We looked at the eight combinations of three mutually exclusive weightings:

  1. Weighting by the number of authors on the citing paper;
  2. Weighting by the number of authors on the cited paper; and
  3. Weighting by the number of papers referenced by the citing paper.

Due to the lack of a ground truth value of influence to compare these definitions to, we evaluated the eight combinations of these weightings empirically by discussing with researchers which of the definitions produced flowers that most accurately reflected their opinions of who they have influenced and been influenced by.

Other definitions of influence which have not yet been explored with this data include existing measures for node centrality in graphs. By using citation data from MAG to define a directed graph where nodes represent authors, venues or affiliations, and edges are derived from citations between nodes, we could explore using metrics such as closeness, betweenness and eigenvector centrality. These metrics are more appropriate for defining the influence of an entity relative to the whole network.

Owner
CS Metrics
CS Metrics
A Python-based development platform for automated trading systems - from backtesting to optimisation to livetrading.

AutoTrader AutoTrader is Python-based platform intended to help in the development, optimisation and deployment of automated trading systems. From sim

Kieran Mackle 485 Jan 09, 2023
Implementation of CVPR'21: RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction

RfD-Net [Project Page] [Paper] [Video] RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction Yinyu Nie, Ji Hou, Xiaoguang Han, Matthi

Yinyu Nie 162 Jan 06, 2023
This project intends to use SVM supervised learning to determine whether or not an individual is diabetic given certain attributes.

Diabetes Prediction Using SVM I explore a diabetes prediction algorithm using a Diabetes dataset. Using a Support Vector Machine for my prediction alg

Jeff Shen 1 Jan 14, 2022
BLEND: A Fast, Memory-Efficient, and Accurate Mechanism to Find Fuzzy Seed Matches

BLEND is a mechanism that can efficiently find fuzzy seed matches between sequences to significantly improve the performance and accuracy while reducing the memory space usage of two important applic

SAFARI Research Group at ETH Zurich and Carnegie Mellon University 19 Dec 26, 2022
The official repository for "Intermediate Layers Matter in Momentum Contrastive Self Supervised Learning" paper.

Intermdiate layer matters - SSL The official repository for "Intermediate Layers Matter in Momentum Contrastive Self Supervised Learning" paper. Downl

Aakash Kaku 35 Sep 19, 2022
Official Python implementation of the 'Sparse deconvolution'-v0.3.0

Sparse deconvolution Python v0.3.0 Official Python implementation of the 'Sparse deconvolution', and the CPU (NumPy) and GPU (CuPy) calculation backen

Weisong Zhao 23 Dec 28, 2022
HGCAE Pytorch implementation. CVPR2021 accepted.

Hyperbolic Graph Convolutional Auto-Encoders Accepted to CVPR2021 🎉 Official PyTorch code of Unsupervised Hyperbolic Representation Learning via Mess

Junho Cho 37 Nov 13, 2022
Digital Twin Mobility Profiling: A Spatio-Temporal Graph Learning Approach

Digital Twin Mobility Profiling: A Spatio-Temporal Graph Learning Approach This is the implementation of traffic prediction code in DTMP based on PyTo

chenxin 1 Dec 19, 2021
Network Pruning That Matters: A Case Study on Retraining Variants (ICLR 2021)

Network Pruning That Matters: A Case Study on Retraining Variants (ICLR 2021)

Duong H. Le 18 Jun 13, 2022
Code for "Human Pose Regression with Residual Log-likelihood Estimation", ICCV 2021 Oral

Human Pose Regression with Residual Log-likelihood Estimation [Paper] [arXiv] [Project Page] Human Pose Regression with Residual Log-likelihood Estima

JeffLi 347 Dec 24, 2022
[NeurIPS 2021] Large Scale Learning on Non-Homophilous Graphs: New Benchmarks and Strong Simple Methods

Large Scale Learning on Non-Homophilous Graphs: New Benchmarks and Strong Simple Methods Large Scale Learning on Non-Homophilous Graphs: New Benchmark

60 Jan 03, 2023
StarGAN - Official PyTorch Implementation (CVPR 2018)

StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation

Yunjey Choi 5.1k Dec 30, 2022
Semi-supervised Stance Detection of Tweets Via Distant Network Supervision

SANDS This is an annonymous repository containing code and data necessary to reproduce the results published in "Semi-supervised Stance Detection of T

2 Sep 22, 2022
A PyTorch implementation of "SelfGNN: Self-supervised Graph Neural Networks without explicit negative sampling"

SelfGNN A PyTorch implementation of "SelfGNN: Self-supervised Graph Neural Networks without explicit negative sampling" paper, which will appear in Th

Zekarias Tilahun 24 Jun 21, 2022
Open-sourcing the Slates Dataset for recommender systems research

FINN.no Recommender Systems Slate Dataset This repository accompany the paper "Dynamic Slate Recommendation with Gated Recurrent Units and Thompson Sa

FINN.no 48 Nov 28, 2022
This framework implements the data poisoning method found in the paper Adversarial Examples Make Strong Poisons

Adversarial poison generation and evaluation. This framework implements the data poisoning method found in the paper Adversarial Examples Make Strong

31 Nov 01, 2022
A Pytorch implementation of "Splitter: Learning Node Representations that Capture Multiple Social Contexts" (WWW 2019).

Splitter ⠀⠀ A PyTorch implementation of Splitter: Learning Node Representations that Capture Multiple Social Contexts (WWW 2019). Abstract Recent inte

Benedek Rozemberczki 201 Nov 09, 2022
A list of all papers and resoureces on Semantic Segmentation

Semantic-Segmentation A list of all papers and resoureces on Semantic Segmentation. Dataset importance SemanticSegmentation_DL Some implementation of

Alan Tang 1.1k Dec 12, 2022
Adaptive Denoising Training (ADT) for Recommendation.

DenoisingRec Adaptive Denoising Training for Recommendation. This is the pytorch implementation of our paper at WSDM 2021: Denoising Implicit Feedback

Wenjie Wang 51 Dec 30, 2022
Pytorch and Torch testing code of CartoonGAN

CartoonGAN-Test-Pytorch-Torch Pytorch and Torch testing code of CartoonGAN [Chen et al., CVPR18]. With the released pretrained models by the authors,

Yijun Li 642 Dec 27, 2022