Statistical tests for the sequential locality of graphs

Last update: Nov 23, 2021

Related tags

Overview

Statistical tests for the sequential locality of graphs

You can assess the statistical significance of the sequential locality of an adjacency matrix (graph + vertex sequence) using sequential_locality.py.

This file also includes ORGM.py that generates an instance of the ordered random graph model (ORGM) [1] and spectral.py that yields an optimized vertex sequence based on the spectral ordering algorithms.

Please find Ref. [1] for the details of the statistical tests.

sequential_locality.py

sequential_locality.py executes statistical tests with respect to the sequential locality.

Simple example

import numpy as np
import igraph
import sequential_locality as seq

s = seq.SequentialLocality(
		g = igraph.Graph.Erdos_Renyi(n=20,m=80), 
		sequence = np.arange(20)
		)
s.H1()

{'H1': 1.0375,
 'z1': 0.5123475382979811,
 'H1 p-value (ER/ORGM)': 0.6957960998835012,
 'H1 p-value (random)': 0.7438939644617626,
 'bandwidth_opt': None}

Please find Demo.ipynb for more examples.

SequentialLocality

This is a class to be instantiated to assess the sequential locality.

Input parameters

Either g or edgelist must be provided as an input.

Parameter	Value	Default	Description
g	graph	None	Graph (undirected, unweighted, no self-loops) in igraph or graph-tool.
edgelist	list of tuples	None	Edgelist as a list of tuples.
sequence	1-dim array	None	Array (list or ndarray) indicating the vertex ordering. If provided, the vertex indices in the graph will be replaced based on `sequence` . If `sequence` is `None`, the intrinsic vertex indices in the graph or edgelist will be used as the `sequence` .
format	`'igraph'` or `'graph-tool'`	`'igraph'`	Input graph format
simple	Boolean	True	If `True`, the graph is assumed to be a simple graph, otherwise the graph is assumed to be a multigraph.

H1

This is a method that returns H₁ and z₁ test statistics and p-values of the input data.

Input parameters

Parameter	Value	Default	Description
random_sequence	`'analytical'` or `'empirical'`	`'analytical'`	If `'analytical'` is selected, the p-value based on the normal approximation will be returned for the test of vertex sequence `H1 p-value (random)`. If `'empirical'` is selected, the p-value based on random sequences specified by `samples` will be returned.
n_samples	Integer	10,000	Number of samples to be drawn as a set of random sequences. This is used only when `random_sequence = 'empirical'`.
in_envelope	Boolean	`False`	If `False`, the p-value based on the ER model will be returned. If `True`, the p-value based on the ORGM will be returned. That is, the matrix elements outside of the bandwidth `r` will be ignored.
r	Integer	None	An integer between `1` and `N-1`. If provided, `r` will be used as the bandwidth when `in_envelope=True`.

Output parameters

Parameter	Description
H1	H₁ test statistic of the input data (graph & vertex sequence)
z1	z₁ test statistic of the input data
H1 p-value (ER/ORGM)	p-value under the null hypothesis of the ER random graph (when `in_envelope=False`) or the ORGM (when `in_envelope=True`).
H1 p-value (random)	p-value under the null hypothesis of random sequences
bandwidth_opt	Maximum likelihood estimate (MLE) of the bandwidth (when `r=None` in the input) or the input bandwidth `r`

HG

This is a method that returns H_G and z_G test statistics and p-values of the input data.

There is no in_envelope option for the test based on H_G.
random_sequence = 'analytical' can be computationally demanding.

Input parameters

Parameter	Value	Default	Description
random_sequence	`'analytical'` or `'empirical'`	`'empirical'`	If `'analytical'` is selected, the p-value based on the normal approximation will be returned for the test of vertex sequence `H1 p-value (random)`. If `'empirical'` is selected, the p-value based on random sequences specified by `samples` will be returned.
n_samples	Integer	10,000	Number of samples to be drawn as a set of random sequences. This is used only when `random_sequence = 'empirical'`.

Output parameters

Parameter	Description
HG	H_G test statistic of the input data (graph & vertex sequence)
zG	z_G test statistic of the input data
HG p-value (ER)	p-value under the null hypothesis of the ER random graph.
HG p-value (random)	p-value under the null hypothesis of random sequences

ORGM.py

ORGM.py is a random graph generator. It generates an ORGM [1] instance that has a desired strength of sequentially lcoal structure.

Simple example

import ORGM as orgm

edgelist, valid = orgm.ORGM(
	N=20, M=80, bandwidth=10, epsilon=0.25
	)

Input parameters

Parameter	Value	Default	Description
N	Integer	required input	Number of vertices
M	Integer	required input	Number of edges
bandwidth	Integer	required input	Bandwidth of the ORGM
epsilon	Float (in [0,1])	required input	Density ratio between the adjacency matrix elements inside & outside of the envelope. When `epsilon=1`, the ORGM becomes a uniform model. When `epsilon=0`, the nonzero matrix elements are strictly confined in the envelope.
simple	Boolean	`True`	If `True`, the graph is constrained to be simple. If `False`, the graph is allowed to have multiedges.

spectral.py

spectral.py is an implementation of the spectral ordering [2].

Simple example

import graph_tool.all as gt
import spectral

g_real = gt.collection.ns['karate/77']
inferred_sequence = spectral.spectral_sequence(
	g= g_real, 
	format='graph-tool'
	)

Parameter	Value	Default	Description
g	graph	required input	graph (undirected, unweighted, no self-loops) in igraph or graph-tool
normalized	Boolean	`True`	Normalized Laplacian (`True`) vs unnormalized (combinatorial) Laplacian (`False`)
format	`'igraph'` or `'graph-tool'`	`'igraph'`	Input graph format

Citation

Please use Ref. [1] for the citation of the present code.

References

[1] Tatsuro Kawamoto and Teruyoshi Kobayashi, "Sequential locality of graphs and its hypothesis testing," arXiv:2111.11267 (2021).
[2] Chris Ding and Xiaofeng He, "Linearized Cluster Assignment via Spectral Ordering," Proceedings of the Twenty-First International Conference on Machine Learning (ICML) (2004).

Statistical tests for the sequential locality of graphs

Related tags

Overview

Statistical tests for the sequential locality of graphs

sequential_locality.py

Simple example

SequentialLocality

Input parameters

H1

Input parameters

Output parameters

HG

Input parameters

Output parameters

ORGM.py

Simple example

Input parameters

spectral.py

Simple example

Citation

References

Owner

Travel through time in your tests.

Faker is a Python package that generates fake data for you.

pytest plugin for a better developer experience when working with the PyTorch test suite

Enabling easy statistical significance testing for deep neural networks.

Mimesis is a high-performance fake data generator for Python, which provides data for a variety of purposes in a variety of languages.

FFPuppet is a Python module that automates browser process related tasks to aid in fuzzing

Auto Click by pyautogui and excel operations.

A Python Selenium library inspired by the Testing Library

Pynguin, The PYthoN General UnIt Test geNerator is a test-generation tool for Python

RAT-el is an open source penetration test tool that allows you to take control of a windows machine.

Silky smooth profiling for Django

Whatsapp messages bulk sender using Python Selenium.

Based on the selenium automatic test framework of python, the program crawls the score information of the educational administration system of a unive

This repository contains a set of benchmarks of different implementations of Parquet (storage format) <-> Arrow (in-memory format).

UUM Merit Form Filler is a web automation which helps automate entering a matric number to the UUM system in order for participants to obtain a merit

Selenium Manager

Hypothesis is a powerful, flexible, and easy to use library for property-based testing.

Repository for JIDA SNP Browser Web Application: Local Deployment

Python Webscraping using Selenium

The Good Old Days. | Testing Out A New Module-