Bayesian inference for Permuton-induced Chinese Restaurant Process (NeurIPS2021).

Overview

Permuton-induced Chinese Restaurant Process

animationMCMCepinions

Note: Currently only the Matlab version is available, but a Python version will be available soon!

This is a demo code for Bayesian nonparametric relational data analysis based on Permuton-induced Chinese Restaurant Process (NeurIPS, 2021). The key features are listed as follows:

  • Clustering based on rectangular partitioning: For an input matrix, the algorithm probabilistically searches for the row and column order and rectangular partitioning so that similar elements are clustered in each block as much as possible.
  • Infinite model complexity: There is no need to fix the suitable number of rectangle clusters in advance, which is a fundamental principle of Bayesian nonparametric machine learning.
  • Arbitrary rectangular partitioning: It can potentially obtain a posterior distribution on arbitrary rectangular partitioning with any numbers of rectangle blocks.
  • Empirically faster mixing of Markov chain Monte Carlo (MCMC) iterations: The method most closely related to this algorithm is the Baxter Permutation Process (NeurIPS, 2020). Typically, this algorithm seems to be able to mix MCMC faster than the Baxter permutation process empirically.

You will need a basic MATLAB installation with Statistics and Machine Learning Toolbox.

In a nutshell

  1. cd permuton-induced-crp
  2. run

Then, the MCMC evolution will appear like the gif animation at the top of this page. The following two items are particularly noteworthy.

  • Top center: Probabilistic rectangular partitioning of a sample matrix (irmdata\sampledata.mat ).
  • Bottom right: Posterior probability.

Interpretation of analysis results

model

The details of the visualization that will be drawn while running the MCMC iterations require additional explanation of our model. Please refer to the paper for more details. Our model, an extension of the Chinese Restaurant Process (CRP), consists of a generative probabilistic model as shown in the figure above (taken from the original paper). While the standard CRP achieves sequence clustering by the analogy of placing customers (data) on tables (clusters), our model additionally achieves array clustering by giving the random table coordinates on [0,1]x[0,1] drawn from the permuton. By viewing the table coordinates as a geometric representation of a permutation, we can use the permutation-to-rectangulation transformation to obtain a rectangular partition of the matrix.

  • Bottom center: Random coordinates of the CRP tables on [0,1]x[0,1]. The size of each table (circle) reflects the number of customers sitting at that table.
  • Top left: Diagonal rectangulation corresponding to the permutation represented by the table coordinates.
  • Bottom left: Generic rectangulation corresponding to the permutation represented by the table coordinates.

Details of usage

Given an input relational matrix, the Permuton-induced Chinese Restaurant Process can be fitted to it by a MCMC inference algorithm as follows:

[RowTable, ColumnTable, TableCoordinates, nesw] = test_MCMC_PCRP(X);

or

[RowTable, ColumnTable, TableCoordinates, nesw] = test_MCMC_PCRP(X, opt);

  • X: An M by N input observation matrix. Each element must be natural numbers.
  • opt.maxiter: Maximum number of MCMC iterations.
  • opt.missingRatio: Ratio of test/(training+test) for prediction performance evaluation based on perplexity.

Reference

  1. M. Nakano, Yasuhiro Fujiwara, A. Kimura, T. Yamada, and N. Ueda, 'Permuton-induced Chinese Restaurant Process,' Advances in Neural Information Processing Systems 34 (NeurIPS 2021).

    @inproceedings{Nakano2021,
     author = {Nakano, Masahiro and Fujiwara, Yasuhiro and Kimura, Akisato and Yamada, Takeshi and Ueda, Naonori},
     booktitle = {Advances in Neural Information Processing Systems},
     pages = {},
     publisher = {Curran Associates, Inc.},
     title = {Permuton-induced Chinese Restaurant Process},
     url = {},
     volume = {34},
     year = {2021}
    }
    
Owner
NTT Communication Science Laboratories
NTT Communication Science Laboratories
Implementation of Neonatal Seizure Detection using EEG signals for deploying on edge devices including Raspberry Pi.

NeonatalSeizureDetection Description Link: https://arxiv.org/abs/2111.15569 Citation: @misc{nagarajan2021scalable, title={Scalable Machine Learn

Vishal Nagarajan 11 Nov 08, 2022
Code for the paper Open Sesame: Getting Inside BERT's Linguistic Knowledge.

Open Sesame This repository contains the code for the paper Open Sesame: Getting Inside BERT's Linguistic Knowledge. Credits We built the project on t

9 Jul 24, 2022
Several simple examples for popular neural network toolkits calling custom CUDA operators.

Neural Network CUDA Example Several simple examples for neural network toolkits (PyTorch, TensorFlow, etc.) calling custom CUDA operators. We provide

WeiYang 798 Jan 01, 2023
A Python library for Deep Probabilistic Modeling

Abstract DeeProb-kit is a Python library that implements deep probabilistic models such as various kinds of Sum-Product Networks, Normalizing Flows an

DeeProb-org 46 Dec 26, 2022
Deep Federated Learning for Autonomous Driving

FADNet: Deep Federated Learning for Autonomous Driving Abstract Autonomous driving is an active research topic in both academia and industry. However,

AIOZ AI 12 Dec 01, 2022
Bayesian algorithm execution (BAX)

Bayesian Algorithm Execution (BAX) Code for the paper: Bayesian Algorithm Execution: Estimating Computable Properties of Black-box Functions Using Mut

Willie Neiswanger 38 Dec 08, 2022
Official code for "InfoGraph: Unsupervised and Semi-supervised Graph-Level Representation Learning via Mutual Information Maximization" (ICLR 2020, spotlight)

InfoGraph: Unsupervised and Semi-supervised Graph-Level Representation Learning via Mutual Information Maximization Authors: Fan-yun Sun, Jordan Hoffm

Fan-Yun Sun 232 Dec 28, 2022
The world's simplest facial recognition api for Python and the command line

Face Recognition You can also read a translated version of this file in Chinese 简体中文版 or in Korean 한국어 or in Japanese 日本語. Recognize and manipulate fa

Adam Geitgey 46.9k Jan 03, 2023
MonoScene: Monocular 3D Semantic Scene Completion

MonoScene: Monocular 3D Semantic Scene Completion MonoScene: Monocular 3D Semantic Scene Completion] [arXiv + supp] | [Project page] Anh-Quan Cao, Rao

298 Jan 08, 2023
CurriculumNet: Weakly Supervised Learning from Large-Scale Web Images

CurriculumNet Introduction This repo contains related code and models from the ECCV 2018 CurriculumNet paper. CurriculumNet is a new training strategy

156 Jul 04, 2022
OpenCVのGrabCut()を利用したセマンティックセグメンテーション向けアノテーションツール(Annotation tool using GrabCut() of OpenCV. It can be used to create datasets for semantic segmentation.)

[Japanese/English] GrabCut-Annotation-Tool GrabCut-Annotation-Tool.mp4 OpenCVのGrabCut()を利用したアノテーションツールです。 セマンティックセグメンテーション向けのデータセット作成にご使用いただけます。 ※Grab

KazuhitoTakahashi 30 Nov 18, 2022
Code for "Graph-Evolving Meta-Learning for Low-Resource Medical Dialogue Generation". [AAAI 2021]

Graph Evolving Meta-Learning for Low-resource Medical Dialogue Generation Code to be further cleaned... This repo contains the code of the following p

Shuai Lin 29 Nov 01, 2022
NAACL2021 - COIL Contextualized Lexical Retriever

COIL Repo for our NAACL paper, COIL: Revisit Exact Lexical Match in Information Retrieval with Contextualized Inverted List. The code covers learning

Luyu Gao 108 Dec 31, 2022
A Simple LSTM-Based Solution for "Heartbeat Signal Classification and Prediction" in Tianchi

LSTM-Time-Series-Prediction A Simple LSTM-Based Solution for "Heartbeat Signal Classification and Prediction" in Tianchi Contest. The Link of the Cont

KevinCHEN 1 Jun 13, 2022
Datasets and pretrained Models for StyleGAN3 ...

Datasets and pretrained Models for StyleGAN3 ... Dear arfiticial friend, this is a collection of artistic datasets and models that we have put togethe

lucid layers 34 Oct 06, 2022
Einshape: DSL-based reshaping library for JAX and other frameworks.

Einshape: DSL-based reshaping library for JAX and other frameworks. The jnp.einsum op provides a DSL-based unified interface to matmul and tensordot o

DeepMind 62 Nov 30, 2022
Bridging the Gap between Label- and Reference based Synthesis(ICCV 2021)

Bridging the Gap between Label- and Reference based Synthesis(ICCV 2021) Tensorflow implementation of Bridging the Gap between Label- and Reference-ba

huangqiusheng 8 Jul 13, 2022
Multivariate Time Series Forecasting with efficient Transformers. Code for the paper "Long-Range Transformers for Dynamic Spatiotemporal Forecasting."

Spacetimeformer Multivariate Forecasting This repository contains the code for the paper, "Long-Range Transformers for Dynamic Spatiotemporal Forecast

QData 440 Jan 02, 2023
Problem-943.-ACMP - Problem 943. ACMP

Problem-943.-ACMP В "main.py" расположен вариант моего решения задачи 943 с серв

Konstantin Dyomshin 2 Aug 19, 2022
A toolset of Python programs for signal modeling and indentification via sparse semilinear autoregressors.

SPAAR Description A toolset of Python programs for signal modeling via sparse semilinear autoregressors. References Vides, F. (2021). Computing Semili

Fredy Vides 0 Oct 30, 2021