This repository holds the code for the paper "Deep Conditional Gaussian Mixture Model forConstrained Clustering".

Related tags

Deep LearningDC-GMM
Overview

Deep Conditional Gaussian Mixture Model for Constrained Clustering.

This repository holds the code for the paper Deep Conditional Gaussian Mixture Model for Constrained Clustering.

Motivation

Clustering with constraints has gained significant attention in the field of constrained machine learning as it can leverage partial prior information on a growing amount of unlabelled data. Following recent advances in deep generative models, we derive a novel probabilistic approach to constrained clustering that can be trained efficiently in the framework of stochastic gradient variational Bayes. In contrast to existing approaches, our model (DC-GMM) uncovers the underlying distribution of the data conditioned on prior clustering preferences, expressed as \textit{pairwise constraints}. The inclusion of such constraints allows the user to drive the clustering process towards a desirable configuration by indicating which samples should or should not belong to the same class.

Data Download

To download Reuters data, run the following:

cd dataset/reuters

sh download_data.sh

Download STL data (Matlab files) from https://cs.stanford.edu/~acoates/stl10/. Save them in dataset/stl10/stl10_matlab. Then run the following:

cd dataset/stl10

python compute_stl_features.py

To download and configure the UTKFace datset:

Implementation

To run DC-GMM using the default setting on MNIST data set:

python main.py --pretrain True

To run DC-GMM without pairwise constraints using the default setting:

python main.py --pretrain True --num_constrains 0

To choose different configurations of the hyper-parameters:

python main.py --data ... num_constrains ... --alpha ... --lr ...

Important hyper-parameters:

  • data: choose from MNIST, fMNIST, Reuters, har, utkface
  • num_constrains: by default it should be set to 6000 (note that the total number of pairwise constraints in a dataset is O(N*N))
  • alpha: measure the confidence in your labels (default is 10000)
  • pretrain: False if you want to use your own pretrain weights

Pairwise constraints

In the current implementation, the pairwise constraints are obtained from labels by randomly sampled two data points and assigning a must-link constraint (+1) if the two samples have the same label and a cannot-link constraint (-1) otherwise. The pairwise constraints are stored in a matrix W. See the file: source/data.py

Owner
Data Science doctoral student at ETH Zürich.
An attempt at the implementation of GLOM, Geoffrey Hinton's paper for emergent part-whole hierarchies from data

GLOM TensorFlow This Python package attempts to implement GLOM in TensorFlow, which allows advances made by several different groups transformers, neu

Rishit Dagli 32 Feb 21, 2022
This python-based package offers a way of creating a parametric OpenMC plasma source from plasma parameters.

openmc-plasma-source This python-based package offers a way of creating a parametric OpenMC plasma source from plasma parameters. The OpenMC sources a

Fusion Energy 10 Oct 18, 2022
An example to implement a new backbone with OpenMMLab framework.

Backbone example on OpenMMLab framework English | 简体中文 Introduction This is an template repo about how to use OpenMMLab framework to develop a new bac

Ma Zerun 22 Dec 29, 2022
SynNet - synthetic tree generation using neural networks

SynNet This repo contains the code and analysis scripts for our amortized approach to synthetic tree generation using neural networks. Our model can s

Wenhao Gao 60 Dec 29, 2022
DilatedNet in Keras for image segmentation

Keras implementation of DilatedNet for semantic segmentation A native Keras implementation of semantic segmentation according to Multi-Scale Context A

303 Mar 15, 2022
Official PyTorch implementation of MX-Font (Multiple Heads are Better than One: Few-shot Font Generation with Multiple Localized Experts)

Introduction Pytorch implementation of Multiple Heads are Better than One: Few-shot Font Generation with Multiple Localized Expert. | paper Song Park1

Clova AI Research 97 Dec 23, 2022
Simple ONNX operation generator. Simple Operation Generator for ONNX.

sog4onnx Simple ONNX operation generator. Simple Operation Generator for ONNX. https://github.com/PINTO0309/simple-onnx-processing-tools Key concept V

Katsuya Hyodo 6 May 15, 2022
Multi-Joint dynamics with Contact. A general purpose physics simulator.

MuJoCo Physics MuJoCo stands for Multi-Joint dynamics with Contact. It is a general purpose physics engine that aims to facilitate research and develo

DeepMind 5.2k Jan 02, 2023
Real-time pose estimation accelerated with NVIDIA TensorRT

trt_pose Want to detect hand poses? Check out the new trt_pose_hand project for real-time hand pose and gesture recognition! trt_pose is aimed at enab

NVIDIA AI IOT 803 Jan 06, 2023
Second-order Attention Network for Single Image Super-resolution (CVPR-2019)

Second-order Attention Network for Single Image Super-resolution (CVPR-2019) "Second-order Attention Network for Single Image Super-resolution" is pub

516 Dec 28, 2022
Semi-supervised Implicit Scene Completion from Sparse LiDAR

Semi-supervised Implicit Scene Completion from Sparse LiDAR Paper Created by Pengfei Li, Yongliang Shi, Tianyu Liu, Hao Zhao, Guyue Zhou and YA-QIN ZH

114 Nov 30, 2022
Training BERT with Compute/Time (Academic) Budget

Training BERT with Compute/Time (Academic) Budget This repository contains scripts for pre-training and finetuning BERT-like models with limited time

Intel Labs 263 Jan 07, 2023
Code release for Hu et al. Segmentation from Natural Language Expressions. in ECCV, 2016

Segmentation from Natural Language Expressions This repository contains the code for the following paper: R. Hu, M. Rohrbach, T. Darrell, Segmentation

Ronghang Hu 88 May 24, 2022
Zeyuan Chen, Yangchao Wang, Yang Yang and Dong Liu.

Principled S2R Dehazing This repository contains the official implementation for PSD Framework introduced in the following paper: PSD: Principled Synt

zychen 78 Dec 30, 2022
Effect of Deep Transfer and Multi task Learning on Sperm Abnormality Detection

Effect of Deep Transfer and Multi task Learning on Sperm Abnormality Detection Introduction This repository includes codes and models of "Effect of De

Amir Abbasi 5 Sep 05, 2022
Pytorch implementation of four neural network based domain adaptation techniques: DeepCORAL, DDC, CDAN and CDAN+E. Evaluated on benchmark dataset Office31.

Deep-Unsupervised-Domain-Adaptation Pytorch implementation of four neural network based domain adaptation techniques: DeepCORAL, DDC, CDAN and CDAN+E.

Alan Grijalva 49 Dec 20, 2022
The code for SAG-DTA: Prediction of Drug–Target Affinity Using Self-Attention Graph Network.

SAG-DTA The code is the implementation for the paper 'SAG-DTA: Prediction of Drug–Target Affinity Using Self-Attention Graph Network'. Requirements py

Shugang Zhang 7 Aug 02, 2022
Code accompanying the paper "Knowledge Base Completion Meets Transfer Learning"

Knowledge Base Completion Meets Transfer Learning This code accompanies the paper Knowledge Base Completion Meets Transfer Learning published at EMNLP

14 Nov 27, 2022
Annealed Flow Transport Monte Carlo

Annealed Flow Transport Monte Carlo Open source implementation accompanying ICML 2021 paper by Michael Arbel*, Alexander G. D. G. Matthews* and Arnaud

DeepMind 30 Nov 21, 2022
An open-source Deep Learning Engine for Healthcare that aims to treat & prevent major diseases

AlphaCare Background AlphaCare is a work-in-progress, open-source Deep Learning Engine for Healthcare that aims to treat and prevent major diseases. T

Siraj Raval 44 Nov 05, 2022