Contrastive Learning for Neural Topic Model

This repository contains the implementation of the paper "Contrastive Learning for Neural Topic Model".

Thong Nguyen, Luu Anh Tuan (NeurIPS 2021)

In this work, we target the problem of capturing meaningful representations through modeling the relations among samples from a mathematical perspective and propose a novel contrastive objective to train the neural topic model, along with the optimization of the variational lower bound. In our contrastive learning framework, we introduce a novel sampling strategy that is motivated by human behavior when comparing numerous documents. Our results show that capturing mutual information between the prototype and its positive sample provides a strong foundation for constructing coherent topics, while differentiating the prototype from the negative samples plays a less fundamental role.

@inproceedings{
nguyen2021contrastive,
title={Contrastive Learning for Neural Topic Model},
author={Thong Thanh Nguyen and Anh Tuan Luu},
booktitle={Advances in Neural Information Processing Systems},
editor={A. Beygelzimer and Y. Dauphin and P. Liang and J. Wortman Vaughan},
year={2021},
url={https://openreview.net/forum?id=NEgqO9yB7e}
}

Requirements

python3
pandas
gensim
numpy
torchvision
pytorch 1.7.0
scipy

How to Run

Download the dataset and place it in the data folder: https://drive.google.com/file/d/1JeeUCzBRQqJUvdWGDN7aMRvIoBAIbZIc/view?usp=sharing
Train the model by running ./scripts/train_models/run_{dataset}_{topk}.sh
Evaluate the model by running ./scripts/evaluate/run_{dataset}_npmi_{topk}.sh
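
The two script templates above can be filled in with a small shell helper. This is only a sketch: the function names script_path and run_if_present, and the example values 20ng and 50, are hypothetical and not confirmed by the repository. Substitute the dataset and top-k values that match the scripts actually present under ./scripts.

```shell
#!/bin/sh
# Build the path of a training or evaluation script from the README templates.
# Usage: script_path train|evaluate DATASET TOPK
# (script_path is a hypothetical helper, not part of the repository.)
script_path() {
    kind=$1
    dataset=$2
    topk=$3
    case "$kind" in
        train)    printf './scripts/train_models/run_%s_%s.sh\n' "$dataset" "$topk" ;;
        evaluate) printf './scripts/evaluate/run_%s_npmi_%s.sh\n' "$dataset" "$topk" ;;
        *)        echo "unknown kind: $kind" >&2; return 1 ;;
    esac
}

# Run a script only if the file exists, so a wrong dataset/top-k
# combination fails with a clear message instead of a shell error.
run_if_present() {
    if [ -f "$1" ]; then
        "$1"
    else
        echo "missing script: $1" >&2
        return 1
    fi
}

# Example with assumed values (uncomment after checking that the scripts exist):
# run_if_present "$(script_path train 20ng 50)"
# run_if_present "$(script_path evaluate 20ng 50)"
```

Run these from the repository root, since the templates are relative paths. Training is expected to finish before the evaluation script is started.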

Acknowledgement

Our implementation is based on the official code of SCHOLAR.
