Contrastive Learning for Neural Topic Model

This repository contains the implementation of the paper Contrastive Learning for Neural Topic Model.

Thong Nguyen, Luu Anh Tuan (NeurIPS 2021)

In this work, we target the problem of capturing meaningful representations through modeling the relations among samples from a mathematical perspective and propose a novel contrastive objective to train the neural topic model, along with the optimization of the variational lower bound. In our contrastive learning framework, we introduce a novel sampling strategy that is motivated by human behavior when comparing numerous documents. Our results show that capturing mutual information between the prototype and its positive sample provides a strong foundation for constructing coherent topics, while differentiating the prototype from the negative samples plays a less fundamental role.
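To make the roles of the positive and negative samples concrete, here is a minimal sketch of a generic contrastive loss over one prototype, one positive sample, and one negative sample. It is not taken from this repository and may differ from the objective in the paper: the dot-product similarity, the function names, and the `beta` weight on the negative term are all assumptions made for illustration.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors (used here as the similarity score)."""
    return sum(a * b for a, b in zip(u, v))


def contrastive_loss(prototype, positive, negative, beta=1.0):
    """Generic single-positive, single-negative contrastive loss (illustrative only).

    loss = -log( exp(s_pos) / (exp(s_pos) + beta * exp(s_neg)) )

    `beta` is a hypothetical weight on the negative term, not a parameter
    confirmed by the repository.
    """
    s_pos = dot(prototype, positive)
    s_neg = dot(prototype, negative)
    # Subtract the larger score before exponentiating for numerical stability.
    m = max(s_pos, s_neg)
    log_denominator = m + math.log(math.exp(s_pos - m) + beta * math.exp(s_neg - m))
    return log_denominator - s_pos


# The loss is small when the prototype is closer to the positive sample
# than to the negative one, and larger in the opposite case.
aligned = contrastive_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
swapped = contrastive_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0])
```

In this toy form, lowering `beta` reduces the influence of the negative sample, which is one simple way to probe how much each term contributes.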

@inproceedings{
nguyen2021contrastive,
title={Contrastive Learning for Neural Topic Model},
author={Thong Thanh Nguyen and Anh Tuan Luu},
booktitle={Advances in Neural Information Processing Systems},
editor={A. Beygelzimer and Y. Dauphin and P. Liang and J. Wortman Vaughan},
year={2021},
url={https://openreview.net/forum?id=NEgqO9yB7e}
}

Requirements

python3
pandas
gensim
numpy
torchvision
pytorch 1.7.0
scipy
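
Before training, it can help to confirm that the packages above are importable. The sketch below is one way to do that using only the standard library. The helper name `missing_modules` is hypothetical, the only assumption about the list is that pytorch is imported as `torch`, and the check does not verify the pytorch 1.7.0 version.

```python
from importlib.util import find_spec

# Import names for the packages listed under Requirements.
# Assumption: "pytorch" is imported as "torch"; the others use their own names.
REQUIRED_MODULES = ["pandas", "gensim", "numpy", "torchvision", "torch", "scipy"]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the names from `modules` that cannot be found in this environment."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```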

How to Run

Download the dataset and place it in the data folder: https://drive.google.com/file/d/1JeeUCzBRQqJUvdWGDN7aMRvIoBAIbZIc/view?usp=sharing
Train the model by running ./scripts/train_models/run_{dataset}_{topk}.sh
Evaluate the model by running ./scripts/evaluate/run_{dataset}_npmi_{topk}.sh
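
The training and evaluation steps can also be driven from Python. The following is a rough sketch that fills the {dataset} and {topk} placeholders and runs the two scripts in order. The helper names and the example values are hypothetical, and it assumes the scripts are run with bash from the repository root. The values you pass must match script files that actually exist in the repository.

```python
import subprocess
from pathlib import Path


def script_paths(dataset, topk):
    """Build the train and evaluate script paths from the README's naming pattern."""
    train = Path("scripts/train_models") / f"run_{dataset}_{topk}.sh"
    evaluate = Path("scripts/evaluate") / f"run_{dataset}_npmi_{topk}.sh"
    return train, evaluate


def train_and_evaluate(dataset, topk):
    """Run the training script, then the evaluation script, stopping on failure."""
    for script in script_paths(dataset, topk):
        if not script.is_file():
            raise FileNotFoundError(f"No such script: {script}")
        # check=True raises CalledProcessError if the script exits non-zero.
        subprocess.run(["bash", str(script)], check=True)


# "mydataset" and 10 are placeholder values, not names confirmed by the repository.
example_train, example_evaluate = script_paths("mydataset", 10)
```

Calling `train_and_evaluate(...)` with a dataset and topk pair that has no matching script raises `FileNotFoundError` before anything is executed.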

Acknowledgement

Our implementation is based on the official code of SCHOLAR.
