CLNTM - Contrastive Learning for Neural Topic Model

Related tags

Deep LearningCLNTM
Overview

Contrastive Learning for Neural Topic Model

This repository contains the implementation of the paper Contrastive Learning for Neural Topic Model.

Thong Nguyen, Luu Anh Tuan (NeurIPS 2021)

Teaser image In this work, we target the problem of capturing meaningful representations through modeling the relations among samples from a mathematical perspective and propose a novel contrastive objective to train the neural topic model, along with the optimization of the variational lower bound. In our contrastive learning framework, we introduce a novel sampling strategy that is motivated by human behavior when comparing numerous documents. Our results show that capturing mutual information between the prototype and its positive sample provides a strong foundation for constructing coherent topics, while differentiating the prototype from the negative samples plays a less fundamental role.

@inproceedings{
nguyen2021contrastive,
title={Contrastive Learning for Neural Topic Model},
author={Thong Thanh Nguyen and Anh Tuan Luu},
booktitle={Advances in Neural Information Processing Systems},
editor={A. Beygelzimer and Y. Dauphin and P. Liang and J. Wortman Vaughan},
year={2021},
url={https://openreview.net/forum?id=NEgqO9yB7e}
}

Requirements

  • python3
  • pandas
  • gensim
  • numpy
  • torchvision
  • pytorch 1.7.0
  • scipy

How to Run

  1. Download and put the dataset in the data folder: https://drive.google.com/file/d/1JeeUCzBRQqJUvdWGDN7aMRvIoBAIbZIc/view?usp=sharing
  2. Train the model by running ./scripts/train_models/run_{dataset}_{topk}.sh
  3. Evaluate the model via executing ./scripts/evaluate/run_{dataset}_npmi_{topk}.sh

Acknowledgement

Our implementation is based on the official code of SCHOLAR.

Owner
Thong Thanh Nguyen
...
Thong Thanh Nguyen
Split your patch similarly to `git add -p` but supporting multiple buckets

split-patch.py This is git add -p on steroids for patches. Given a my.patch you can run ./split-patch.py my.patch You can choose in which bucket to p

102 Oct 06, 2022
Repository for reproducing `Model-Based Robust Deep Learning`

Model-Based Robust Deep Learning (MBRDL) In this repository, we include the code necessary for reproducing the code used in Model-Based Robust Deep Le

Alex Robey 16 Sep 19, 2022
Facestar dataset. High quality audio-visual recordings of human conversational speech.

Facestar Dataset Description Existing audio-visual datasets for human speech are either captured in a clean, controlled environment but contain only a

Meta Research 87 Dec 21, 2022
Implementation of the ivis algorithm as described in the paper Structure-preserving visualisation of high dimensional single-cell datasets.

Implementation of the ivis algorithm as described in the paper Structure-preserving visualisation of high dimensional single-cell datasets.

beringresearch 285 Jan 04, 2023
A basic reminder tool written in Python.

A simple Python Reminder Here's a basic reminder tool written in Python that speaks to the user and sends a notification. Run pip3 install pyttsx3 w

Sachit Yadav 4 Feb 05, 2022
bio_inspired_min_nets_improve_the_performance_and_robustness_of_deep_networks

Code Submission for: Bio-inspired Min-Nets Improve the Performance and Robustness of Deep Networks Run with docker To build a docker environment, chan

0 Dec 09, 2021
A python interface for training Reinforcement Learning bots to battle on pokemon showdown

The pokemon showdown Python environment A Python interface to create battling pokemon agents. poke-env offers an easy-to-use interface for creating ru

Haris Sahovic 184 Dec 30, 2022
clDice - a Novel Topology-Preserving Loss Function for Tubular Structure Segmentation

README clDice - a Novel Topology-Preserving Loss Function for Tubular Structure Segmentation CVPR 2021 Authors: Suprosanna Shit and Johannes C. Paetzo

110 Dec 29, 2022
PyTorch implementation of Hierarchical Multi-label Text Classification: An Attention-based Recurrent Network

hierarchical-multi-label-text-classification-pytorch Hierarchical Multi-label Text Classification: An Attention-based Recurrent Network Approach This

Mingu Kang 17 Dec 13, 2022
LBK 26 Dec 28, 2022
Anti-Adversarially Manipulated Attributions for Weakly and Semi-Supervised Semantic Segmentation (CVPR 2021)

Anti-Adversarially Manipulated Attributions for Weakly and Semi-Supervised Semantic Segmentation Input Image Initial CAM Successive Maps with adversar

Jungbeom Lee 110 Dec 07, 2022
Official Implementation of DE-DETR and DELA-DETR in "Towards Data-Efficient Detection Transformers"

DE-DETRs By Wen Wang, Jing Zhang, Yang Cao, Yongliang Shen, and Dacheng Tao This repository is an official implementation of DE-DETR and DELA-DETR in

Wen Wang 61 Dec 12, 2022
wlad 2 Dec 19, 2022
[ICCV '21] In this repository you find the code to our paper Keypoint Communities

Keypoint Communities In this repository you will find the code to our ICCV '21 paper: Keypoint Communities Duncan Zauss, Sven Kreiss, Alexandre Alahi,

Duncan Zauss 262 Dec 13, 2022
Visual odometry package based on hardware-accelerated NVIDIA Elbrus library with world class quality and performance.

Isaac ROS Visual Odometry This repository provides a ROS2 package that estimates stereo visual inertial odometry using the Isaac Elbrus GPU-accelerate

NVIDIA Isaac ROS 343 Jan 03, 2023
Gender Classification Machine Learning Model using Sk-learn in Python with 97%+ accuracy and deployment

Gender-classification This is a ML model to classify Male and Females using some physical characterstics Data. Python Libraries like Pandas,Numpy and

Aryan raj 11 Oct 16, 2022
Dense Passage Retriever - is a set of tools and models for open domain Q&A task.

Dense Passage Retrieval Dense Passage Retrieval (DPR) - is a set of tools and models for state-of-the-art open-domain Q&A research. It is based on the

Meta Research 1.1k Jan 03, 2023
StyleGAN2-ada for practice

This version of the newest PyTorch-based StyleGAN2-ada is intended mostly for fellow artists, who rarely look at scientific metrics, but rather need a working creative tool. Tested on Python 3.7 + Py

vadim epstein 170 Nov 16, 2022
Codes for "Solving Long-tailed Recognition with Deep Realistic Taxonomic Classifier"

Deep-RTC [project page] This repository contains the source code accompanying our ECCV 2020 paper. Solving Long-tailed Recognition with Deep Realistic

Gina Wu 16 May 26, 2022
Code for STFT Transformer used in BirdCLEF 2021 competition.

STFT_Transformer Code for STFT Transformer used in BirdCLEF 2021 competition. The STFT Transformer is a new way to use Transformers similar to Vision

Jean-François Puget 69 Sep 29, 2022