This code is the implementation of the paper "Coherence-Based Distributed Document Representation Learning for Scientific Documents".

Last update: Jan 11, 2022

Related tags

Deep Learning text-representation

Overview

Introduction

This code is the implementation of the paper "Coherence-Based Distributed Document Representation Learning for Scientific Documents".

If you find this code useful, please cite the following paper:

@article{tan2022coherence,
  title = {Coherence-Based Distributed Document Representation Learning for Scientific Documents},
  author = {Tan, Shicheng and Zhao, Shu and Zhang, Yanping},
  journal = {arXiv},
  year = {2022},
  type = {Journal Article}
}

Run

Install the environment (see requirements.txt)
Download data: Link: https://pan.baidu.com/s/1EEJk0_P55Ov5ReXsmyVZPA Password: rkh0
python _av_CTE.py

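Running guide for the information retrieval data

Data processing (4 files): use "...data helper-IR.py" to obtain 3 pieces of data: the intermediate file from raw data processing, the corpus of that intermediate file, and the constructed dataset. Then use "_aj_get dataset corpus.py" to obtain the corpus of the constructed dataset.
Word embedding training (4 files): use "_ak_get word embedding.py" to train on the 2 corpora from the first step, which yields 2 vocabularies and 2 word embedding files. For GloVe, the ".txt" file extension must be removed.
Run "_al_em-avg.py" 5 times to get 5 results: avg-word2vec, avg-word2vec(globe), avg-glove, avg-glove(globe), random embedding
Run "_ac_tf-idf.py" to get a distance matrix and 1 result; the matrix is used by the CTE method
Run the files for LDA, doc2vec, BM25, LSI, GPT2, XLNet, GPT, Transformer-XL, and XLM once each to get 9 results
Run "_ah_WMD.py" 4 times to get 4 results: WMD-word2vec, WMD-word2vec(globe), WMD-glove, WMD-glove(globe)
Run "_at_BERT.py" twice to get 2 results: BERT-Large uncased, BERT-Large uncased(wwm)
Run "_at_ELMo.py" twice to get 2 results: ELMo-Original(5.5B), ELMo-Original(5.5B, concatenated)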
Run "_av_CTE.py" 13 times to get 13 results, based on 13 base word embeddings such as random embedding
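
As a rough illustration of the averaged-embedding baselines listed above (avg-word2vec, avg-glove, random embedding), the sketch below averages word vectors into a document vector and ranks documents by cosine distance to a query. It is not taken from "_al_em-avg.py". The function names, the toy embeddings, and the choice of cosine distance are all assumptions made for illustration.

```python
import math
import random


def random_embeddings(vocab, dim, seed=0):
    """Assign each word a random vector (hypothetical stand-in for trained embeddings)."""
    rng = random.Random(seed)
    return {w: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for w in vocab}


def average_embedding(tokens, embeddings, dim):
    """Average the vectors of known tokens; return a zero vector if none is known."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def cosine_distance(u, v):
    """1 - cosine similarity; defined here as 1.0 when either vector is all zeros."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    dot = sum(x * y for x, y in zip(u, v))
    return 1.0 - dot / (norm_u * norm_v)


def rank_documents(query_tokens, docs_tokens, embeddings, dim):
    """Return document indices sorted from nearest to farthest from the query."""
    q = average_embedding(query_tokens, embeddings, dim)
    dists = [
        cosine_distance(q, average_embedding(d, embeddings, dim))
        for d in docs_tokens
    ]
    return sorted(range(len(docs_tokens)), key=lambda i: dists[i])


if __name__ == "__main__":
    # Toy two-dimensional embeddings, assumed for illustration only.
    toy = {"graph": [1.0, 0.0], "neural": [0.0, 1.0]}
    docs = [["neural"], ["graph"], ["graph", "neural"]]
    print(rank_documents(["graph"], docs, toy, dim=2))
```

With trained word2vec or GloVe vectors, the same averaging would apply once the vectors are loaded into a word-to-vector dictionary; the loading step is omitted here because the repository's file formats are not described above.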

Owner

tsc
