This code is the implementation of the paper "Coherence-Based Distributed Document Representation Learning for Scientific Documents".

Last update: Jan 11, 2022

Related tags

Deep Learning text-representation

Overview

Introduction

This code is the implementation of the paper "Coherence-Based Distributed Document Representation Learning for Scientific Documents".

If you find this code useful, please cite the following paper:

@article{tan2022coherence,
  title = {Coherence-Based Distributed Document Representation Learning for Scientific Documents},
  author = {Tan, Shicheng and Zhao, Shu and Zhang, Yanping},
  journal = {arXiv},
  year = {2022},
  type = {Journal Article}
}

Run

Install the environment (see requirements.txt)
Download the data: Link: https://pan.baidu.com/s/1EEJk0_P55Ov5ReXsmyVZPA Password: rkh0
Run: python _av_CTE.py

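Guide to running on the information retrieval data

Data processing (4 files): use "...data helper-IR.py" to obtain 3 outputs (the intermediate file from raw-data processing, the corpus of that intermediate file, and the constructed dataset), then use "_aj_get dataset corpus.py" to obtain the corpus of the constructed dataset
Word embedding training (4 files): use "_ak_get word embedding.py" to train on the 2 corpora from the first step, producing 2 vocabularies and 2 word-vector files; for GloVe, the ".txt" suffix must be removed
Run "_al_em-avg.py" 5 times to get 5 results: avg-word2vec, avg-word2vec(globe), avg-glove, avg-glove(globe), random embedding
Run "_ac_tf-idf.py" to get a distance matrix and 1 result; the matrix is used by the CTE method
Run the corresponding file once each for LDA, doc2vec, BM25, LSI, GPT2, XLNet, GPT, Transformer-XL, and XLM to get 9 results
Run "_ah_WMD.py" 4 times to get 4 results: WMD-word2vec, WMD-word2vec(globe), WMD-glove, WMD-glove(globe)
Run "_at_BERT.py" 2 times to get 2 results: BERT-Large uncased, BERT-Large uncased(wwm)
Run "_at_ELMo.py" 2 times to get 2 results: ELMo-Original(5.5B), ELMo-Original(5.5B, concatenated)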
Run "_av_CTE.py" 13 times to get 13 results, one for each of 13 base word embeddings (random embedding and the others)
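
For readers unfamiliar with the averaged-embedding baselines listed above, the following is a minimal sketch of the general idea: a document vector is the mean of its word vectors, and documents are compared through a pairwise distance matrix. It is not the code in "_al_em-avg.py" or "_ac_tf-idf.py". The function names, the toy embeddings, and the choice of cosine distance are all assumptions made for illustration; the repository may use different metrics and data formats.

```python
import math


def average_embedding(tokens, embeddings, dim):
    """Mean of the vectors of known tokens; zero vector if none are known."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine_distance(u, v):
    """1 - cosine similarity; defined here as 1.0 if either vector is all zeros."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    dot = sum(x * y for x, y in zip(u, v))
    return 1.0 - dot / (norm_u * norm_v)


def distance_matrix(doc_vectors):
    """Pairwise cosine distances between all document vectors."""
    return [[cosine_distance(u, v) for v in doc_vectors] for u in doc_vectors]


# Toy, made-up 2-dimensional embeddings (hypothetical, for illustration only).
toy_embeddings = {
    "graph": [1.0, 0.0],
    "neural": [0.0, 1.0],
    "network": [1.0, 1.0],
}
docs = [["graph", "network"], ["neural", "network"], ["unknown"]]

doc_vectors = [average_embedding(d, toy_embeddings, dim=2) for d in docs]
matrix = distance_matrix(doc_vectors)
```

In a retrieval setting, each row of such a matrix could be sorted to rank the other documents by distance to the query document; out-of-vocabulary documents fall back to the zero vector and are treated as maximally distant in this sketch.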

Owner

tsc
