This repo stores the code for topic modeling on palliative care journals.

Last update: Dec 20, 2022

Overview

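This repo contains the code for topic modeling on papers from palliative care journals. It covers three steps: downloading the papers as PDFs, setting up the Python environment, and running the topic model on one of two journals, `jpm` (the default) or `jwh`.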

Data Preparation

First, download the journal papers:

bash 1_download_pdfs.sh # To download papers from the journal `jpm`
bash 1_download_pdfs_jwh.sh # To download papers from the journal `jwh`
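
If you prefer to drive the downloads from Python, the mapping from journal to download script can be sketched as follows. The script names are taken from the commands above; the `DOWNLOAD_SCRIPTS` table and the `download_command` helper are hypothetical names introduced only for illustration and are not part of this repo.

```python
# Journal name -> download script, as listed in the commands above.
# DOWNLOAD_SCRIPTS and download_command are illustrative names (assumptions).
DOWNLOAD_SCRIPTS = {
    "jpm": "1_download_pdfs.sh",
    "jwh": "1_download_pdfs_jwh.sh",
}


def download_command(journal):
    """Return the argument list that downloads the papers for `journal`."""
    if journal not in DOWNLOAD_SCRIPTS:
        raise ValueError(f"unknown journal: {journal!r}")
    return ["bash", DOWNLOAD_SCRIPTS[journal]]


# Build (but do not run) the command for each journal.
for name in DOWNLOAD_SCRIPTS:
    print(name, "->", " ".join(download_command(name)))
```

To actually run a download, pass the returned list to `subprocess.run(..., check=True)` from the repository root.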

Environment Setup

Install the necessary Python packages:

bash 2_pdf2json_prep_env.sh

How to Run the Topic Model

Run the topic model on the default journal, `jpm`:

python 3_pdf2json2topics.py

To run it on the other journal, `jwh`, pass the `-journal_name` option:

python 3_pdf2json2topics.py -journal_name jwh
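
For readers unfamiliar with this style of option, here is a minimal sketch of how a script could accept `-journal_name` with `jpm` as the default, assuming the standard `argparse` module. It is not taken from `3_pdf2json2topics.py`, whose actual argument handling may differ; the `build_parser` helper and the restriction to two choices are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser with a single -journal_name option (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Sketch of a journal selector for a topic-modeling script."
    )
    parser.add_argument(
        "-journal_name",
        default="jpm",            # default journal, as stated above
        choices=["jpm", "jwh"],   # assumption: only these two journals exist
        help="journal to run the topic model on",
    )
    return parser


# With no arguments the default journal is used.
default_args = build_parser().parse_args([])
print(default_args.journal_name)

# Passing the option selects the other journal.
jwh_args = build_parser().parse_args(["-journal_name", "jwh"])
print(jwh_args.journal_name)
```

With this setup an unrecognized journal name makes the parser exit with a usage error rather than running on the wrong data.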

Owner

PhD in NLP & Causality. Affiliated with Max Planck Institute, Germany & ETH & UMich. Supervised by Bernhard Schoelkopf, Rada Mihalcea, and Mrinmaya Sachan.

