Code for the modular summarization work published at ACL 2021 by Krishna et al.

Last update: Nov 24, 2022

Overview

This repository contains the code for running the modular summarization pipelines described in the publication:
Krishna K, Khosla K, Bigham J, Lipton ZC. "Generating SOAP Notes from Doctor-Patient Conversations." ACL 2021.

Instructions

Although we cannot release models trained on the confidential medical data, we have released models trained on the publicly available AMI dataset.
To reproduce the results on the AMI dataset, follow the steps listed below. For convenience, we have also created a Google Colab notebook here that runs these steps on Google's servers (free of cost as of June 2021) and produces the summaries and their ROUGE scores.

Step 1: Set up the environment by using pip to install the required packages listed in requirements.txt.

Step 2: Download the ami_models folder from this link and put it at the root of the repository.

Step 3: Run the following three commands to prepare the data, run the summary generation pipelines, and show the resulting ROUGE scores.

# command1: downloads and preprocesses the AMI dataset
./prepare_data.sh

# command2: runs the summarization pipelines on the data and computes ROUGE scores
# (before running this command, you need to download the models as described in Step 2)
./predict_ami.sh

# command3: prints the results
python show_results.py
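
Because command2 depends on the ami_models folder from Step 2, it can help to check the prerequisites before starting the pipeline. The snippet below is a minimal sketch of such a check. The `check_setup` function is a hypothetical helper and not part of this repository; it only assumes the file and folder names mentioned in the steps above (requirements.txt and ami_models at the repository root).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): verifies that the
# files mentioned in Step 1 and Step 2 are present under the given root.
check_setup() {
    local root="${1:-.}"
    # Step 1 installs packages from requirements.txt, so it must exist.
    if [ ! -f "$root/requirements.txt" ]; then
        echo "requirements.txt not found in $root" >&2
        return 1
    fi
    # Step 2 places the downloaded ami_models folder at the repository root.
    if [ ! -d "$root/ami_models" ]; then
        echo "ami_models folder not found in $root (see Step 2)" >&2
        return 1
    fi
    return 0
}
```

One possible use, run from the repository root after installing the packages with `pip install -r requirements.txt`, is `check_setup . && ./prepare_data.sh && ./predict_ami.sh && python show_results.py`, which runs the three commands only if the check passes.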

Owner

Approximately Correct Machine Intelligence (ACMI) Lab
