This repository is for our EMNLP 2021 paper "Automated Generation of Accurate & Fluent Medical X-ray Reports"

Overview

Introduction: X-Ray Report Generation

This repository is for our EMNLP 2021 paper "Automated Generation of Accurate & Fluent Medical X-ray Reports". Our work adopts x-ray (also including some history data for patients if there are any) as input, a CNN is used to learn the embedding features for x-ray, as a result, disease-state-style information (Previously, almost all work used detected disease embedding for input of text generation network which could possibly exclude the false negative diseases) is extracted and fed into the text generation network (transformer). To make sure the consistency of detected diseases and generated x-ray reports, we also create a interpreter to enforce the accuracy of the x-ray reports. For details, please refer to here.

Data we used for experiments

We use two datasets for experiments to validate our method:

Performance on two datasets

Datasets Methods BLEU-1 BLEU-2 BLEU-3 BLEU-4 METEOR ROUGE-L
Open-I Single-view 0.463 0.310 0.215 0.151 0.186 0.377
Multi-view 0.476 0.324 0.228 0.164 0.192 0.379
Multi-view w/ Clinical History 0.485 0.355 0.273 0.217 0.205 0.422
Full Model (w/ Interpreter) 0.515 0.378 0.293 0.235 0.219 0.436
MIMIC Single-view 0.447 0.290 0.200 0.144 0.186 0.317
Multi-view 0.451 0.292 0.201 0.144 0.185 0.320
Multi-view w/ Clinical History 0.491 0.357 0.276 0.223 0.213 0.389
Full Model (w/ Interpreter) 0.495 0.360 0.278 0.224 0.222 0.390

Environments for running codes

  • Operating System: Ubuntu 18.04

  • Hardware: tested with RTX 2080 TI (11G)

  • Software: tested with PyTorch 1.5.1, Python3.7, CUDA 10.0, tensorboardX, tqdm

  • Anaconda is strongly recommended

  • Other Libraries: Spacy, SentencePiece, nlg-eval

How to use our code for train/test

Step 0: Build your vocabulary model with SentencePiece (tools/vocab_builder.py)

  • Please make sure that you have preprocess the medical reports accurately.
  • We use the top 900 high-frequency words
  • We use 100 unigram tokens extracted from SentencePiece to avoid the out-of-vocabulary situation.
  • In total we have 1000 words and tokens. Update: You can skip step 0 and use the vocabulary files in Vocabulary/*.model

Step 1: Train the LSTM and/or Transformer models, which are just text classifiers, to obtain 14 common disease labels.

  • Use the train_text.py to train the models on your working datasets. For example, the MIMIC-CXR comes with CheXpert labels; you can use these labels as ground-truth to train a differentiable text classifier model. Here the text classifier is a binary predictor (postive/uncertain) = 1 and (negative/unmentioned) = 0.
  • Assume the trained text classifier is perfect and exactly reflects the medical reports. Although this is not the case, in practice, it gives us a good approximation of how good the generated reports are. Human evaluation is also needed to evalutate the generated reports.
  • The goals here are:
  1. Evaluate the performance of the generated reports by comparing the predicted labels and the ground-truth labels.
  2. Use the trained models to fine-tune medical reports' output.

Step 2: Test the text classifier models using the train_text.py with:

  • PHASE = 'TEST'
  • RELOAD = True --> Load the trained models for testing

Step 3: Transfer the trained model to obtain 14 common disease labels for the Open-I datasets and any dataset that doesn't have ground-truth labels.

  • Transfer the learned model to the new dataset by predicting 14 disease labels for the entire dataset by running extract_label.py on the target dataset. The output file is file2label.json
  • Split them into train, validation, and test sets (we have already done that for you, just put the file2label.json in a place where the NLMCXR dataset can see).
  • Build your own text classifier (train_text.py) based on the extracted disease labels (treat them as ground-truth labels).
  • In the end, we want the text classifiers (LSTM/Transformer) to best describe your model's output on the working dataset.

Step 4: Get additional labels using (tools/count_nounphrases.py)

  • Note that 14 disease labels are not enough to generate accurate reports. This is because for the same disease, we might have different ways to express it. For this reason, additional labels are needed to enhance the quality of medical reports.
  • The output of the coun_nounphrases.py is a json file, you can use it as input to the exising datasets such as MIMIC or NLMCXR.
  • Therefore, in total we have 14 disease labels + 100 noun-phrases = 114 disease-related topics/labels. Please check the appendix in our paper.

Step 5: Train the ClsGen model (Classifier-Generator) with train_full.py

  • PHASE = 'TRAIN'
  • RELOAD = False --> We trained our model from scratch

Step 6: Train the ClsGenInt model (Classifier-Generator-Interpreter) with train_full.py

  • PHASE = 'TRAIN'
  • RELOAD = True --> Load the ClsGen trained from the step 4, load the Interpreter model from Step 1 or 3
  • Reduce the learning rate --> Since the ClsGen has already converged, we need to reduce the learning rate to fine-tune the word representation such that it minimize the interpreter error.

Step 7: Generate the outputs

  • Use the infer function in the train_full.py to generate the outputs. This infer function ensures that no ground-truth labels and medical reports are being used in the inference phase (we used teacher forcing / ground-truth labels during training phase).
  • Also specify the threshold parameter, see the appendix of our paper on which threshold to choose from.
  • Final specify your the name of your output files.

Step 8: Evaluate the generated reports.

  • Use the trained text classifier model in step 1 to evaluate the clinical accuracy
  • Use the nlg-eval library to compute BLEU-1 to BLEU-4 scores and other metrics.

Our pretrained models

Our model is uploaded in google drive, please download the model from

Model Name Download Link
Our Model for MIMIC Google Drive
Our Model for NLMCXR Google Drive

Citation

If it is helpful to you, please cite our work:

@inproceedings{nguyen-etal-2021-automated,
    title = "Automated Generation of Accurate {\&} Fluent Medical {X}-ray Reports",
    author = "Nguyen, Hoang  and
      Nie, Dong  and
      Badamdorj, Taivanbat  and
      Liu, Yujie  and
      Zhu, Yingying  and
      Truong, Jason  and
      Cheng, Li",
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2021",
    address = "Online and Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.emnlp-main.288",
    doi = "10.18653/v1/2021.emnlp-main.288",
    pages = "3552--3569",
}

Owner
no name
no name
This repository contains the source code for the paper Tutorial on amortized optimization for learning to optimize over continuous domains by Brandon Amos

Tutorial on Amortized Optimization This repository contains the source code for the paper Tutorial on amortized optimization for learning to optimize

Meta Research 144 Dec 26, 2022
Face Alignment using python

Face Alignment Face Alignment using python Input Image Aligned Face Aligned Face Aligned Face Input Image Aligned Face Input Image Aligned Face Instal

Sajjad Aemmi 28 Nov 23, 2022
Vision-and-Language Navigation in Continuous Environments using Habitat

Vision-and-Language Navigation in Continuous Environments (VLN-CE) Project Website — VLN-CE Challenge — RxR-Habitat Challenge Official implementations

Jacob Krantz 132 Jan 02, 2023
An Official Repo of CVPR '20 "MSeg: A Composite Dataset for Multi-Domain Segmentation"

This is the code for the paper: MSeg: A Composite Dataset for Multi-domain Semantic Segmentation (CVPR 2020, Official Repo) [CVPR PDF] [Journal PDF] J

226 Nov 05, 2022
BlockUnexpectedPackets - Preventing BungeeCord CPU overload due to Layer 7 DDoS attacks by scanning BungeeCord's logs

BlockUnexpectedPackets This script automatically blocks DDoS attacks that are sp

SparklyPower 3 Mar 31, 2022
ScriptProfilerPy - Module to visualize where your python script is slow

ScriptProfiler helps you track where your code is slow It provides: Code lines t

Lucas BLP 3 Jun 02, 2022
ICLR2021 (Under Review)

Self-Supervised Time Series Representation Learning by Inter-Intra Relational Reasoning This repository contains the official PyTorch implementation o

Haoyi Fan 58 Dec 30, 2022
Implementation of STAM (Space Time Attention Model), a pure and simple attention model that reaches SOTA for video classification

STAM - Pytorch Implementation of STAM (Space Time Attention Model), yet another pure and simple SOTA attention model that bests all previous models in

Phil Wang 109 Dec 28, 2022
atmaCup #11 の Public 4th / Pricvate 5th Solution のリポジトリです。

#11 atmaCup 2021-07-09 ~ 2020-07-21 に行われた #11 [初心者歓迎! / 画像編] atmaCup のリポジトリです。結果は Public 4th / Private 5th でした。 フレームワークは PyTorch で、実装は pytorch-image-m

Tawara 12 Apr 07, 2022
State-to-Distribution (STD) Model

State-to-Distribution (STD) Model In this repository we provide exemplary code on how to construct and evaluate a state-to-distribution (STD) model fo

<a href=[email protected]"> 2 Apr 07, 2022
A powerful framework for decentralized federated learning with user-defined communication topology

Scatterbrained Decentralized Federated Learning Scatterbrained makes it easy to build federated learning systems. In addition to traditional federated

Johns Hopkins Applied Physics Laboratory 7 Sep 26, 2022
SW components and demos for visual kinship recognition. An emphasis is put on the FIW dataset-- data loaders, benchmarks, results in summary.

FIW Data Development Kit Table of Contents Introduction Families In the Wild Database Publications Organization To Do License Getting Involved Introdu

Joseph P. Robinson 12 Jun 04, 2022
RefineGNN - Iterative refinement graph neural network for antibody sequence-structure co-design (RefineGNN)

Iterative refinement graph neural network for antibody sequence-structure co-des

Wengong Jin 83 Dec 31, 2022
Numerical differential equation solvers in JAX. Autodifferentiable and GPU-capable.

Diffrax Numerical differential equation solvers in JAX. Autodifferentiable and GPU-capable. Diffrax is a JAX-based library providing numerical differe

Patrick Kidger 717 Jan 09, 2023
The repository for freeCodeCamp's YouTube course, Algorithmic Trading in Python

Algorithmic Trading in Python This repository Course Outline Section 1: Algorithmic Trading Fundamentals What is Algorithmic Trading? The Differences

Nick McCullum 1.8k Jan 02, 2023
SAT Project - The first project I had done at General Assembly, performed EDA, data cleaning and created data visualizations

Project 1: Standardized Test Analysis by Adam Klesc Overview This project covers: Basic statistics and probability Many Python programming concepts Pr

Adam Muhammad Klesc 1 Jan 03, 2022
This is a work in progress reimplementation of Instant Neural Graphics Primitives

Neural Hash Encoding This is a work in progress reimplementation of Instant Neural Graphics Primitives Currently this can train an implicit representa

Penn 79 Sep 01, 2022
Code repository for the paper: Hierarchical Kinematic Probability Distributions for 3D Human Shape and Pose Estimation from Images in the Wild (ICCV 2021)

Hierarchical Kinematic Probability Distributions for 3D Human Shape and Pose Estimation from Images in the Wild Akash Sengupta, Ignas Budvytis, Robert

Akash Sengupta 149 Dec 14, 2022
Code for "Continuous-Time Meta-Learning with Forward Mode Differentiation" (ICLR 2022)

Continuous-Time Meta-Learning with Forward Mode Differentiation ICLR 2022 (Spotlight) - Installation - Example - Citation This repository contains the

Tristan Deleu 25 Oct 20, 2022
Аналитика доходности инвестиционного портфеля в Тинькофф брокере

Аналитика доходности инвестиционного портфеля Тиньков Видео на YouTube Для работы скрипта нужно установить три переменных окружения: export TINKOFF_TO

Alexey Goloburdin 64 Dec 17, 2022