A Japanese Medical Information Extraction Toolkit

Last update: Dec 12, 2022


Overview

JaMIE: a Japanese Medical Information Extraction toolkit

Joint Japanese Medical Problem, Modality and Relation Recognition

The Train/Test phases require all train, dev, and test files to be converted to CONLL-style. See data_converter.py for the conversion.
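As a rough illustration of what a CONLL-style file looks like to a program, the sketch below parses text with one token per line and a blank line between sentences. The tab-separated columns, the `B-D`/`I-D`/`O` labels, and the `read_conll` helper are assumptions made for this example only; the actual column layout is whatever data_converter.py produces.

```python
from typing import List


def read_conll(text: str) -> List[List[List[str]]]:
    """Parse CONLL-style text into sentences of token rows.

    Assumption: one token per line, tab-separated columns,
    and a blank line between sentences.
    """
    sentences: List[List[List[str]]] = []
    current: List[List[str]] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence, if any.
            if current:
                sentences.append(current)
                current = []
            continue
        current.append(line.split("\t"))
    if current:
        # The file may end without a trailing blank line.
        sentences.append(current)
    return sentences


# Hypothetical two-sentence sample with made-up labels.
sample = "肺\tB-D\n癌\tI-D\n\n疑い\tO\n"
parsed = read_conll(sample)
print(len(parsed), "sentences")
```

Each sentence comes back as a list of rows, and each row as a list of column values, so the token is `row[0]` under the assumed layout.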

Installation (python3.8)

git clone https://github.com/racerandom/JaMIE.git
cd JaMIE

Required python package

pip install -r requirements.txt

Morphological analyzer required:

jumanpp
mecab (juman-dict)
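Before running the converter, it can help to confirm that the analyzers are installed. The following is a minimal sketch that only checks whether the `jumanpp` and `mecab` executables are on `PATH`; the `find_missing_tools` helper is hypothetical, and the check does not verify that the juman dictionary for mecab is installed.

```python
import shutil


def find_missing_tools(tools):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


missing = find_missing_tools(["jumanpp", "mecab"])
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("Both analyzers were found on PATH.")
```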

Pretrained BERT required:

NICT-BERT (NICT_BERT-base_JapaneseWikipedia_32K_BPE)

Train:

CUDA_VISIBLE_DEVICES=$SEED python clinical_joint.py \
--pretrained_model $PRETRAINED_BERT \
--train_file $TRAIN_FILE \
--dev_file $DEV_FILE \
--dev_output $DEV_OUT \
--saved_model $MODEL_DIR_TO_SAVE \
--enc_lr 2e-5 \
--batch_size 4 \
--warmup_epoch 2 \
--num_epoch 20 \
--do_train \
--fp16  # requires apex
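A long multi-line shell command is easy to break with a missing continuation backslash. One way around this is to assemble the same arguments in Python, as in the sketch below. The flags are the ones listed in the training command; `build_train_command`, `run_training`, the `gpu_id` parameter, and the placeholder paths are assumptions for illustration and are not part of the toolkit.

```python
import os
import subprocess
import sys


def build_train_command(pretrained, train_file, dev_file, dev_out,
                        model_dir, fp16=False):
    """Assemble the argv list for the training run (hypothetical helper)."""
    cmd = [
        sys.executable, "clinical_joint.py",
        "--pretrained_model", pretrained,
        "--train_file", train_file,
        "--dev_file", dev_file,
        "--dev_output", dev_out,
        "--saved_model", model_dir,
        "--enc_lr", "2e-5",
        "--batch_size", "4",
        "--warmup_epoch", "2",
        "--num_epoch", "20",
        "--do_train",
    ]
    if fp16:
        cmd.append("--fp16")  # requires apex
    return cmd


def run_training(cmd, gpu_id="0"):
    """Run the command with one visible GPU (hypothetical helper)."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu_id)
    return subprocess.run(cmd, env=env, check=True)


# Placeholder paths; replace them with real files before calling run_training.
cmd = build_train_command("bert_dir", "train.conll", "dev.conll",
                          "dev.out", "model_dir")
print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting issues, and `check=True` makes a failed run raise an exception instead of passing silently.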

Models trained on radiography interpretation reports of Lung Cancer (LC) and on general medical reports of Idiopathic Pulmonary Fibrosis (IPF) will be made available: link1, link2.

Test:

CUDA_VISIBLE_DEVICES=$SEED python clinical_joint.py \
--saved_model $SAVED_MODEL \
--test_file $TEST_FILE \
--test_output $TEST_OUT \
--batch_size 4

Batch Converter from XML (or raw text) to CONLL for Train/Test

Convert XML files to CONLL files for Train/Test. You can also convert raw text to CONLL-style for Test.

# --cv_num: 5 means 5-fold cross-validation; 0 generates a single CONLL file
# --doc_level: generate document-level ([SEP] denotes sentence boundaries) or sentence-level CONLL files
# --segmenter: please use mecab together with NICT-BERT for now
python data_converter.py \
--mode xml2conll \
--xml $XML_FILES_DIR \
--conll $OUTPUT_CONLL_DIR \
--cv_num 5 \
--doc_level \
--segmenter mecab \
--bert_dir $PRETRAINED_BERT
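To make the `--cv_num 5` setting concrete, here is a minimal sketch of k-fold splitting over a list of documents. It is not the converter's own logic: the `k_fold_splits` helper, the round-robin assignment of documents to folds, and the file names are assumptions for illustration.

```python
from typing import List, Sequence, Tuple


def k_fold_splits(items: Sequence[str], k: int) -> List[Tuple[List[str], List[str]]]:
    """Return k (train, test) pairs; every item is in exactly one test set."""
    if k < 2:
        raise ValueError("k must be at least 2")
    # Round-robin assignment: item i goes to fold i % k.
    folds = [list(items[i::k]) for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        splits.append((train, test))
    return splits


# Hypothetical document names.
docs = [f"doc{n}.xml" for n in range(10)]
splits = k_fold_splits(docs, 5)
for fold_id, (train, test) in enumerate(splits):
    print(fold_id, len(train), len(test))
```

With 10 documents and 5 folds, each fold holds out 2 documents for testing and trains on the remaining 8.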

Batch Converter from predicted CONLL to XML

python data_converter.py \
--mode conll2xml \
--xml $XML_FILES_DIR \
--conll $OUTPUT_CONLL_DIR

Citation

If you use our code in your research, please cite our work:

@inproceedings{cheng2021jamie,
   title={JaMIE: A Pipeline Japanese Medical Information Extraction System},
   author={Fei Cheng and Shuntaro Yada and Ribeka Tanaka and Eiji Aramaki and Sadao Kurohashi},
   booktitle={arXiv},
   year={2021}
}
