A Japanese Medical Information Extraction Toolkit

Last update: Dec 12, 2022

Related tags

Deep Learning JaMIE

Overview

JaMIE: a Japanese Medical Information Extraction toolkit

Joint Japanese Medical Problem, Modality and Relation Recognition

The Train/Test phrases require all train, dev, test file converted to CONLL-style. Please check data_converter.py

Installation (python3.8)

git clone https://github.com/racerandom/JaMIE.git
cd JaMIE \

Required python package

pip install -r requirements.txt

Mophological analyzer required:\

jumanpp
mecab (juman-dict)

Pretrained BERT required:\

NICT-BERT (NICT_BERT-base_JapaneseWikipedia_32K_BPE)

Train：

CUDA_VISIBLE_DEVICES=$SEED python clinical_joint.py \
--pretrained_model $PRETRAINED_BERT \
--train_file $TRAIN_FILE \
--dev_file $DEV_FILE \
--dev_output $DEV_OUT \
--saved_model $MODEL_DIR_TO_SAVE \
--enc_lr 2e-5 \
--batch_size 4 \
--warmup_epoch 2 \
--num_epoch 20 \
--do_train
--fp16 (apex required)

The models trained on radiography interpretation reports of Lung Cancer (LC) and general medical reports of Idiopathic Pulmonary Fibrosis (IPF) are to be availabel: link1, link2.

Test:

CUDA_VISIBLE_DEVICES=$SEED python clinical_joint.py \
--saved_model $SAVED_MODEL \
--test_file $TEST_FILE \
--test_output $TEST_OUT \
--batch_size 4

Bath Converter from XML (or raw text) to CONLL for Train/Test

Convert XML files to CONLL files for Train/Test. You can also convert raw text to CONLL-style for Test.

python data_converter.py \
--mode xml2conll \
--xml $XML_FILES_DIR \
--conll $OUTPUT_CONLL_DIR \
--cv_num 5 \ # 5-fold cross-validation, 0 presents to generate single conll file
--doc_level \ # generate document-level ([SEP] denotes sentence boundaries) or sentence-level conll files
--segmenter mecab \ # please use mecab and NICT bert currently
--bert_dir $PRETRAINED_BERT

Batch Converter from predicted CONLL to XML

python data_converter.py \
--mode conll2xml \
--xml $XML_FILES_DIR \
--conll $OUTPUT_CONLL_DIR

Citation

If you use our code in your research, please cite our work:

@inproceedings{cheng2021jamie,
   title={JaMIE: A Pipeline Japanese Medical Information Extraction System,
   author={Fei Cheng, Shuntaro Yada, Ribeka Tanaka, Eiji Aramaki, Sadao Kurohashi},
   booktitle={arXiv},
   year={2021}
}

A Japanese Medical Information Extraction Toolkit

Related tags

Overview

JaMIE: a Japanese Medical Information Extraction toolkit

Joint Japanese Medical Problem, Modality and Relation Recognition

Installation (python3.8)

Required python package

Mophological analyzer required:\

Pretrained BERT required:\

Train：

Test:

Bath Converter from XML (or raw text) to CONLL for Train/Test

Batch Converter from predicted CONLL to XML

Citation

Owner

Underwater image enhancement

Video Frame Interpolation with Transformer (CVPR2022)

null

PyTorch implementation of NeurIPS 2021 paper: "CoFiNet: Reliable Coarse-to-fine Correspondences for Robust Point Cloud Registration"

Official codebase for running the small, filtered-data GLIDE model from GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models.

Repository for tackling Kaggle Ultrasound Nerve Segmentation challenge using Torchnet.

PRIME: A Few Primitives Can Boost Robustness to Common Corruptions

Unified Interface for Constructing and Managing Workflows on different workflow engines, such as Argo Workflows, Tekton Pipelines, and Apache Airflow.

Repo for the Tutorials of Day1-Day3 of the Nordic Probabilistic AI School 2021 (https://probabilistic.ai/)

Python Implementation of Chess Playing AI with variable difficulty

Official PyTorch Implementation of Mask-aware IoU and maYOLACT Detector [BMVC2021]

QueryInst: Parallelly Supervised Mask Query for Instance Segmentation

Repository for "Space-Time Correspondence as a Contrastive Random Walk" (NeurIPS 2020)

Implementation of Hire-MLP: Vision MLP via Hierarchical Rearrangement and An Image Patch is a Wave: Phase-Aware Vision MLP.

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

A simple Python library for stochastic graphical ecological models

Most popular metrics used to evaluate object detection algorithms.

This project intends to use SVM supervised learning to determine whether or not an individual is diabetic given certain attributes.

Latte: Cross-framework Python Package for Evaluation of Latent-based Generative Models

Basit bir burç modülü.