Continual Learning of Electronic Health Records (EHR).

Overview

arXiv: 2112.11944 | License: MIT

Continual Learning of Longitudinal Health Records

This repo reproduces the experiments in Continual Learning of Longitudinal Health Records (2021). Release v0.1 of the project corresponds to the published results.

Experiments evaluate various continual learning strategies on standard ICU predictive tasks exhibiting covariate shift. Task outcomes are binary, and input data are multi-modal time-series from patient ICU admissions.

Setup

  1. Clone this repo to your local machine.
  2. Request access to MIMIC-III and eICU-CRD (see Notes).
  3. Download the preprocessed datasets to the /data subfolder.
  4. (Recommended) Create and activate a new virtual environment (the activate command shown is for Linux/macOS shells):
    python3 -m venv .venv --upgrade-deps
    source .venv/bin/activate
  5. Install dependencies:
    pip install -U wheel buildtools
    pip install -r requirements.txt

Results

To reproduce main results:

python3 main.py --train

Figures will be saved to /results/figs. Instructions to reproduce supplementary experiments can be found here. Bespoke experiments can be specified with the appropriate flags, e.g.:

python3 main.py --domain_shift hospital --outcome mortality_48h --models CNN --strategies EWC Replay --validate --train

A complete list of available options can be found here or with python3 main.py --help.
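The flags in the example command could be parsed along the following lines. This is a minimal sketch, not the repo's actual parser: only the flag names and example values come from the commands shown in this README, while the `nargs` settings, the lack of defaults, and the `build_parser` helper are assumptions made for illustration. `python3 main.py --help` remains the authoritative reference.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown in this README."""
    parser = argparse.ArgumentParser(description="Continual learning experiments (sketch)")
    # Flag names are taken from the README; types and nargs are assumptions.
    parser.add_argument("--domain_shift", type=str, help="e.g. hospital")
    parser.add_argument("--outcome", type=str, help="e.g. mortality_48h")
    parser.add_argument("--models", nargs="+", help="one or more models, e.g. CNN")
    parser.add_argument("--strategies", nargs="+", help="one or more strategies, e.g. EWC Replay")
    parser.add_argument("--validate", action="store_true")
    parser.add_argument("--train", action="store_true")
    return parser


example = (
    "--domain_shift hospital --outcome mortality_48h "
    "--models CNN --strategies EWC Replay --validate --train"
)
args = build_parser().parse_args(example.split())
print(args.models, args.strategies)
```

Using `nargs="+"` is one way to let several models or strategies be listed after a single flag, as in `--strategies EWC Replay`.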

Citation

If you use any of this code in your work, please reference us:

@misc{armstrong2021continual,
      title={Continual learning of longitudinal health records}, 
      author={J. Armstrong and D. Clifton},
      year={2021},
      eprint={2112.11944},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Notes

Temporal Domain Incremental learning experiments require linkage with the original MIMIC-III dataset. To run them, download ADMISSIONS.csv from MIMIC-III to the /data/mimic3/ folder.
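A quick pre-flight check for this file might look like the sketch below. The `missing_temporal_files` helper is hypothetical and not part of the repo; it assumes the script is run from the repo root, so that `data/mimic3/ADMISSIONS.csv` is the relative path described in the note.

```python
from pathlib import Path
from typing import List


def missing_temporal_files(data_root: Path) -> List[Path]:
    """Return the required files that are absent under data_root (hypothetical helper)."""
    # Only ADMISSIONS.csv is named in the note; the list form just makes it easy to extend.
    required = [data_root / "mimic3" / "ADMISSIONS.csv"]
    return [path for path in required if not path.is_file()]


missing = missing_temporal_files(Path("data"))
if missing:
    print("Missing files:", ", ".join(str(path) for path in missing))
```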

Stack

For standardisation of ICU predictive task definitions, feature pre-processing, and Continual Learning method implementations, we use the following tools:

| Tool | Source |
| --- | --- |
| ICU Data | MIMIC-III, eICU-CRD |
| Data preprocessing / task definition | FIDDLE |
| Continual Learning strategies | Avalanche |

Comments
  • Change experience to class balanced replay

    The replay definition has been edited manually for now. Avalanche will need to be updated, with the change then made via training.storage_policy.

    The memory buffer may also need to change to n_tasks * buffer (since GEM etc. use this number for experience-wise buffer sizes).

    opened by iacobo 1
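To illustrate what class-balanced replay means in the issue above, here is a toy, pure-Python memory that splits a fixed budget evenly across the classes seen so far. It is a sketch under stated assumptions, not the repo's code and not Avalanche's storage policy: the `ClassBalancedMemory` class, its `update` method, and the random-subset trimming rule are all invented for illustration.

```python
import random
from collections import defaultdict


class ClassBalancedMemory:
    """Toy replay memory that keeps at most mem_size // n_classes samples per class."""

    def __init__(self, mem_size: int, seed: int = 0):
        self.mem_size = mem_size
        self.per_class = defaultdict(list)  # label -> stored samples
        self.rng = random.Random(seed)

    def update(self, samples, labels):
        """Add one experience's samples, then trim every class to its quota."""
        for x, y in zip(samples, labels):
            self.per_class[y].append(x)
        if not self.per_class:
            return
        quota = self.mem_size // len(self.per_class)
        for y, items in self.per_class.items():
            if len(items) > quota:
                # Keep a uniform random subset of this class.
                self.per_class[y] = self.rng.sample(items, quota)

    def __len__(self):
        return sum(len(items) for items in self.per_class.values())


# Made-up, heavily imbalanced experience: 90 negatives, 10 positives.
memory = ClassBalancedMemory(mem_size=10)
memory.update(samples=list(range(100)), labels=[0] * 90 + [1] * 10)
print({label: len(items) for label, items in memory.per_class.items()})
```

Under this sketch, the `n_tasks * buffer` sizing mentioned in the issue would simply be passed in as `mem_size`.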
  • Bump numpy from 1.20.3 to 1.22.0

    dependencies 
    opened by dependabot[bot] 0
  • Add Naive with no regularization?

    Maybe add naive with no regularization? I.e. no dropout etc, to enable clearer ablation testing of naive fine tuning and inherent regularization mechanisms vs explicit CL strategy.

    opened by iacobo 0
  • CNN fails with kernel_size 5 or 7

    Getting the following error (on GPU) with CNN runs with kernel_size in [5,7]:

    RuntimeError: CUDA error: CUBLAS_STATUS_INVALID_VALUE when calling `cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)`
    

    https://stackoverflow.com/questions/66600362/runtimeerror-cuda-error-cublas-status-execution-failed-when-calling-cublassge?answertab=votes#tab-top

    opened by iacobo 0
  • Add early stopping to avoid over-large number of epochs for diff models

    MLP / LSTM take less time to train than CNN / Transformer. Add early stopping to avoid overtraining and saturation.

    Change strategy to base strategy inheriting from strat and earlystopping plugin.

    opened by iacobo 0
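The early-stopping idea in the issue above can be sketched independently of any framework as a patience counter on the validation loss. The `EarlyStopper` class, its `patience` and `min_delta` parameters, and the loss values are assumptions for illustration only; the issue itself proposes using an early stopping plugin instead.

```python
class EarlyStopper:
    """Signal a stop once validation loss has not improved for `patience` epochs."""

    def __init__(self, patience: int = 3, min_delta: float = 0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up validation losses: improvement stalls after the third epoch.
losses = [0.9, 0.7, 0.65, 0.66, 0.67, 0.68, 0.5]
stopper = EarlyStopper(patience=3)
stopped_at = None
for epoch, loss in enumerate(losses):
    if stopper.step(loss):
        stopped_at = epoch
        break
print("stopped at epoch index:", stopped_at)
```

With a rule like this, fast-converging models such as the MLP or LSTM would stop early, while slower ones keep training up to the epoch limit.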
  • Correct code for ROC AUC and AUPRC

    ROC AUC and AUPRC cannot be averaged over minibatches as is done for other metrics, since they depend on the threshold. They need to be calculated over all predictions at once. Check e.g. MeanScore for inspiration on metric definition.

    opened by iacobo 0
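The problem described in the issue above can be seen on a small example: averaging per-minibatch ROC AUC gives a different number from computing it once over the pooled predictions. The sketch below uses a hand-written pairwise `roc_auc` function and made-up labels and scores; it is not the repo's metric code.

```python
def roc_auc(y_true, y_score):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC is undefined unless both classes are present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up (labels, scores) for two minibatches with differently scaled scores.
batches = [
    ([1, 0, 0], [0.9, 0.8, 0.1]),
    ([1, 1, 0], [0.3, 0.2, 0.1]),
]

# Averaging the per-batch values only compares samples within each batch.
per_batch = [roc_auc(labels, scores) for labels, scores in batches]
mean_of_batches = sum(per_batch) / len(per_batch)

# Pooling first compares every positive with every negative.
all_true = [t for labels, _ in batches for t in labels]
all_score = [s for _, scores in batches for s in scores]
pooled = roc_auc(all_true, all_score)

print(f"mean of per-batch AUC: {mean_of_batches:.3f}, pooled AUC: {pooled:.3f}")
```

Each batch is perfectly ranked on its own, but the pooled value is lower because a negative from the first batch outscores positives from the second. Accumulating labels and scores over the whole evaluation stream and computing the metric once at the end avoids this.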
  • Need to add code for further experiments

    plotting.plot_demographics()
    
    # Secondary experiments:
    ########################
    # Sensitivity to sequence length (4hr vs 12hr)
    # Sensitivity to replay size Naive -> replay -> Cumulative
    # Sensitivity to hyperparams of reg methods (Tune hyperparams over increasing number of tasks?)
    # Sensitivity to number of variables (full vs Vitals only e.g.)
    # Sensitivity to size of domains - e.g. white ethnicity much larger than all other groups, effect of order of sequence
    
    opened by iacobo 1
  • Ray Tune warnings

    Ray Tune produces the following warnings:

    INFO registry.py:66 -- Detected unknown callable for trainable. Converting to class.
    WARNING experiment.py:295 -- No name detected on trainable. Using DEFAULT.
    

    Non-fatal, but it's annoying to have these messages bloating the console output.

    raytune 
    opened by iacobo 2