Self-Supervised Vision Transformers Learn Visual Concepts in Histopathology (LMRL Workshop, NeurIPS 2021)

Overview

Self-Supervised Vision Transformers Learn Visual Concepts in Histopathology

Self-Supervised Vision Transformers Learn Visual Concepts in Histopathology, LMRL Workshop, NeurIPS 2021. [Workshop] [arXiv]
Richard. J. Chen, Rahul G. Krishnan
@article{chen2022self,
  title={Self-Supervised Vision Transformers Learn Visual Concepts in Histopathology},
  author={Chen, Richard J and Krishnan, Rahul G},
  journal={Learning Meaningful Representations of Life, NeurIPS 2021},
  year={2021}
}
DINO illustration

Summary / Main Findings:

  1. In head-to-head comparison of SimCLR versus DINO, DINO learns more effective pretrained representations for histopathology - likely due to 1) not needing negative samples (histopathology has lots of potential class imbalance), 2) capturing better inductive biases about the part-whole hierarchies of how cells are spatially organized in tissue.
  2. ImageNet features do lag behind SSL methods (in terms of data-efficiency), but are better than you think on patch/slide-level tasks. Transfer learning with ImageNet features (from a truncated ResNet-50 after 3rd residual block) gives very decent performance using the CLAM package.
  3. SSL may help mitigate domain shift from site-specific H&E stainining protocols. With vanilla data augmentations, global structure of morphological subtypes (within each class) are more well-preserved than ImageNet features via 2D UMAP scatter plots.
  4. Self-supervised ViTs are able to localize cell location quite well w/o any supervision. Our results show that ViTs are able to localize visual concepts in histopathology in introspecting the attention heads.

Updates

Stay tuned for more updates :).

  • TBA: Pretrained SimCLR and DINO models on TCGA-Lung (Larger working paper, in submission).
  • TBA: Pretrained SimCLR and DINO models on TCGA-PanCancer (Larger working paper, in submission).
  • TBA: PEP8-compliance (cleaning and organizing code).
  • 03/04/2022: Reproducible and largely-working codebase that I'm satisfied with and have heavily tested.

Pre-Reqs

We use Git LFS to version-control large files in this repository (e.g. - images, embeddings, checkpoints). After installing, to pull these large files, please run:

git lfs pull

Pretrained Models

SIMCLR and DINO models were trained for 100 epochs using their vanilla training recipes in their respective papers. These models were developed on 2,055,742 patches (256 x 256 resolution at 20X magnification) extracted from diagnostic slides in the TCGA-BRCA dataset, and evaluated via K-NN on patch-level datasets in histopathology.

Note: Results should be taken-in w.r.t. to the size of dataset and duraration of training epochs. Ideally, longer training with larger batch sizes would demonstrate larger gains in SSL performance.

Arch SSL Method Dataset Epochs Dim K-NN Download
ResNet-50 Transfer ImageNet N/A 1024 0.935 N/A
ResNet-50 SimCLR TCGA-BRCA 100 2048 0.938 Backbone
ViT-S/16 DINO TCGA-BRCA 100 384 0.941 Backbone

Data Download + Data Preprocessing

For CRC-100K and BreastPathQ, pre-extracted embeddings are already available and processed in ./embeddings_patch_library. See patch_extraction_utils.py on how these patch datasets were processed.

Additional Datasets + Custom Implementation: This codebase is flexible for feature extraction on a variety of different patch datasets. To extend this work, simply modify patch_extraction_utils.py with a custom Dataset Loader for your dataset. As an example, we include BCSS (results not yet updated in this work).

  • BCSS (v1): You can download the BCSS dataset from the official Grand Challenge link. For this dataset, we manually developed the train and test dataset splits and labels using majority-voting. Reproducibility for the raw BCSS dataset may be not exact, but we include the pre-extracted embeddings of this dataset in ./embeddings_patch_library (denoted as version 1).

Evaluation: K-NN Patch-Level Classification on CRC-100K + BreastPathQ

Run the notebook patch_extraction.ipynb, followed by patch_evaluation.ipynb. The evaluation notebook should run "out-of-the-box" with Git LFS.

table2

Evaluation: Slide-Level Classification on TCGA-BRCA (IDC versus ILC)

Install the CLAM Package, followed by using the 10-fold cross-validation splits made available in ./slide_evaluation/10foldcv_subtype/tcga_brca. Tensorboard train + validation logs can visualized via:

tensorboard --logdir ./slide_evaluation/results/
table1

Visualization: Creating UMAPs

Install umap-learn (can be tricky to install if you have incompatible dependencies), followed by using the following code snippet in patch_extraction_utils.py, and is used in patch_extraction.ipynb to create Figure 4.

UMAP

Visualization: Attention Maps

Attention visualizations (reproducing Figure 3) can be performed via walking through the following notebook at attention_visualization_256.ipynb.

Attention Visualization

Issues

  • Please open new threads or report issues directly (for urgent blockers) to [email protected].
  • Immediate response to minor issues may not be available.

Acknowledgements, License & Usage

  • Part of this work was performed while at Microsoft Research. We thank the BioML group at Microsoft Research New England for their insightful feedback.
  • This work is still under submission in a formal proceeding. Still, if you found our work useful in your research, please consider citing our paper at:
@article{chen2022self,
  title={Self-Supervised Vision Transformers Learn Visual Concepts in Histopathology},
  author={Chen, Richard J and Krishnan, Rahul G},
  journal={Learning Meaningful Representations of Life, NeurIPS 2021},
  year={2021}
}

© This code is made available under the GPLv3 License and is available for non-commercial academic purposes.

Owner
Richard Chen
Ph.D. Candidate at Harvard
Richard Chen
BADet: Boundary-Aware 3D Object Detection from Point Clouds (Pattern Recognition 2022)

BADet: Boundary-Aware 3D Object Detection from Point Clouds (Pattern Recognition

Rui Qian 17 Dec 12, 2022
Fibonacci Method Gradient Descent

An implementation of the Fibonacci method for gradient descent, featuring a TKinter GUI for inputting the function / parameters to be examined and a matplotlib plot of the function and results.

Emma 1 Jan 28, 2022
Reproducing-BowNet: Learning Representations by Predicting Bags of Visual Words

Reproducing-BowNet Our reproducibility effort based on the 2020 ML Reproducibility Challenge. We are reproducing the results of this CVPR 2020 paper:

6 Mar 16, 2022
Repository for reproducing `Model-Based Robust Deep Learning`

Model-Based Robust Deep Learning (MBRDL) In this repository, we include the code necessary for reproducing the code used in Model-Based Robust Deep Le

Alex Robey 16 Sep 19, 2022
Implementation of the ivis algorithm as described in the paper Structure-preserving visualisation of high dimensional single-cell datasets.

Implementation of the ivis algorithm as described in the paper Structure-preserving visualisation of high dimensional single-cell datasets.

beringresearch 285 Jan 04, 2023
Real time sign language recognition

The proposed work aims at converting american sign language gestures into English that can be understood by everyone in real time.

Mohit Kaushik 6 Jun 13, 2022
Ivy is a templated deep learning framework which maximizes the portability of deep learning codebases.

Ivy is a templated deep learning framework which maximizes the portability of deep learning codebases. Ivy wraps the functional APIs of existing frameworks. Framework-agnostic functions, libraries an

Ivy 8.2k Jan 02, 2023
An e-commerce company wants to segment its customers and determine marketing strategies according to these segments.

customer_segmentation_with_rfm Business Problem : An e-commerce company wants to

Buse Yıldırım 3 Jan 06, 2022
Simple and Effective Few-Shot Named Entity Recognition with Structured Nearest Neighbor Learning

structshot Code and data for paper "Simple and Effective Few-Shot Named Entity Recognition with Structured Nearest Neighbor Learning", Yi Yang and Arz

ASAPP Research 47 Dec 27, 2022
ICCV2021 - Mining Contextual Information Beyond Image for Semantic Segmentation

Introduction The official repository for "Mining Contextual Information Beyond Image for Semantic Segmentation". Our full code has been merged into ss

55 Nov 09, 2022
Just Randoms Cats with python

Random-Cat Just Randoms Cats with python.

OriCode 2 Dec 21, 2021
Automated Evidence Collection for Fake News Detection

Automated Evidence Collection for Fake News Detection This is the code repo for the Automated Evidence Collection for Fake News Detection paper accept

Mrinal Rawat 2 Apr 12, 2022
🛰️ Awesome Satellite Imagery Datasets

Awesome Satellite Imagery Datasets List of aerial and satellite imagery datasets with annotations for computer vision and deep learning. Newest datase

Christoph Rieke 3k Jan 03, 2023
This repository includes the official project for the paper: TransMix: Attend to Mix for Vision Transformers.

TransMix: Attend to Mix for Vision Transformers This repository includes the official project for the paper: TransMix: Attend to Mix for Vision Transf

Jie-Neng Chen 130 Jan 01, 2023
ECCV2020 paper: Fashion Captioning: Towards Generating Accurate Descriptions with Semantic Rewards. Code and Data.

This repo contains some of the codes for the following paper Fashion Captioning: Towards Generating Accurate Descriptions with Semantic Rewards. Code

Xuewen Yang 56 Dec 08, 2022
Source code for Zalo AI 2021 submission

zalo_ltr_2021 Source code for Zalo AI 2021 submission Solution: Pipeline We use the pipepline in the picture below: Our pipeline is combination of BM2

128 Dec 27, 2022
Implementation of ToeplitzLDA for spatiotemporal stationary time series data.

Code for the ToeplitzLDA classifier proposed in here. The classifier conforms sklearn and can be used as a drop-in replacement for other LDA classifiers. For in-depth usage refer to the learning from

Jan Sosulski 5 Nov 07, 2022
Pytorch implementations of the paper Value Functions Factorization with Latent State Information Sharing in Decentralized Multi-Agent Policy Gradients

LSF-SAC Pytorch implementations of the paper Value Functions Factorization with Latent State Information Sharing in Decentralized Multi-Agent Policy G

Hanhan 2 Aug 14, 2022
The repository contains reproducible PyTorch source code of our paper Generative Modeling with Optimal Transport Maps, ICLR 2022.

Generative Modeling with Optimal Transport Maps The repository contains reproducible PyTorch source code of our paper Generative Modeling with Optimal

Litu Rout 30 Dec 22, 2022
Expand human face editing via Global Direction of StyleCLIP, especially to maintain similarity during editing.

Oh-My-Face This project is based on StyleCLIP, RIFE, and encoder4editing, which aims to expand human face editing via Global Direction of StyleCLIP, e

AiLin Huang 51 Nov 17, 2022