7th place solution of Human Protein Atlas - Single Cell Classification on Kaggle

Overview

kaggle-hpa-2021-7th-place-solution

Code for the 7th place solution to the Human Protein Atlas - Single Cell Classification competition on Kaggle.

A description of the method can be found in this post in the Kaggle discussion forum.

Dataset Preparation

Resize Images

# Resize train images to 768x768
python scripts/hpa_segmenter/create_cell_mask.py resize_image \
    --input_directory data/input/hpa-single-cell-image-classification.zip/train \
    --output_directory data/input/hpa-768768.zip \
    --image_size 768
# Resize train images to 1536x1536
python scripts/hpa_segmenter/create_cell_mask.py resize_image \
    --input_directory data/input/hpa-single-cell-image-classification.zip/train \
    --output_directory data/input/hpa-1536.zip \
    --image_size 1536

# Resize test images to 768x768
python scripts/hpa_segmenter/create_cell_mask.py resize_image \
    --input_directory /kaggle/input/hpa-single-cell-image-classification/test \
    --output_directory data/input/hpa-768-test.zip \
    --image_size 768
# Resize test images to 1536x1536
python scripts/hpa_segmenter/create_cell_mask.py resize_image \
    --input_directory /kaggle/input/hpa-single-cell-image-classification/test \
    --output_directory data/input/hpa-1536-test.zip \
    --image_size 1536

You can specify a directory in a zip file in the same way as a normal directory.
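
As a rough illustration of what such a path means, the sketch below splits a path like data/input/hpa-1536.zip/train into the archive and the directory inside it, then lists the matching members with Python's standard zipfile module. The helper names split_zip_path and list_files are hypothetical, and this is not necessarily how the repository's scripts resolve paths.

```python
import zipfile
from pathlib import PurePosixPath


def split_zip_path(path):
    """Split 'a/b.zip/inner/dir' into ('a/b.zip', 'inner/dir').

    Returns (None, path) when no path component ends with '.zip'.
    Hypothetical helper, for illustration only.
    """
    parts = PurePosixPath(path).parts
    for i, part in enumerate(parts):
        if part.lower().endswith(".zip"):
            archive = str(PurePosixPath(*parts[: i + 1]))
            inner = "/".join(parts[i + 1:])
            return archive, inner
    return None, path


def list_files(path):
    """List file names under a directory that lives inside a zip archive."""
    archive, inner = split_zip_path(path)
    if archive is None:
        raise ValueError("plain directories are not handled in this sketch")
    # An empty inner path means "the whole archive".
    prefix = inner.rstrip("/") + "/" if inner else ""
    with zipfile.ZipFile(archive) as zf:
        return sorted(
            name for name in zf.namelist()
            if name.startswith(prefix) and not name.endswith("/")
        )
```

With this convention, list_files("data/input/hpa-1536.zip/train") would return only the archive members stored under train/.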

Download Public HPA

Download all images listed in kaggle_2021.tsv in this dataset, resize them to 768x768 and 1536x1536, and archive them as data/input/hpa-public-768.zip and data/input/hpa-public-1536.zip.

Create Cell Mask

# Create cell masks for the Kaggle train set with 1536x1536
python scripts/hpa_segmenter/create_cell_mask.py create_cell_mask \
    --input_directory data/input/hpa-1536.zip \
    --output_directory data/input/hpa-1536-mask-v2.zip \
    --label_cell_scale_factor 1.0

# Resize the masks to 768x768
python scripts/hpa_segmenter/create_cell_mask.py resize_cell_mask \
    --input_directory data/input/hpa-1536-mask-v2.zip \
    --output_directory data/input/hpa-768-mask-v2-from-1536.zip \
    --image_size 768

# Create cell masks for the Public HPA dataset with 1536x1536
python scripts/hpa_segmenter/create_cell_mask.py create_cell_mask \
    --input_directory data/input/hpa-public-1536.zip/hpa-public-1536 \
    --output_directory data/input/hpa-public-1536-mask-v2.zip \
    --label_cell_scale_factor 1.0

# Resize the masks to 768x768
python scripts/hpa_segmenter/create_cell_mask.py resize_cell_mask \
    --input_directory data/input/hpa-public-1536-mask-v2.zip \
    --output_directory data/input/hpa-public-768-mask-v2-from-1536.zip \
    --image_size 768

# Create cell masks for the test set with the original resolution
# Run with `--label_cell_scale_factor 0.5` to save inference time
python scripts/hpa_segmenter/create_cell_mask.py create_cell_mask \
    --input_directory /kaggle/input/hpa-single-cell-image-classification/test \
    --output_directory data/input/hpa-test-mask-v2.zip \
    --label_cell_scale_factor 0.5

# Resize the masks to 1536x1536
python scripts/hpa_segmenter/create_cell_mask.py resize_cell_mask \
    --input_directory data/input/hpa-test-mask-v2.zip \
    --output_directory data/input/hpa-test-mask-v2-1536.zip \
    --image_size 1536

# Resize the masks to 768x768
python scripts/hpa_segmenter/create_cell_mask.py resize_cell_mask \
    --input_directory data/input/hpa-test-mask-v2.zip \
    --output_directory data/input/hpa-test-mask-v2-768.zip \
    --image_size 768

Create Input for Cell-level Classifier

# Create cell-level inputs for the Kaggle train set using the 768x768 images as fixed-scale images.
python scripts/hpa_segmenter/create_cell_mask.py crop_and_resize_cell \
    --image_directory data/input/hpa-768768.zip \
    --cell_mask_directory data/input/hpa-768-mask-v2-from-1536.zip \
    --output_directory data/input/hpa-cell-crop-v2-192-from-768.zip \
    --image_size 192

# Create cell-level inputs for the Public HPA dataset using the 768x768 images as fixed-scale images.
python scripts/hpa_segmenter/create_cell_mask.py crop_and_resize_cell \
    --image_directory data/input/hpa-public-768.zip \
    --cell_mask_directory data/input/hpa-public-768-mask-v2-from-1536.zip \
    --output_directory data/input/hpa-public-cell-crop-v2-192-from-768.zip \
    --image_size 192

# Create cell-level inputs for the Kaggle train set using the 1536x1536 images as fixed-scale images.
python scripts/hpa_segmenter/create_cell_mask.py crop_and_resize_cell \
    --image_directory data/input/hpa-1536.zip \
    --cell_mask_directory data/input/hpa-1536-mask-v2.zip \
    --output_directory data/input/hpa-cell-crop-v2-192-from-1536.zip \
    --image_size 192

# Create cell-level inputs for the Public HPA dataset using the 1536x1536 images as fixed-scale images.
python scripts/hpa_segmenter/create_cell_mask.py crop_and_resize_cell \
    --image_directory data/input/hpa-public-1536.zip \
    --cell_mask_directory data/input/hpa-public-1536-mask-v2.zip \
    --output_directory data/input/hpa-public-cell-crop-v2-192-from-1536.zip \
    --image_size 192

# Create cell-level inputs for the test set using the 768x768 images as fixed-scale images.
python scripts/hpa_segmenter/create_cell_mask.py crop_and_resize_cell \
    --image_directory data/input/hpa-768-test.zip \
    --cell_mask_directory data/input/hpa-test-mask-v2-768.zip \
    --output_directory data/input/hpa-test-cell-crop-v2-192-from-768.zip \
    --image_size 192

# Create cell-level inputs for the test set using the 1536x1536 images as fixed-scale images.
python scripts/hpa_segmenter/create_cell_mask.py crop_and_resize_cell \
    --image_directory data/input/hpa-1536-test.zip \
    --cell_mask_directory data/input/hpa-test-mask-v2-1536.zip \
    --output_directory data/input/hpa-test-cell-crop-v2-192-from-1536.zip \
    --image_size 192

Training

# Train image-level classifier
python scripts/cam_consistency_training/run.py train \
    --config_path scripts/cam_consistency_training/configs/${CONFIG_NAME}.yaml

# Train cell-level classifier
python scripts/cell_crop/run.py train \
    --config_path scripts/cell_crop/configs/${CONFIG_NAME}.yaml

To train on multiple GPUs, use a launcher such as torch.distributed.launch and pass the --local_rank option. You can override any field in the config by passing an argument of the form field_name=${value} (e.g. fold_index=1).

We trained 5 folds for every model used in the final submission pipeline. The config files are located in scripts/cam_consistency_training/configs and scripts/cell_crop/configs. We trained the models in the following order.

  1. scripts/cam_consistency_training/configs/eff-b2-focal-alpha1-cutmix-pubhpa-maskv2.yaml
  2. scripts/cam_consistency_training/configs/eff-b5-focal-alpha1-cutmix-pubhpa-maskv2.yaml
  3. scripts/cam_consistency_training/configs/eff-b7-focal-alpha1-cutmix-pubhpa-maskv2.yaml
  4. scripts/cam_consistency_training/configs/eff-b2-cutmix-pubhpa-768-to-1536.yaml
  5. Run predict_valid and concat_valid_predictions (described below) for each model and save the average of the output files under data/working/consistency_training/b2-1536-b2-b5-b7-768-avg/.
  6. scripts/cam_consistency_training/configs/eff-b2-focal-stage2-b2b2b5b7avg.yaml
  7. scripts/cell_crop/configs/resnest50-bce-from768-cutmix-softpl.yaml
  8. Run predict_valid and concat_valid_predictions for each model and save the average of the output files under data/working/image-level-and-cell-crop-both-5folds/.
  9. scripts/cam_consistency_training/configs/eff-b2-focal-stage3.yaml
  10. scripts/cam_consistency_training/configs/eff-b2-focal-stage3-cos.yaml
  11. scripts/cell_crop/configs/resnest50-bce-from768-stage3.yaml
  12. scripts/cell_crop/configs/resnest50-bce-from1536-stage3-cos.yaml
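
The field_name=${value} overrides described in this section can be pictured with the sketch below, which applies key=value strings to a config dictionary. The function apply_overrides and every example field other than fold_index are hypothetical, and the repository's actual config loader may behave differently (for example, it may support nested keys or stricter type checking).

```python
import ast


def apply_overrides(config, overrides):
    """Return a copy of `config` updated with 'key=value' strings.

    Values are parsed as Python literals when possible (1 -> int,
    0.5 -> float, True -> bool) and kept as strings otherwise.
    Hypothetical helper, not the repository's implementation.
    """
    updated = dict(config)
    for item in overrides:
        key, sep, raw = item.partition("=")
        if not sep or not key:
            raise ValueError(f"expected field_name=value, got {item!r}")
        if key not in updated:
            raise KeyError(f"unknown config field: {key}")
        try:
            value = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            value = raw  # not a literal, keep the raw string
        updated[key] = value
    return updated


# Example fields; only fold_index is taken from the text, the rest are assumed.
base = {"fold_index": 0, "lr": 0.001, "arch": "efficientnet-b2"}
print(apply_overrides(base, ["fold_index=1", "arch=efficientnet-b5"]))
```

Rejecting unknown keys is a design choice in this sketch: it turns a mistyped field name into an immediate error instead of a silently ignored override.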

Inference

Validation Set

# Image-level classifier inference
python scripts/cam_consistency_training/run.py predict_valid \
    --config_path scripts/cam_consistency_training/configs/${CONFIG_NAME}.yaml

# Cell-level classifier inference
python scripts/cell_crop/run.py predict_valid \
    --config_path scripts/cell_crop/configs/${CONFIG_NAME}.yaml

# Concatenate the predictions for each fold to obtain the OOF prediction for the entire training data
python scripts/cam_consistency_training/run.py concat_valid_predictions \
    --config_path scripts/cam_consistency_training/configs/${CONFIG_NAME}.yaml
python scripts/cell_crop/run.py concat_valid_predictions \
    --config_path scripts/cell_crop/configs/${CONFIG_NAME}.yaml
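
Conceptually, concatenating the per-fold validation predictions yields exactly one out-of-fold (OOF) prediction per training sample, because each sample belongs to the validation split of exactly one fold. The sketch below shows that idea, plus the averaging of several models' OOF predictions used between training stages, on plain dictionaries. The data layout (sample ID mapped to a list of class probabilities) and both function names are assumptions, not the format the scripts actually write.

```python
def concat_fold_predictions(fold_predictions):
    """Merge per-fold {sample_id: probabilities} dicts into one OOF dict.

    Raises if a sample appears in more than one fold, since that would
    mean the validation splits overlap.
    """
    oof = {}
    for fold_index, predictions in enumerate(fold_predictions):
        for sample_id, probabilities in predictions.items():
            if sample_id in oof:
                raise ValueError(
                    f"{sample_id} appears in more than one fold (fold {fold_index})"
                )
            oof[sample_id] = list(probabilities)
    return oof


def average_predictions(model_predictions):
    """Average class probabilities element-wise across several models."""
    sample_ids = set(model_predictions[0])
    if any(set(p) != sample_ids for p in model_predictions):
        raise ValueError("all models must cover the same samples")
    return {
        sample_id: [
            sum(values) / len(values)
            for values in zip(*(p[sample_id] for p in model_predictions))
        ]
        for sample_id in sample_ids
    }


folds = [
    {"img_a": [0.9, 0.1], "img_b": [0.2, 0.8]},  # validation split of fold 0
    {"img_c": [0.5, 0.5]},                        # validation split of fold 1
]
oof = concat_fold_predictions(folds)
print(sorted(oof))
```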

Test Set

# Image-level classifier inference
python scripts/cam_consistency_training/run.py predict_test \
    --config_path scripts/cam_consistency_training/configs/${CONFIG_NAME}.yaml

# Cell-level classifier inference
python scripts/cell_crop/run.py predict_test \
    --config_path scripts/cell_crop/configs/${CONFIG_NAME}.yaml

# Make our final submission with post-processing
python scripts/average_predictions.py \
    --orig_size_cell_mask_directory data/input/hpa-test-mask-v2.zip \
    "data/working/consistency_training/eff-b2-focal-stage3/0" \
    "data/working/consistency_training/eff-b2-focal-stage3/1" \
    "data/working/consistency_training/eff-b2-focal-stage3/2" \
    "data/working/consistency_training/eff-b2-focal-stage3/3" \
    "data/working/consistency_training/eff-b2-focal-stage3/4" \
    "data/working/consistency_training/eff-b2-focal-stage3-cos/0" \
    "data/working/consistency_training/eff-b2-focal-stage3-cos/1" \
    "data/working/consistency_training/eff-b2-focal-stage3-cos/2" \
    "data/working/consistency_training/eff-b2-focal-stage3-cos/3" \
    "data/working/consistency_training/eff-b2-focal-stage3-cos/4" \
    "data/working/cell_crop/resnest50-bce-from768-stage3/0" \
    "data/working/cell_crop/resnest50-bce-from768-stage3/1" \
    "data/working/cell_crop/resnest50-bce-from768-stage3/2" \
    "data/working/cell_crop/resnest50-bce-from768-stage3/3" \
    "data/working/cell_crop/resnest50-bce-from768-stage3/4" \
    "data/working/cell_crop/resnest50-bce-from1536-stage3-cos/0" \
    "data/working/cell_crop/resnest50-bce-from1536-stage3-cos/1" \
    "data/working/cell_crop/resnest50-bce-from1536-stage3-cos/2" \
    "data/working/cell_crop/resnest50-bce-from1536-stage3-cos/3" \
    "data/working/cell_crop/resnest50-bce-from1536-stage3-cos/4" \
    --edge_area_threshold 80000 --center_area_threshold 32000
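
Because the 20 positional arguments follow a regular pattern (four models, five folds each), the command can also be assembled programmatically. The sketch below builds the same argument list and only prints the resulting command. The variable names are illustrative; the flags and paths are copied from the command above.

```python
import shlex

# Model output directories, copied from the command above.
MODEL_DIRS = [
    "data/working/consistency_training/eff-b2-focal-stage3",
    "data/working/consistency_training/eff-b2-focal-stage3-cos",
    "data/working/cell_crop/resnest50-bce-from768-stage3",
    "data/working/cell_crop/resnest50-bce-from1536-stage3-cos",
]
NUM_FOLDS = 5

# One prediction directory per (model, fold) pair, in the same order as above.
prediction_dirs = [
    f"{model_dir}/{fold}" for model_dir in MODEL_DIRS for fold in range(NUM_FOLDS)
]

command = [
    "python", "scripts/average_predictions.py",
    "--orig_size_cell_mask_directory", "data/input/hpa-test-mask-v2.zip",
    *prediction_dirs,
    "--edge_area_threshold", "80000",
    "--center_area_threshold", "32000",
]

# Print the command for inspection; pass `command` to subprocess.run to execute it.
print(shlex.join(command))
```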

Use the code on Kaggle Notebook

Use Docker to zip the source code together with the wheels of its dependencies, then upload the resulting archive as a Kaggle dataset.

docker run --rm -it -v /path/to/this/repo:/tmp/workspace -w /tmp/workspace/ gcr.io/kaggle-images/python bash ./build_zip.sh

In a Kaggle Notebook, copy the code from the uploaded dataset as shown below. You can then run it the same way as in your local environment.

# Make a working directory
!mkdir -p /kaggle/tmp

# Change the current directory
%cd /kaggle/tmp

# Copy source code from the uploaded dataset
!cp -r /kaggle/input/<your-dataset-name>/* .

# You can now run the scripts just as in a local environment
!python scripts/hpa_segmenter/create_cell_mask.py create_cell_mask ...