A boosting-based Multiple Instance Learning (MIL) package that includes MIL-Boost and MCIL-Boost

MCILBoost

Project | CVPR Paper | MIA Paper
Contact: Jun-Yan Zhu (junyanz at cs dot cmu dot edu)

Overview

This is the authors' implementation of the MCIL-Boost method described in:
[1] Multiple Clustered Instance Learning for Histopathology Cancer Image Segmentation, Clustering, and Classification.
Yan Xu*, Jun-Yan Zhu*, Eric Chang, and Zhuowen Tu (*equal contribution)
In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012.

[2] Weakly Supervised Histopathology Cancer Image Segmentation and Classification
Yan Xu, Jun-Yan Zhu, Eric I-Chao Chang, Maode Lai, and Zhuowen Tu
In Medical Image Analysis, 2014.

Please cite our papers if you use our code for your research.

This package consists of the following two multiple-instance learning (MIL) methods:

  • MIL-Boost [Viola et al. 2006]: set c = 1
  • MCIL-Boost [1] [2]: set c > 1

The core of this package is a command-line interface written in C++. Various Matlab helper functions are provided to help users train and test an MCIL-Boost model, perform cross-validation, and evaluate performance.

System Requirements

  • Linux or Windows.
  • For Linux, the code was compiled with gcc 4.8.2 under Ubuntu 14.04.

Installation

  • Download and unzip the code.
    • For Linux users, type "chmod +x MCILBoost".
  • Open Matlab and run "demoToy.m".
  • To use the command-line interface, see "Command Usage".
  • To use the Matlab functions, see "Matlab helper functions". You can modify "SetParamsToy.m" and "demoToy.m" to run your own experiments.

Quick Examples

(Windows: MCILBoost.exe; Linux: ./MCILBoost)
An example for training:
MCILBoost.exe -v 2 -t 0 -c 2 -n 150 -s 0 -r 20 toy.data toy.model
An example for testing:
MCILBoost.exe -v 2 -t 1 -c 2 toy.data toy.model toy.result
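
The two commands above can also be assembled from a script. The following is a minimal Python sketch, not part of this package: `build_command` and its keyword names are hypothetical, and it only uses the flags documented in this README. In test mode it leaves out c, n, s, and r, since the program reads them from the model file.

```python
import shlex


def build_command(mode, data_file, model_file, result_file=None, *,
                  exe="./MCILBoost", verbose=2, clusters=2,
                  weak_clfs=150, softmax=0, exponent=20):
    """Return the argument list for one MCILBoost run (hypothetical helper)."""
    if mode not in ("train", "test"):
        raise ValueError("mode must be 'train' or 'test'")
    cmd = [exe, "-v", str(verbose), "-t", "0" if mode == "train" else "1"]
    if mode == "train":
        # Training-only options; at test time they come from the model file.
        cmd += ["-c", str(clusters), "-n", str(weak_clfs),
                "-s", str(softmax), "-r", str(exponent)]
    cmd += [data_file, model_file]
    if mode == "test" and result_file is not None:
        cmd.append(result_file)
    return cmd


train_cmd = build_command("train", "toy.data", "toy.model")
test_cmd = build_command("test", "toy.data", "toy.model", "toy.result")
print(shlex.join(train_cmd))
print(shlex.join(test_cmd))
```

Either list can be passed to `subprocess.run(cmd, check=True)`; on Windows, pass `exe="MCILBoost.exe"`.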

Command Usage ([ ]: options)

MCILBoost.exe [-v verbose] [-t mode] [-c #clusters] [-n #weakClfs] [-s softmax] [-r exponent] data_file model_file [result_file]

(There is no need to specify c, n, s, or r for testing, as the program copies these parameters from model_file.)

-v verbose: set how much runtime output is shown (default = 1)
    0 -- no output
    1 -- some output
    2 -- more output

-t mode: set the mode (default = 0)
    0 -- train a model
    1 -- test a model

-c #clusters: set the number of clusters in positive bags (default = 1)
    c = 1 -- train a MIL-Boost model
    c > 1 -- train an MCIL-Boost model with multiple clusters

-n #weakClfs: set the maximum number of weak classifiers (default = 150)

-s softmax: set the softmax type (default s = 0)
    0 -- GM (generalized mean)
    1 -- LSE (log-sum-exp)

-r exponent: set the exponent used in GM and LSE (default r = 20)

data_file: set the path for input data.

model_file: set the path for the model file.

result_file: set the path for the result file. If result_file is not specified, result_file = data_file + '.result'.
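
To give a feel for what -s and -r control, here is a small Python sketch of the two softmax types as they are usually written in the MIL-Boost literature: GM as the r-th power mean and LSE as a scaled log of the mean exponential. The function names and the test values are illustrative assumptions; this is not the package's C++ code.

```python
import math


def gm(values, r=20.0):
    """Generalized mean: (mean(v ** r)) ** (1 / r), for values in [0, 1]."""
    m = len(values)
    return (sum(v ** r for v in values) / m) ** (1.0 / r)


def lse(values, r=20.0):
    """Log-sum-exp: (1 / r) * log(mean(exp(r * v)))."""
    m = len(values)
    peak = max(values)  # factor out the max so exp() cannot overflow
    total = sum(math.exp(r * (v - peak)) for v in values)
    return peak + math.log(total / m) / r


probs = [0.1, 0.9]  # illustrative instance probabilities
for r in (5.0, 20.0, 200.0):
    print(r, gm(probs, r), lse(probs, r))
```

Both functions stay below max(values) and move towards it as r grows, which is why a larger exponent makes the bag-level prediction behave more like a hard max over instances.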

Matlab helper functions

  • MCILBoost.m: main entry function: model training/testing, and cross-validation.
  • SetParams.m: Set parameters for MCILBoost.m. You need to modify this file to run your own experiment.
  • TrainModel.m: train a model; calls the MCIL-Boost command line.
  • TestModel.m: test a model; calls the MCIL-Boost command line.
  • CrossValidate.m: split the data into n folds, perform n-fold cross-validation, and report performance.
  • ReadData.m: read Matlab data from a text file.
  • WriteData.m: write Matlab data to a text file.
  • ReadResult.m: read Matlab result data from a text file.
  • MeasureResult.m: evaluate performance in terms of accuracy and AUC (area under the curve).
  • AUC: compute the area under the ROC curve given predictions and ground truth labels.
  • demoToy.m: demo script for toy data.
  • SetParamsToy.m: set parameters for demoToy.
  • demo1.m: demo script for Fox, Tiger, Elephant experiment.
  • SetParamsDemo1.m: set parameters for demo1.
  • demo2.m: demo script for SIVAL experiment.
  • SetParamsDemo2.m: set parameters for demo2.
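
For readers who want to check bag-level results outside Matlab, the two measures reported by the evaluation helpers can be sketched in a few lines of Python. This is an independent illustration, not a port of MeasureResult.m: the 0.5 decision threshold and the function names are assumptions, and AUC is computed as the fraction of correctly ordered (positive, negative) pairs.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of bags whose thresholded score matches the label (> 0 means positive)."""
    hits = sum((s >= threshold) == (y > 0) for y, s in zip(labels, scores))
    return hits / len(labels)


def auc(labels, scores):
    """Area under the ROC curve via pairwise ranking; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y > 0]
    neg = [s for y, s in zip(labels, scores) if y <= 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative bag")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


labels = [0, 1, 0, 1]           # illustrative ground truth
scores = [0.2, 0.3, 0.4, 0.9]   # illustrative bag-level predictions
print(accuracy(labels, scores), auc(labels, scores))
```

The pairwise form is quadratic in the number of bags, which is fine for small benchmarks; a rank-based version would be preferable for large datasets.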

Summary of Benchmark Results

  • I provide two scripts for running experiments on publicly available MIL benchmarks.
    • "demo1.m": experiments on the Fox, Tiger, and Elephant datasets.
      MIL-Boost achieved 0.61 (Fox), 0.81 (Tiger), 0.82 (Elephant) on 10-fold cross-validation over 10 runs.
    • "demo2.m": experiments on the SIVAL dataset. There are 180 positive bags (3 clusters) and 180 negative bags. When multiple clusters appear in positive bags, MCIL-Boost works better than MIL-Boost.
      MIL-Boost (c=1): mean_acc = 0.742, mean_auc = 0.824
      MCIL-Boost (c=3): mean_acc = 0.879, mean_auc = 0.944
  • Note: See "demo1.m" and "demo2.m" for details.

Input Format

  • Note: You can use the Matlab functions "ReadData.m" and "WriteData.m" to read/write Matlab data from/to the text file.
  • Description: the input format is similar to the format used in the LIBSVM and MILL packages. The software also supports a sparse format. The first line specifies the total number of instances and the number of feature dimensions. Each following line represents one instance, which has an instance id, a bag id, and a label (>= 1 for positive bags, and 0 for negative bags). Each feature value is represented as an index:value pair, where index is the index of the feature (starting from 1).
  • Format:
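    <#instances> <#dimensions>
    <instance_id>:<bag_id>:<label> <index>:<value> <index>:<value> ...
    <instance_id>:<bag_id>:<label> <index>:<value> <index>:<value> ...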
  • Example: a toy example that contains two negative bags and two positive bags (see "toy.data"). The negative instance is always (0, 0, 0), while there are two clusters of positive instances, (0, 1, 0) and (0, 0, 1).
    8 3
    0:0:0 1:0 2:0 3:0
    1:0:0 1:0 2:0 3:0
    2:1:0 1:0 2:0 3:0
    3:1:0 1:0 2:0 3:0
    4:2:1 1:0 2:1 3:0
    5:2:1 1:0 2:0 3:0
    6:3:1 1:0 2:0 3:1
    7:3:1 1:0 2:0 3:0
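
If you prepare data outside Matlab, the layout above is easy to write and read directly. The sketch below is a Python illustration with hypothetical helper names (`format_data`, `parse_data`); it is not a port of WriteData.m or ReadData.m. The parser assumes that, in the sparse format, features missing from a line are zero.

```python
def format_data(instances, num_features):
    """instances: list of (instance_id, bag_id, label, dense_feature_list)."""
    lines = [f"{len(instances)} {num_features}"]
    for inst_id, bag_id, label, feats in instances:
        # Feature indices start from 1, as described above.
        pairs = " ".join(f"{i}:{v:g}" for i, v in enumerate(feats, start=1))
        lines.append(f"{inst_id}:{bag_id}:{label} {pairs}")
    return "\n".join(lines) + "\n"


def parse_data(text):
    """Return (num_features, instances) from the text layout."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    num_instances, num_features = map(int, lines[0].split())
    instances = []
    for line in lines[1:]:
        head, *pairs = line.split()
        inst_id, bag_id, label = map(int, head.split(":"))
        feats = [0.0] * num_features  # assumption: omitted features are zero
        for pair in pairs:
            idx, val = pair.split(":")
            feats[int(idx) - 1] = float(val)
        instances.append((inst_id, bag_id, label, feats))
    if len(instances) != num_instances:
        raise ValueError("instance count does not match the header line")
    return num_features, instances


# The toy example: two negative bags (0, 1) and two positive bags (2, 3).
toy = [
    (0, 0, 0, [0, 0, 0]), (1, 0, 0, [0, 0, 0]),
    (2, 1, 0, [0, 0, 0]), (3, 1, 0, [0, 0, 0]),
    (4, 2, 1, [0, 1, 0]), (5, 2, 1, [0, 0, 0]),
    (6, 3, 1, [0, 0, 1]), (7, 3, 1, [0, 0, 0]),
]
text = format_data(toy, 3)
print(text, end="")
```

Running it prints the same nine lines as the toy example.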

Output Format

  • Note: You can use the Matlab function "ReadResult.m" to load the Matlab data from the result file.

  • Description: The software outputs four kinds of predictions (see more details in the paper):

    • overall bag-level prediction p_i (the probability of the bag x_i being positive bag)
    • cluster-wise bag-level prediction p_i^k (the probability of the bag x_i belonging to k-th cluster)
    • overall instance-level prediction p_{ij} (the probability of the instance x_{ij} being positive instance)
    • cluster-wise instance-level prediction p_{ij}^k (the probability of the instance x_{ij} belonging to the k-th cluster)

    In the first line, the software outputs the number of bags and the number of clusters. Then, for each bag, the software outputs the bag-level information and prediction (bag id, number of instances, ground truth label, number of clusters, and p_i). The software also outputs the bag-level prediction for each cluster (cluster id and prediction p_i^k for each cluster). Then, for each instance, the software outputs the instance-level prediction (instance id and prediction p_{ij}) and the instance-level prediction for each cluster (cluster id and prediction p_{ij}^k).
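
As a reading aid, the four predictions can be related through the softmax function g chosen with -s. The relations below are a sketch of the formulation as I read it from [1], with g_j aggregating over the instances of a bag and g_k over clusters; the subscript notation on g is an assumption and does not appear in this README.

```latex
p_{ij} = g_{k}\!\left(p_{ij}^{k}\right), \qquad
p_{i}^{k} = g_{j}\!\left(p_{ij}^{k}\right), \qquad
p_{i} = g_{j}\!\left(p_{ij}\right) = g_{j}\!\left(g_{k}\!\left(p_{ij}^{k}\right)\right)
```

In the toy example output in this section, each aggregated value equals the largest of the values it summarizes, which is what a max-like g would give.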
  • Format:
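    #bags=<#bags> #clusters=<#clusters>
    bag_id=<bag_id> #insts=<#insts> label=<label> #clusters=<#clusters> pred=<p_i>
    cluster_id=<k> pred=<p_i^k> cluster_id=<k> pred=<p_i^k> ...
    inst_id=<j> pred=<p_ij> cluster_id=<k> pred=<p_ij^k> cluster_id=<k> pred=<p_ij^k> ...
    inst_id=<j> pred=<p_ij> cluster_id=<k> pred=<p_ij^k> cluster_id=<k> pred=<p_ij^k> ...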
    ...

  • Example: The output of the toy example:
    #bags=4 #clusters=2
    bag_id=0 #insts=2 label=0 #clusters=2 pred=0
    cluster_id=0 pred=0 cluster_id=1 pred=0
    inst_id=0 pred=0 cluster_id=0 pred=0 cluster_id=1 pred=0
    inst_id=1 pred=0 cluster_id=0 pred=0 cluster_id=1 pred=0
    bag_id=1 #insts=2 label=0 #clusters=2 pred=0
    cluster_id=0 pred=0 cluster_id=1 pred=0
    inst_id=0 pred=0 cluster_id=0 pred=0 cluster_id=1 pred=0
    inst_id=1 pred=0 cluster_id=0 pred=0 cluster_id=1 pred=0
    bag_id=2 #insts=2 label=1 #clusters=2 pred=1
    cluster_id=0 pred=1 cluster_id=1 pred=0
    inst_id=0 pred=1 cluster_id=0 pred=1 cluster_id=1 pred=0
    inst_id=1 pred=0 cluster_id=0 pred=0 cluster_id=1 pred=0
    bag_id=3 #insts=2 label=1 #clusters=2 pred=1
    cluster_id=0 pred=0 cluster_id=1 pred=1
    inst_id=0 pred=1 cluster_id=0 pred=0 cluster_id=1 pred=1
    inst_id=1 pred=0 cluster_id=0 pred=0 cluster_id=1 pred=0
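
The result file can also be loaded without Matlab. The following Python sketch parses the layout shown in the example; `parse_result` and the dictionary keys are hypothetical names, and the parser assumes every line follows the key=value layout of the toy output exactly.

```python
TOY_RESULT = """\
#bags=4 #clusters=2
bag_id=0 #insts=2 label=0 #clusters=2 pred=0
cluster_id=0 pred=0 cluster_id=1 pred=0
inst_id=0 pred=0 cluster_id=0 pred=0 cluster_id=1 pred=0
inst_id=1 pred=0 cluster_id=0 pred=0 cluster_id=1 pred=0
bag_id=1 #insts=2 label=0 #clusters=2 pred=0
cluster_id=0 pred=0 cluster_id=1 pred=0
inst_id=0 pred=0 cluster_id=0 pred=0 cluster_id=1 pred=0
inst_id=1 pred=0 cluster_id=0 pred=0 cluster_id=1 pred=0
bag_id=2 #insts=2 label=1 #clusters=2 pred=1
cluster_id=0 pred=1 cluster_id=1 pred=0
inst_id=0 pred=1 cluster_id=0 pred=1 cluster_id=1 pred=0
inst_id=1 pred=0 cluster_id=0 pred=0 cluster_id=1 pred=0
bag_id=3 #insts=2 label=1 #clusters=2 pred=1
cluster_id=0 pred=0 cluster_id=1 pred=1
inst_id=0 pred=1 cluster_id=0 pred=0 cluster_id=1 pred=1
inst_id=1 pred=0 cluster_id=0 pred=0 cluster_id=1 pred=0
"""


def pairs(line):
    """Split 'a=1 b=2' into [('a', '1'), ('b', '2')]; keys may repeat."""
    return [tuple(tok.split("=", 1)) for tok in line.split()]


def parse_result(text):
    lines = iter(ln for ln in text.splitlines() if ln.strip())
    header = dict(pairs(next(lines)))
    bags = []
    for _ in range(int(header["#bags"])):
        info = dict(pairs(next(lines)))
        # Cluster-wise bag-level predictions p_i^k, in cluster order.
        cluster_preds = [float(v) for k, v in pairs(next(lines)) if k == "pred"]
        instances = []
        for _ in range(int(info["#insts"])):
            toks = pairs(next(lines))
            # toks[0] is inst_id, toks[1] is p_ij, the rest are per-cluster pairs.
            instances.append({
                "inst_id": int(toks[0][1]),
                "pred": float(toks[1][1]),
                "cluster_preds": [float(v) for k, v in toks[2:] if k == "pred"],
            })
        bags.append({
            "bag_id": int(info["bag_id"]),
            "label": int(info["label"]),
            "pred": float(info["pred"]),
            "cluster_preds": cluster_preds,
            "instances": instances,
        })
    return bags


bags = parse_result(TOY_RESULT)
for bag in bags:
    print(bag["bag_id"], bag["label"], bag["pred"], bag["cluster_preds"])
```

Each returned bag carries its overall prediction, its per-cluster predictions, and a list of instances with the same two kinds of values.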

Credit

Part of this code is based on the work by Piotr Dollar and Boris Babenko.
