Cross-Task Consistency Learning Framework for Multi-Task Learning

Tested on

  • numpy(v1.19.1)
  • opencv-python(v4.4.0.42)
  • torch(v1.7.0)
  • torchvision(v0.8.0)
  • tqdm(v4.48.2)
  • matplotlib(v3.3.1)
  • seaborn(v0.11.0)
  • pandas(v1.1.2)

Data

Cityscapes (CS)

Download the Cityscapes dataset and place it in ./data/cityscapes. The folder should contain the following subfolders:

  • RGB images in folder leftImg8bit
  • Segmentation in folder gtFine
  • Disparity maps in folder disparity
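
Before training, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repository: the function name missing_cityscapes_folders is hypothetical, and only the three subfolder names and the ./data/cityscapes path come from the description above.

```python
from pathlib import Path

# Subfolder names taken from the list above.
EXPECTED_SUBFOLDERS = ("leftImg8bit", "gtFine", "disparity")


def missing_cityscapes_folders(root="./data/cityscapes"):
    """Return the expected subfolders that are absent under `root`.

    Hypothetical helper for illustration; the repository may validate
    its inputs differently (or not at all).
    """
    root = Path(root)
    return [name for name in EXPECTED_SUBFOLDERS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_cityscapes_folders()
    if missing:
        print("Missing subfolders:", ", ".join(missing))
    else:
        print("Cityscapes layout looks complete.")
```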

NYU

We use the preprocessed NYUv2 dataset provided by this repo. Download the dataset and place it in ./data/nyu.

Model

The model consists of one encoder (ResNet) and two decoders, one for each task. The decoders output the predictions for each task ("direct predictions"), which are then fed to the TaskTransferNet.
The objective of the TaskTransferNet is to predict the other task from a prediction image (segmentation prediction -> depth prediction, and vice versa). I refer to its outputs as "transferred predictions".
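
The data flow described above can be sketched in PyTorch as follows. This is an illustrative toy, not the repository's implementation: the class name CrossTaskSketch, the layer sizes, the single convolution standing in for the ResNet encoder, and the use of two separate sub-networks for the two transfer directions are all assumptions made to keep the example short. Feeding raw segmentation logits (rather than probabilities) into the transfer network is also an assumption.

```python
import torch
import torch.nn as nn


def conv_block(in_ch, out_ch):
    """3x3 convolution followed by ReLU; keeps the spatial size."""
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU())


class CrossTaskSketch(nn.Module):
    """Toy model: shared encoder, two decoders, two transfer directions."""

    def __init__(self, num_classes=7, width=16):
        super().__init__()
        # Stand-in for the ResNet encoder (assumption: one conv block).
        self.encoder = conv_block(3, width)
        # One decoder per task, producing the "direct predictions".
        self.seg_decoder = nn.Conv2d(width, num_classes, 1)
        self.depth_decoder = nn.Conv2d(width, 1, 1)
        # Transfer directions: segmentation -> depth and depth -> segmentation.
        self.seg_to_depth = nn.Sequential(
            conv_block(num_classes, width), nn.Conv2d(width, 1, 1)
        )
        self.depth_to_seg = nn.Sequential(
            conv_block(1, width), nn.Conv2d(width, num_classes, 1)
        )

    def forward(self, image):
        features = self.encoder(image)
        seg = self.seg_decoder(features)      # direct segmentation
        depth = self.depth_decoder(features)  # direct depth
        t_depth = self.seg_to_depth(seg)      # transferred depth
        t_seg = self.depth_to_seg(depth)      # transferred segmentation
        return seg, depth, t_seg, t_depth
```

Calling the model on a batch of RGB images returns the two direct predictions and the two transferred predictions, all at the input resolution in this simplified version.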

Loss function

When computing the losses, the direct predictions are compared with the target while the transferred predictions are compared with the direct predictions so that they "align themselves".
The total loss combines four terms:

  • direct segmentation loss: CrossEntropyLoss()
  • direct depth loss: L1() or MSE() or logL1() or SmoothL1()
  • transferred segmentation loss: CrossEntropyLoss() or KLDivergence()
  • transferred depth loss: L1() or SSIM()

* Label smoothing: softens the one-hot target by taking some probability from the correct class and distributing it among the other classes.
* SSIM: loss based on the Structural Similarity index
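
As a concrete illustration of the label smoothing note, here is a minimal pure-Python sketch. The function name smooth_one_hot is hypothetical, and spreading the removed probability uniformly over the other classes is an assumption; the repository may use a different variant.

```python
def smooth_one_hot(label, num_classes, smoothing=0.0):
    """Return a smoothed target distribution for a single class index.

    The correct class keeps 1 - smoothing; the removed probability is
    split evenly among the remaining num_classes - 1 classes.
    """
    if num_classes < 2:
        raise ValueError("num_classes must be at least 2")
    if not 0.0 <= smoothing < 1.0:
        raise ValueError("smoothing must be in [0, 1)")
    off_value = smoothing / (num_classes - 1)
    return [1.0 - smoothing if c == label else off_value
            for c in range(num_classes)]


if __name__ == "__main__":
    # With smoothing=0.0 (the default of the label_smoothing flag),
    # the target stays one-hot.
    print(smooth_one_hot(2, 5, smoothing=0.0))
    print(smooth_one_hot(2, 5, smoothing=0.1))
```

The smoothed vector still sums to 1, so it can be used as a soft target for a cross-entropy or KL-divergence loss.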

Flags

The flags are the same for both datasets. The flags and their usage are listed below.

Flag Name | Usage | Comments
input_path | Path to dataset | default is data/cityscapes (CS) or data/nyu (NYU)
height | height of prediction | default: 128 (CS) or 288 (NYU)
width | width of prediction | default: 256 (CS) or 384 (NYU)
epochs | # of epochs | default: 250 (CS) or 100 (NYU)
enc_layers | which encoder to use | default: 34, can choose from 18, 34, 50, 101, 152
use_pretrain | toggle on to use pretrained encoder weights | available for both datasets
batch_size | batch size | default: 8 (CS) or 6 (NYU)
scheduler_step_size | step size for scheduler | default: 80 (CS) or 60 (NYU), note that we use StepLR
scheduler_gamma | decay rate of scheduler | default: 0.5
alpha | weight of adding transferred depth loss | default: 0.01 (CS) or 0.0001 (NYU)
gamma | weight of adding transferred segmentation loss | default: 0.01 (CS) or 0.0001 (NYU)
label_smoothing | amount of label smoothing | default: 0.0
lp | loss fn for direct depth loss | default: L1, can choose from L1, MSE, logL1, smoothL1
tdep_loss | loss fn for transferred depth loss | default: L1, can choose from L1 or SSIM
tseg_loss | loss fn for transferred segmentation loss | default: cross, can choose from cross or kl
batch_norm | toggle to enable batch normalization layer in TaskTransferNet | slightly improves segmentation task
wider_ttnet | toggle to double the # of channels in TaskTransferNet |
uncertainty_weights | toggle to use uncertainty weights (Kendall, et al. 2018) | we used this for best results
gradnorm | toggle to use GradNorm (Chen, et al. 2018) |
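
For readers unfamiliar with how such flags are usually wired up, the sketch below shows a subset of them declared with argparse, using the Cityscapes defaults listed above. It is an assumption about how the scripts might parse their arguments, not a copy of the repository's parser; the function name build_parser is hypothetical.

```python
import argparse


def build_parser():
    """Illustrative parser covering a subset of the flags (Cityscapes defaults)."""
    parser = argparse.ArgumentParser(
        description="Cross-task consistency training (sketch)"
    )
    parser.add_argument("--input_path", type=str, default="data/cityscapes")
    parser.add_argument("--height", type=int, default=128)
    parser.add_argument("--width", type=int, default=256)
    parser.add_argument("--epochs", type=int, default=250)
    parser.add_argument("--enc_layers", type=int, default=34,
                        choices=[18, 34, 50, 101, 152])
    parser.add_argument("--batch_size", type=int, default=8)
    parser.add_argument("--alpha", type=float, default=0.01,
                        help="weight of the transferred depth loss")
    parser.add_argument("--gamma", type=float, default=0.01,
                        help="weight of the transferred segmentation loss")
    parser.add_argument("--lp", type=str, default="L1",
                        choices=["L1", "MSE", "logL1", "smoothL1"])
    parser.add_argument("--tseg_loss", type=str, default="cross",
                        choices=["cross", "kl"])
    # Toggles are off unless the flag is passed on the command line.
    parser.add_argument("--uncertainty_weights", action="store_true")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(vars(args))
```

Toggle-style flags such as uncertainty_weights are modelled here with action="store_true", so passing the flag without a value switches the option on.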

Training

Cityscapes

For the Cityscapes dataset, the segmentation task comes in two versions: 7 classes and 19 classes (use the flag 'num_classes' to switch; the default is 7).
So far, the results are near-SOTA for the 7-class segmentation task + depth estimation.

ResNet34 was used as the encoder, L1() for direct depth loss and CrossEntropyLoss() for transferred segmentation loss.
The hyperparameter weights for both transferred predictions were 0.01.
I used Adam as my optimizer with an initial learning rate of 0.0001 and trained for 250 epochs with batch size 8. The learning rate was halved every 80 epochs.
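
Read together with the Loss function section, the objective under these settings can be sketched as below. The notation is introduced here for illustration only: \hat{s} and \hat{d} are the direct segmentation and depth predictions, s and d are the targets, and T_{d \to s}, T_{s \to d} denote the two directions of the TaskTransferNet. Unit weights on the two direct terms and L1 for the transferred depth term (the default of tdep_loss) are assumptions; with --uncertainty_weights the task weighting is learned rather than fixed.

```latex
\mathcal{L}_{\text{total}}
  = \underbrace{\mathrm{CE}(\hat{s}, s)}_{\text{direct segmentation}}
  + \underbrace{\lVert \hat{d} - d \rVert_1}_{\text{direct depth}}
  + \gamma \, \underbrace{\mathrm{CE}\bigl(T_{d \to s}(\hat{d}), \hat{s}\bigr)}_{\text{transferred segmentation}}
  + \alpha \, \underbrace{\lVert T_{s \to d}(\hat{s}) - \hat{d} \rVert_1}_{\text{transferred depth}},
\qquad \alpha = \gamma = 0.01
```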

To reproduce these results, run:

python main_cross_cs.py --uncertainty_weights

NYU

Our results show SOTA performance on the NYU dataset.

ResNet34 was used as the encoder, L1() for direct depth loss and CrossEntropyLoss() for transferred segmentation loss.
The hyperparameter weights for both transferred predictions were 0.0001.
I used Adam as my optimizer with an initial learning rate of 0.0001 and trained for 100 epochs with batch size 6. The learning rate was halved every 60 epochs.
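
The learning-rate schedule described above follows the usual StepLR rule, where the rate is multiplied by the decay factor once every step_size epochs. The closed form below is a small sketch of that rule with the NYU settings as defaults; the function name step_lr is hypothetical and the code does not depend on PyTorch.

```python
def step_lr(epoch, base_lr=1e-4, step_size=60, gamma=0.5):
    """Learning rate at a 0-indexed epoch under a StepLR-style schedule.

    Defaults follow the NYU settings in the text; for Cityscapes,
    pass step_size=80.
    """
    return base_lr * gamma ** (epoch // step_size)


if __name__ == "__main__":
    for epoch in (0, 59, 60, 99):
        print(epoch, step_lr(epoch))
```

With 100 epochs and a step size of 60, the rate is halved exactly once on NYU; with 250 epochs and a step size of 80 on Cityscapes, it is halved three times.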

To reproduce these results, run:

python main_cross_nyu.py --uncertainty_weights

Comparisons

Evaluation metrics are the following:

Segmentation

  • Pixel accuracy (Pix Acc): percentage of pixels with the correct label
  • mIoU: mean Intersection over Union

Depth

  • Absolute Error (Abs)
  • Absolute Relative Error (Abs Rel): Absolute error divided by ground truth depth

The results are the following:

Cityscapes

Models mIoU Pix Acc Abs Abs Rel
MTAN 53.04 91.11 0.0144 33.63
KD4MTL 52.71 91.54 0.0139 27.33
PCGrad 53.59 91.45 0.0171 31.34
AdaMT-Net 62.53 94.16 0.0125 22.23
Ours 66.51 93.56 0.0122 19.40

NYU

Models mIoU Pix Acc Abs Abs Rel
MTAN* 21.07 55.70 0.6035 0.2472
MTAN† 20.10 53.73 0.6417 0.2758
KD4MTL* 20.75 57.90 0.5816 0.2445
KD4MTL† 22.44 57.32 0.6003 0.2601
PCGrad* 20.17 56.65 0.5904 0.2467
PCGrad† 21.29 54.07 0.6705 0.3000
AdaMT-Net* 21.86 60.35 0.5933 0.2456
AdaMT-Net† 20.61 58.91 0.6136 0.2547
Ours† 30.31 63.02 0.5954 0.2235

*: Trained on 3 tasks (segmentation, depth, and surface normal)
†: Trained on 2 tasks (segmentation and depth)
Italic: Reproduced by ourselves

Scores for models trained on 3 tasks on the NYU dataset are shown for reference only.

Referenced papers

MTAN: [paper][github]
KD4MTL: [paper][github]
PCGrad: [paper][github (tensorflow)][github (pytorch)]
AdaMT-Net: [paper]

Owner
Aki Nakano
Student at the University of Tokyo pursuing master's degree. Joined UC Berkeley Summer Session 2019. Researching deep learning. Python/R