Codebase for the paper "Continual Learning via Local Module Composition"

Overview

This repository contains the codebase for the paper Continual Learning via Local Module Composition.


Setting up the environment

Create a new conda environment and install the requirements.

conda create --name ENV python=3.7
conda activate ENV
pip install -r requirements.txt
pip install -e Utils/ctrl/
pip install Utils/nngeometry/
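
After installation, a quick check that the packages can be located may save a failed run later. The sketch below is not part of the repository. The import names `ctrl` and `nngeometry` are assumed from the install paths above, and `torch` is assumed to be listed in requirements.txt; adjust the list if your setup differs.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the names from `names` that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed top-level import names; they are not confirmed by this README.
    expected = ["torch", "ctrl", "nngeometry"]
    print("Python:", sys.version.split()[0])
    missing = missing_packages(expected)
    print("Missing packages:", missing if missing else "none")
```

Run it inside the activated conda environment so that the interpreter matches the one used for the experiments.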

CTrL Benchmark

All experiments were run on NVIDIA Quadro RTX 8000 GPUs. To run the CTrL experiments, use the following commands for the different streams:

Stream S-

LMC (task agnostic)

python main_transfer.py --activate_after_str_oh=0 --momentum_bn 0.1 --track_running_stats_bn 1 --pr_name lmc_cr --shuffle_test 0 --init_oh=none --task_sequence s_minus --momentum_bn_decoder=0.1 --activation_structural=sigmoid --deviation_threshold=4 --depth=4 --epochs=100 --fix_layers_below_on_addition=0 --hidden_size=64 --lr=0.001 --mask_str_loss=1 --module_init=mean --multihead=gated_linear --normalize_oh=1 --optmize_structure_only_free_modules=1 --projection_layer_oh=0 --projection_phase_length=20 --reg_factor=10  --running_stats_steps=100 --str_prior_factor=1 --str_prior_temp=0.1 --structure_inv=ae --structure_inv_oh=linear_no_act --task_agnostic_test=1 --temp=0.1 --wdecay=0.001

(test acc. 0.6863, 15 modules)

MNTDP (task aware)

python main_transfer_mntdp.py --momentum_bn 0.1 --pr_name lmc_cr --copy_batchstats 1 --track_running_stats_bn 1 --task_sequence s_minus --gating MNTDP --shuffle_test 0 --epochs 100 --lr 1e-3 --wdecay 1e-3

(test acc. 0.667, 12 modules)
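
The command lines in this section are long, so it can be easier to keep the flags in a dictionary and assemble the command from it. The helper below is a hypothetical convenience, not part of the repository. It emits every flag in the `--key=value` form, which the commands in this README already use alongside `--key value`. Only a subset of the S- LMC flags is shown; the full list is in the command above.

```python
import shlex


def build_command(script, params):
    """Assemble `python <script> --key=value ...` from a dict of flags."""
    parts = ["python", script]
    for key, value in params.items():
        parts.append(f"--{key}={value}")
    # Quote each part so the result is safe to paste into a shell.
    return " ".join(shlex.quote(part) for part in parts)


# Subset of the S- LMC flags, copied from the command above.
s_minus_lmc = {
    "task_sequence": "s_minus",
    "deviation_threshold": 4,
    "depth": 4,
    "epochs": 100,
    "hidden_size": 64,
    "lr": 0.001,
    "module_init": "mean",
    "temp": 0.1,
    "wdecay": 0.001,
    "task_agnostic_test": 1,
}

print(build_command("main_transfer.py", s_minus_lmc))
```

Keeping one dictionary per stream makes it easier to see which hyperparameters change between streams, for example `deviation_threshold` and `module_init`.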

Stream S+

LMC

python main_transfer.py --activate_after_str_oh=0 --activation_structural=sigmoid --deviation_threshold=1.5 --early_stop_complete=0 --pr_name lmc_cr --epochs=100 --epochs_str_only_after_addition=1 --hidden_size=64 --init_oh=none --init_runingstats_on_addition=1 --keep_bn_in_eval_after_freeze=1 --lr=0.001 --module_init=most_likely --momentum_bn=0.1 --momentum_bn_decoder=0.1 --multihead=gated_linear --normalize_oh=1 --optmize_structure_only_free_modules=1 --projection_layer_oh=0 --projection_phase_length=5 --reg_factor=10 --running_stats_steps=100 --str_prior_factor=1 --str_prior_temp=0.1 --structure_inv=ae --structure_inv_oh=linear_no_act --task_agnostic_test=1 --task_sequence=s_plus --temp=1 --wdecay=0.001

(test acc. 0.6244, 22 modules)

MNTDP (task aware)

python main_transfer_mntdp.py --momentum_bn 0.1 --pr_name lmc_cr --copy_batchstats 1 --track_running_stats_bn 1 --task_sequence s_plus --gating MNTDP --shuffle_test 0 --epochs 100 --lr 1e-3 --wdecay 1e-3 --regenerate_seed 0

(test acc. 0.609, 18 modules)

Stream Sin

LMC

python main_transfer.py --activate_after_str_oh=0 --momentum_bn 0.1 --track_running_stats_bn 1 --pr_name lmc_cr --shuffle_test 0 --init_oh=none --task_sequence s_in --momentum_bn_decoder=0.1 --activation_structural=sigmoid --deviation_threshold=4 --depth=4 --epochs=100 --fix_layers_below_on_addition=0 --hidden_size=64 --lr=0.001 --mask_str_loss=1 --module_init=most_likely --multihead=gated_linear --normalize_oh=1 --optmize_structure_only_free_modules=1 --projection_layer_oh=0 --projection_phase_length=20 --reg_factor=10  --running_stats_steps=100 --str_prior_factor=1 --str_prior_temp=0.1 --structure_inv=ae --structure_inv_oh=linear_no_act --task_agnostic_test=1 --temp=0.1 --wdecay=0.001

(test acc. 0.7081, 21 modules)

MNTDP (task aware)

python main_transfer_mntdp.py --momentum_bn 0.1 --pr_name lmc_cr --copy_batchstats 1 --track_running_stats_bn 1 --task_sequence s_in --gating MNTDP --shuffle_test 0 --epochs 100 --lr 1e-3 --wdecay 1e-3 --regenerate_seed 0

(test acc. 0.6646, 15 modules)

Stream Sout

LMC

python main_transfer.py --activate_after_str_oh=0 --momentum_bn 0.1 --track_running_stats_bn 1 --pr_name lmc_cr --shuffle_test 0 --init_oh=none --task_sequence s_out --momentum_bn_decoder=0.1 --activation_structural=sigmoid --deviation_threshold=4 --depth=4 --epochs=100 --fix_layers_below_on_addition=0 --hidden_size=64 --lr=0.001 --mask_str_loss=1 --module_init=mean --multihead=gated_linear --normalize_oh=1 --optmize_structure_only_free_modules=1 --projection_layer_oh=0 --projection_phase_length=20 --reg_factor=10  --running_stats_steps=100 --str_prior_factor=1 --str_prior_temp=0.1 --structure_inv=ae --structure_inv_oh=linear_no_act --task_agnostic_test=1 --temp=0.1 --wdecay=0.001

(test acc. 0.5849, 15 modules)

MNTDP (task aware)

python main_transfer_mntdp.py --momentum_bn 0.1 --pr_name lmc_cr --copy_batchstats 1 --track_running_stats_bn 1 --task_sequence s_out --gating MNTDP --shuffle_test 0 --epochs 100 --lr 1e-3 --wdecay 0 --regenerate_seed 0

(test acc. 0.6567, 11 modules)

Stream Spl

LMC

python main_transfer.py --activate_after_str_oh=0 --activation_structural=sigmoid --pr_name lmc_cr --deviation_threshold=1.5 --early_stop_complete=0 --epochs=100 --hidden_size=64 --init_oh=none --init_runingstats_on_addition=0 --keep_bn_in_eval_after_freeze=1 --lr=0.001 --module_init=most_likely --momentum_bn=0.1 --momentum_bn_decoder=0.1 --multihead=gated_linear --normalize_oh=1 --optmize_structure_only_free_modules=1 --projection_layer_oh=0 --projection_phase_length=10 --reg_factor=10 --running_stats_steps=100 --str_prior_factor=1 --str_prior_temp=0.1 --structure_inv=ae --structure_inv_oh=linear_no_act --task_agnostic_test=1 --task_sequence=s_pl --temp=1 --regenerate_seed 0 --wdecay=0.001

(test acc. 0.6241, 19 modules)

MNTDP (task aware)

python main_transfer_mntdp.py --momentum_bn 0.1 --pr_name lmc_cr --copy_batchstats 1 --track_running_stats_bn 1 --task_sequence s_pl --gating MNTDP --shuffle_test 0 --epochs 100 --lr 1e-3 --wdecay 1e-4 --regenerate_seed 0

(test acc. 0.6391, 18 modules)
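
For a side-by-side view, the test accuracies and module counts reported above for the five streams can be collected in one table. The snippet only restates the numbers from this README; the dictionary layout and the `accuracy_gap` helper are illustrative. Keep in mind that the MNTDP entries are labeled task aware, while the LMC commands for these streams set `--task_agnostic_test=1`.

```python
# Test accuracy and module count as reported above: (accuracy, modules).
REPORTED = {
    "s_minus": {"LMC": (0.6863, 15), "MNTDP": (0.667, 12)},
    "s_plus": {"LMC": (0.6244, 22), "MNTDP": (0.609, 18)},
    "s_in": {"LMC": (0.7081, 21), "MNTDP": (0.6646, 15)},
    "s_out": {"LMC": (0.5849, 15), "MNTDP": (0.6567, 11)},
    "s_pl": {"LMC": (0.6241, 19), "MNTDP": (0.6391, 18)},
}


def accuracy_gap(stream):
    """LMC accuracy minus MNTDP accuracy for one stream."""
    lmc_acc, _ = REPORTED[stream]["LMC"]
    mntdp_acc, _ = REPORTED[stream]["MNTDP"]
    return round(lmc_acc - mntdp_acc, 4)


print(f"{'stream':<8} {'LMC acc':>8} {'LMC mod':>8} {'MNTDP acc':>10} {'MNTDP mod':>10} {'gap':>8}")
for stream, row in REPORTED.items():
    (lmc_acc, lmc_mod), (mntdp_acc, mntdp_mod) = row["LMC"], row["MNTDP"]
    print(
        f"{stream:<8} {lmc_acc:>8.4f} {lmc_mod:>8d} "
        f"{mntdp_acc:>10.4f} {mntdp_mod:>10d} {accuracy_gap(stream):>+8.4f}"
    )
```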


Stream Slong30 -- 30 tasks

LMC (task aware)

python main_transfer.py --activate_after_str_oh=0 --activation_structural=sigmoid --deviation_threshold=1.5 --epochs=50 --hidden_size=64 --init_oh=none --keep_bn_in_eval_after_freeze=1 --lr=0.001 --module_init=most_likely --momentum_bn_decoder=0.1 --multihead=gated_linear --n_tasks=100 --normalize_oh=1 --optmize_structure_only_free_modules=1 --projection_layer_oh=0 --projection_phase_length=5 --reg_factor=1 --running_stats_steps=50 --seed=180 --str_prior_factor=1 --str_prior_temp=0.01 --structure_inv=ae --structure_inv_oh=linear_no_act --task_agnostic_test=0 --task_sequence=s_long30 --temp=1 --wdecay=0.001

(test acc. 62.44, 50 modules)

MNTDP (task aware)

python main_transfer_mntdp.py --epochs=50 --hidden_size=64 --lr=0.001 --module_init=most_likely --multihead=gated_linear --n_tasks=100 --seed=180 --task_sequence=s_long30 --wdecay=0.001

(test acc. 64.58, 64 modules)


Stream Slong -- 100 tasks

LMC (task aware)

python main_transfer.py --activate_after_str_oh=0 --activation_structural=sigmoid --deviation_threshold=4 --epochs=100 --hidden_size=64 --init_oh=none --keep_bn_in_eval_after_freeze=1 --lr=0.001 --module_init=most_likely --momentum_bn_decoder=0.1 --multihead=gated_linear --n_tasks=100 --normalize_oh=1 --optmize_structure_only_free_modules=1 --projection_layer_oh=0 --projection_phase_length=5 --reg_factor=1 --running_stats_steps=50 --seed=180 --str_prior_factor=1 --str_prior_temp=0.01 --structure_inv=ae --structure_inv_oh=linear_no_act --task_agnostic_test=0 --task_sequence=s_long --temp=1 --pr_name s_long_cr --wdecay=0

(test acc. 63.88, 32 modules)

MNTDP (task aware)

python main_transfer_mntdp.py --momentum_bn 0.1 --n_tasks 100 --hidden_size 64 --searchspace topdown --keep_bn_in_eval_after_freeze 1 --pr_name s_long_cr --copy_batchstats 1 --track_running_stats_bn 1 --wand_notes correct_MNTDP --task_sequence s_long --gating MNTDP --shuffle_test 0 --epochs 50 --lr 1e-3 --wdecay 1e-3

(test acc. 68.92, 142 modules)


OOD generalization experiments

LMC

python main_transfer.py --regenerate_seed 0 --deviation_threshold=8 --epochs=50 --pr_name lmc_cr --hidden_size=64 --keep_bn_in_eval_after_freeze=0 --lr=0.001 --module_init=none --momentum_bn_decoder=0.1 --normalize_data=1 --optmize_structure_only_free_modules=0 --projection_phase_length=10 --no_projection_phase 0 --reg_factor=10 --running_stats_steps=1000 --str_prior_factor=1 --str_prior_temp=0.1 --structure_inv=linear_no_act --task_sequence=s_ood --temp=1 --wdecay=0 --task_agnostic_test=0

EWC

python main_transfer.py --epochs=50 --ewc=1000 --hidden_size=256 --keep_bn_in_eval_after_freeze=0 --lr=0.001 --module_init=none --pr_name lmc_cr --multihead=usual --normalize_data=1  --task_sequence=s_ood --use_structural=0 --wdecay=0 --projection_phase_length=0

MNTDP

python main_transfer_mntdp.py --epochs=50 --regenerate_seed 0 --hidden_size=64 --keep_bn_in_eval_after_freeze=0 --pr_name lmc_cr --lr=0.01 --module_init=none --multihead=usual --normalize_data=1 --task_sequence=s_ood --use_structural=0 --wdecay=0

LMC (no projection)

python main_transfer.py --regenerate_seed 0 --deviation_threshold=8 --epochs=50 --pr_name lmc_cr --hidden_size=64 --keep_bn_in_eval_after_freeze=0 --lr=0.001 --module_init=none --momentum_bn_decoder=0.1 --normalize_data=1 --optmize_structure_only_free_modules=0 --projection_phase_length=0 --no_projection_phase 1 --reg_factor=10 --running_stats_steps=1000 --str_prior_factor=1 --str_prior_temp=0.1 --structure_inv=linear_no_act --task_sequence=s_ood --temp=1 --wdecay=0
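
The two LMC commands in this section differ in only a few flags, which is hard to see by eye. The sketch below parses two command strings and reports the flags whose values differ. It is a hypothetical helper, not part of the repository, and it handles both the `--key=value` and `--key value` forms used in this README. The example strings are shortened versions of the two commands above.

```python
import shlex


def parse_flags(command):
    """Parse `--key=value` and `--key value` flags from a command string."""
    tokens = shlex.split(command)
    flags = {}
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok.startswith("--"):
            if "=" in tok:
                key, value = tok[2:].split("=", 1)
            elif i + 1 < len(tokens) and not tokens[i + 1].startswith("--"):
                key, value = tok[2:], tokens[i + 1]
                i += 1  # the next token was consumed as the value
            else:
                key, value = tok[2:], None  # flag given without a value
            flags[key] = value
        i += 1
    return flags


def diff_flags(cmd_a, cmd_b):
    """Return {flag: (value_in_a, value_in_b)} for flags that differ."""
    a, b = parse_flags(cmd_a), parse_flags(cmd_b)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# Shortened versions of the two LMC commands above.
with_projection = (
    "python main_transfer.py --projection_phase_length=10 "
    "--no_projection_phase 0 --reg_factor=10 --task_agnostic_test=0"
)
without_projection = (
    "python main_transfer.py --projection_phase_length=0 "
    "--no_projection_phase 1 --reg_factor=10"
)

for flag, (a, b) in diff_flags(with_projection, without_projection).items():
    print(f"{flag}: {a} -> {b}")
```

Applied to the full commands, the differences are `--projection_phase_length` (10 versus 0), `--no_projection_phase` (0 versus 1), and `--task_agnostic_test=0`, which appears only in the first command.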

Plug and play (combining independently trained modular learners)

python main_plug_and_play.py --activate_after_str_oh=0 --activation_structural=sigmoid --deviation_threshold=1.5 --early_stop_complete=0 --epochs=100 --epochs_str_only_after_addition=1 --pr_name lmc_cr --hidden_size=64 --init_oh=none --init_runingstats_on_addition=1 --keep_bn_in_eval_after_freeze=1 --lr=0.001 --module_init=mean --momentum_bn=0.1 --momentum_bn_decoder=0.1 --multihead=gated_linear --n_tasks=3 --normalize_oh=1 --optmize_structure_only_free_modules=1 --projection_layer_oh=0 --projection_phase_length=5 --reg_factor=10 --running_stats_steps=10 --str_prior_factor=1 --str_prior_temp=0.1 --structure_inv=ae --structure_inv_oh=linear_no_act --task_agnostic_test=1 --task_sequence=s_pnp_comp --temp=1 --wdecay=0.001

A list of hyperparameters used for other baselines can be found in the baselines.txt file.


Owner
Oleksiy Ostapenko