This repo will contain code to reproduce and build upon understanding transfer learning

Overview

What is being transferred in transfer learning?

This repo contains the code for the following paper:

Behnam Neyshabur*, Hanie Sedghi*, Chiyuan Zhang*. What is being transferred in transfer learning?. *equal contribution. Advances in Neural Information Processing Systems (NeurIPS), 2020.

Disclaimer: this is not an officially supported Google product.

Setup

Library dependencies

This code has the following dependencies

  • pytorch (1.4.0 is tested)
  • gin-config
  • tqdm
  • wget (the python package)

GPUs are needed to run most of the experiments.

Data

CheXpert data (the train and valid folders) needs to be placed in /mnt/data/CheXpert-v1.0-img224. If your data is in a different place, you can specify the data.image_path parameter (see configs/p100_chexpert.py). We pre-resized all the CheXpert images to reduce the burden of data pre-processing using the following script:

'" ../$NEWDIR/{} cd .. ">
#!/bin/bash

NEWDIR=CheXpert-v1.0-img224
mkdir -p $NEWDIR/{train,valid}

cd CheXpert-v1.0

echo "Prepare directory structure..."
find . -type d | parallel mkdir -p ../$NEWDIR/{}

echo "Resize all images to have at least 224 pixels on each side..."
find . -name "*.jpg" | parallel convert {} -resize "'224^>'" ../$NEWDIR/{}

cd ..

The DomainNet data will be automatically downloaded from the Internet upon first run. By default, it will download to /mnt/data, which can be changed with the data_dir config (see configs/p100_domain_net.py).

Common Experiments

Training jobs

CheXpert training from random init. We use 2 Nvidia V100 GPUs for CheXpert training. If you run into out-of-memory error, you can try to reduce the batch size.

CUDA_VISIBLE_DEVICES=0,1 python chexpert_train.py -k train/chexpert/fixup_resnet50_nzfc/randinit-lr0.1-bs256

CheXpert finetuning from ImageNet pre-trained checkpoint. The code tries to load the ImageNet pre-trained chexpoint from /mnt/data/logs/imagenet-lr01/ckpt-E090.pth.tar. Or you can customize the path to checkpoint (see configs/p100_chexpert.py).

CUDA_VISIBLE_DEVICES=0,1 python chexpert_train.py -k train/chexpert/fixup_resnet50_nzfc/finetune-lr0.02-bs256

Similarly, DomainNet training can be executed using the script imagenet_train.py (replace real with clipart and quickdraw to run on different domains).

# randinit
CUDA_VISIBLE_DEVICES=0 python imagenet_train.py -k train/DomainNet_real/fixup_resnet50_nzfc/randinit-lr0.1-MstepLR

# finetune
CUDA_VISIBLE_DEVICES=0 python imagenet_train.py -k train/DomainNet_real/fixup_resnet50_nzfc/finetune-lr0.02-MstepLR

Training with shuffled blocks

The training jobs with block-shuffled images are defined in configs/p200_pix_shuffle.py. Run

python -m configs pix_shuffle

To see the keys of all the training jobs with pixel shuffling. Similarly,

python -m configs blk7_shuffle

list all the jobs with 7x7 block-shuffled images. You can run any of those jobs using the -k command line argument. For example:

CUDA_VISIBLE_DEVICES=0 python imagenet_train.py \
    -k blk7_shuffle/DomainNet_quickdraw/fixup_resnet50_nzfc_noaug/randinit-lr0.1-MstepLR/seed0

Finetuning from different pre-training checkpoints

The config file configs/p200_finetune_ckpt.py defines training jobs that finetune from different ImageNet pre-training checkpoints along the pre-training optimization trajectory.

Linear interpolation between checkpoints (performance barrier)

The script ckpt_interpolation.py performs the experiment of linearly interpolating between different solutions. The file is self-contained. You can edit the file directly to specify which combinations of checkpoints are to be used. The command line argument -a compute and -a plot can be used to switch between doing the computation and making the plots based on computed results.

General Documentation

This codebase uses gin-config to customize the behavior of the program, and allows us to easily generate a large number of similar configurations with Python loops. This is especially useful for hyper-parameter sweeps.

Running a job

A script mainly takes a config key in the commandline, and it will pull the detailed configurations according to this key from the pre-defined configs. For example:

python3 imagenet_train.py -k train/cifar10/fixup_resnet50/finetune-lr0.02-MstepLR

Query pre-defined configs

You can list all the pre-defined config keys matching a given regex with the following command:

python3 -m configs 

For example:

$ python3 -m configs cifar10
2 configs found ====== with regex: cifar10
    0) train/cifar10/fixup_resnet50/randinit-lr0.1-MstepLR
    1) train/cifar10/fixup_resnet50/finetune-lr0.02-MstepLR

Defining new configs

All the configs are in the directory configs, with the naming convention pXXX_YYY.py. Here XXX are digits, which allows ordering between configs (so when defining configs we can reference and extend previously defined configs).

To add a new config file:

  1. create pXXX_YYY.py file.
  2. edit __init__.py to import this file.
  3. in the newly added file, define functions to registery new configs. All the functions with the name register_blah will be automatically called.

Customing new functions

To customize the behavior of a new function, make that function gin configurable by

@gin.configurable('config_name')
def my_func(arg1=gin.REQUIRED, arg2=0):
  # blah

Then in the pre-defined config files, you can specify the values by

spec['gin']['config_name.arg1'] = # whatever python objects
spec['gin']['config_name.arg2'] = 2

See gin-config for more details.

Official implementation of the NRNS paper: No RL, No Simulation: Learning to Navigate without Navigating

No RL No Simulation (NRNS) Official implementation of the NRNS paper: No RL, No Simulation: Learning to Navigate without Navigating NRNS is a heriarch

Meera Hahn 20 Nov 29, 2022
Code and datasets for TPAMI 2021

SkeletonNet This repository constains the codes and ShapeNetV1-Surface-Skeleton,ShapNetV1-SkeletalVolume and 2d image datasets ShapeNetRendering. Plea

34 Aug 15, 2022
Task-related Saliency Network For Few-shot learning

Task-related Saliency Network For Few-shot learning This is an official implementation in Tensorflow of TRSN. Abstract An essential cue of human wisdo

1 Nov 18, 2021
Explainable Zero-Shot Topic Extraction

Zero-Shot Topic Extraction with Common-Sense Knowledge Graph This repository contains the code for reproducing the results reported in the paper "Expl

D2K Lab 56 Dec 14, 2022
Fusion-DHL: WiFi, IMU, and Floorplan Fusion for Dense History of Locations in Indoor Environments

Fusion-DHL: WiFi, IMU, and Floorplan Fusion for Dense History of Locations in Indoor Environments Paper: arXiv (ICRA 2021) Video : https://youtu.be/CC

Sachini Herath 68 Jan 03, 2023
Automatic Differentiation Multipole Moment Molecular Forcefield

Automatic Differentiation Multipole Moment Molecular Forcefield Performance notes On a single gpu, using waterbox_31ang.pdb example from MPIDplugin wh

4 Jan 07, 2022
Pytorch implementation of VAEs for heterogeneous likelihoods.

Heterogeneous VAEs Beware: This repository is under construction 🛠️ Pytorch implementation of different VAE models to model heterogeneous data. Here,

Adrián Javaloy 35 Nov 29, 2022
PyTorch implementation for our paper "Deep Facial Synthesis: A New Challenge"

FSGAN Here is the official PyTorch implementation for our paper "Deep Facial Synthesis: A New Challenge". This project achieve the translation between

Deng-Ping Fan 32 Oct 10, 2022
Apache Flink

Apache Flink Apache Flink is an open source stream processing framework with powerful stream- and batch-processing capabilities. Learn more about Flin

The Apache Software Foundation 20.4k Dec 30, 2022
Fairness Metrics: All you need to know

Fairness Metrics: All you need to know Testing machine learning software for ethical bias has become a pressing current concern. Recent research has p

Anonymous2020 1 Jan 17, 2022
PyTorch implementation of Decoupling Value and Policy for Generalization in Reinforcement Learning

PyTorch implementation of Decoupling Value and Policy for Generalization in Reinforcement Learning

48 Dec 08, 2022
Implementation of the paper "Fine-Tuning Transformers: Vocabulary Transfer"

Transformer-vocabulary-transfer Implementation of the paper "Fine-Tuning Transfo

LEYA 13 Nov 30, 2022
A small fun project using python OpenCV, mediapipe, and pydirectinput

Here I tried a small fun project using python OpenCV, mediapipe, and pydirectinput. Here we can control moves car game when yellow color come to right box (press key 'd') left box (press key 'a') lef

Sameh Elisha 3 Nov 17, 2022
Dense Prediction Transformers

Vision Transformers for Dense Prediction This repository contains code and models for our paper: Vision Transformers for Dense Prediction René Ranftl,

Intelligent Systems Lab Org 1.3k Jan 02, 2023
[ICML 2022] The official implementation of Graph Stochastic Attention (GSAT).

Graph Stochastic Attention (GSAT) The official implementation of GSAT for our paper: Interpretable and Generalizable Graph Learning via Stochastic Att

85 Nov 27, 2022
Semantic-aware Grad-GAN for Virtual-to-Real Urban Scene Adaption

SG-GAN TensorFlow implementation of SG-GAN. Prerequisites TensorFlow (implemented in v1.3) numpy scipy pillow Getting Started Train Prepare dataset. W

lplcor 61 Jun 07, 2022
Machine learning Bot detection technique, based on United States election dataset

Machine learning Bot detection technique, based on United States election dataset (2020). Current github repo provides implementation described in pap

Alexander Shevtsov 4 Nov 20, 2022
Deep Learning: Architectures & Methods Project: Deep Learning for Audio Super-Resolution

Deep Learning: Architectures & Methods Project: Deep Learning for Audio Super-Resolution Figure: Example visualization of the method and baseline as a

Oliver Hahn 16 Dec 23, 2022
Deep Learning and Logical Reasoning from Data and Knowledge

Logic Tensor Networks (LTN) Logic Tensor Network (LTN) is a neurosymbolic framework that supports querying, learning and reasoning with both rich data

171 Dec 29, 2022
商品推荐系统

商品top50推荐系统 问题建模 本项目的数据集给出了15万左右的用户以及12万左右的商品, 以及对应的经过脱敏处理的用户特征和经过预处理的商品特征,旨在为用户推荐50个其可能购买的商品。 推荐系统架构方案 本项目采用传统的召回+排序的方案。

107 Dec 29, 2022