Official implementation of our paper "Learning to Bootstrap for Combating Label Noise"

Related tags

Deep LearningL2B
Overview

Learning to Bootstrap for Combating Label Noise

This repo is the official implementation of our paper "Learning to Bootstrap for Combating Label Noise".

Citation

If you use this code for your research, please cite our paper "Learning to Bootstrap for Combating Label Noise".

@misc{zhou2022learning,
      title={Learning to Bootstrap for Combating Label Noise}, 
      author={Yuyin Zhou and Xianhang Li and Fengze Liu and Xuxi Chen and Lequan Yu and Cihang Xie and Matthew P. Lungren and Lei Xing},
      year={2022},
      eprint={2202.04291},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Requirements

Python >= 3.6.4
Pytorch >= 1.6.0
Higher = 0.2.1
Tensorboardx = 2.4.1

Training

First, please create a folder to store checkpoints by using the following command.

mkdir checkpoint

CIFAR-10

To reproduce the results on CIFAR dataset from our paper, please follow the command and our hyper-parameters.

First, you can adjust the corruption_prob and corruption_type to obtain different noise rates and noise type.

Second, the reweight_label indicates you are using the our L2B method. You can change it to baseline or mixup.

python  main.py  --arch res18 --dataset cifar10 --num_classes 10 --exp L2B --train_batch_size  512 \
 --corruption_prob 0.2 --reweight_label  --lr 0.15  -clipping_norm 0.25  --num_epochs 300  --scheduler cos \
 --corruption_type unif  --warm_up 10  --seed 0  

CIFAR-100

Most of settings are the same as CIFAR-10. To reproduce the results, please follow the command.

python  main.py  --arch res18 --dataset cifar100 --num_classes 100 --exp L2B --train_batch_size  256  \
--corruption_prob 0.2 --reweight_label  --lr 0.15  --clipping_norm 0.80  --num_epochs 300  --scheduler cos \
--corruption_type unif  --warm_up 10  --seed 0 \ 

ISIC2019

On the ISIC dataset, first you should download the dataset by following command.

Download ISIC dataset as follows:
wget https://isic-challenge-data.s3.amazonaws.com/2019/ISIC_2019_Training_Input.zip
wget https://isic-challenge-data.s3.amazonaws.com/2019/ISIC_2019_Training_GroundTruth.csv \

Then you can reproduce the results by following the command.

python main.py  --arch res50  --dataset ISIC --data_path isic_data/ISIC_2019_Training_Input --num_classes 8 
--exp L2B  --train_batch_size 64  --corruption_prob 0.2 --lr 0.01 --clipping_norm 0.80 --num_epochs 30 
--temperature 10.0  --wd 5e-4  --scheduler cos --reweight_label --norm_type softmax --warm_up 1 

Clothing-1M

First, the num_batch and train_batch_size indicates how many training images you want to use (we sample a balanced training data for each epoch).

Second, you can adjust the num_meta to sample different numbers of validation images to form the metaset. We use the whole validation set as metaset by default.

The data_path is where you store the data and key-label lists. And also change the data_path in the line 20 of main.py. If you have issue for downloading the dataset, please feel free to contact us.

Then you can reproduce the results by following the command.

python main.py --arch res18_224 --num_batch 250 --dataset clothing1m \
--exp L2B_clothing1m_one_stage_multi_runs  --train_batch_size 256  --lr 0.005  \
--num_epochs 300  --reweight_label  --wd 5e-4 --scheduler cos   --warm_up 0 \
--data_path /data1/data/clothing1m/clothing1M  --norm_type org  --num_classes 14 \ 
--multi_runs 3 --num_meta 14313

Contact

Yuyin Zhou

Xianhang Li

If you have any question about the code and data, please contact us directly.

Pop-Out Motion: 3D-Aware Image Deformation via Learning the Shape Laplacian (CVPR 2022)

Pop-Out Motion Pop-Out Motion: 3D-Aware Image Deformation via Learning the Shape Laplacian (CVPR 2022) Jihyun Lee*, Minhyuk Sung*, Hyunjin Kim, Tae-Ky

Jihyun Lee 88 Nov 22, 2022
FluxTraining.jl gives you an endlessly extensible training loop for deep learning

A flexible neural net training library inspired by fast.ai

86 Dec 31, 2022
Repositorio oficial del curso IIC2233 Programación Avanzada 🚀✨

IIC2233 - Programación Avanzada Evaluación Las evaluaciones serán efectuadas por medio de actividades prácticas en clases y tareas. Se calculará la no

IIC2233 @ UC 47 Sep 06, 2022
This is the repo for our work "Towards Persona-Based Empathetic Conversational Models" (EMNLP 2020)

Towards Persona-Based Empathetic Conversational Models (PEC) This is the repo for our work "Towards Persona-Based Empathetic Conversational Models" (E

Zhong Peixiang 35 Nov 17, 2022
A modern pure-Python library for reading PDF files

pdf A modern pure-Python library for reading PDF files. The goal is to have a modern interface to handle PDF files which is consistent with itself and

6 Apr 06, 2022
DCGAN-tensorflow - A tensorflow implementation of Deep Convolutional Generative Adversarial Networks

DCGAN in Tensorflow Tensorflow implementation of Deep Convolutional Generative Adversarial Networks which is a stabilize Generative Adversarial Networ

Taehoon Kim 7.1k Dec 29, 2022
.NET bindings for the Pytorch engine

TorchSharp TorchSharp is a .NET library that provides access to the library that powers PyTorch. It is a work in progress, but already provides a .NET

Matteo Interlandi 17 Aug 30, 2021
functorch is a prototype of JAX-like composable function transforms for PyTorch.

functorch is a prototype of JAX-like composable function transforms for PyTorch.

Facebook Research 1.2k Jan 09, 2023
Classifying audio using Wavelet transform and deep learning

Audio Classification using Wavelet Transform and Deep Learning A step-by-step tutorial to classify audio signals using continuous wavelet transform (C

Aditya Dutt 17 Nov 29, 2022
Open-World Entity Segmentation

Open-World Entity Segmentation Project Website Lu Qi*, Jason Kuen*, Yi Wang, Jiuxiang Gu, Hengshuang Zhao, Zhe Lin, Philip Torr, Jiaya Jia This projec

DV Lab 410 Jan 03, 2023
PyKaldi GOP-DNN on Epa-DB

PyKaldi GOP-DNN on Epa-DB This repository has the tools to run a PyKaldi GOP-DNN algorithm on Epa-DB, a database of non-native English speech by Spani

18 Dec 14, 2022
Release of the ConditionalQA dataset

ConditionalQA Datasets accompanying the paper ConditionalQA: A Complex Reading Comprehension Dataset with Conditional Answers. Disclaimer This dataset

14 Oct 17, 2022
Pyeventbus: a publish/subscribe event bus

pyeventbus pyeventbus is a publish/subscribe event bus for Python 2.7. simplifies the communication between python classes decouples event senders and

15 Apr 21, 2022
Generate text captions for images from their CLIP embeddings. Includes PyTorch model code and example training script.

clip-text-decoder Generate text captions for images from their CLIP embeddings. Includes PyTorch model code and example training script. Example Predi

Frank Odom 36 Dec 21, 2022
PanopticBEV - Bird's-Eye-View Panoptic Segmentation Using Monocular Frontal View Images

Bird's-Eye-View Panoptic Segmentation Using Monocular Frontal View Images This r

63 Dec 16, 2022
This is a collection of our NAS and Vision Transformer work.

AutoML - Neural Architecture Search This is a collection of our AutoML-NAS work iRPE (NEW): Rethinking and Improving Relative Position Encoding for Vi

Microsoft 828 Dec 28, 2022
PyDEns is a framework for solving Ordinary and Partial Differential Equations (ODEs & PDEs) using neural networks

PyDEns PyDEns is a framework for solving Ordinary and Partial Differential Equations (ODEs & PDEs) using neural networks. With PyDEns one can solve PD

Data Analysis Center 220 Dec 26, 2022
Statistical and Algorithmic Investing Strategies for Everyone

Eiten - Algorithmic Investing Strategies for Everyone Eiten is an open source toolkit by Tradytics that implements various statistical and algorithmic

Tradytics 2.5k Jan 02, 2023
official code for dynamic convolution decomposition

Revisiting Dynamic Convolution via Matrix Decomposition (ICLR 2021) A pytorch implementation of DCD. If you use this code in your research please cons

Yunsheng Li 110 Nov 23, 2022
Embeddinghub is a database built for machine learning embeddings.

Embeddinghub is a database built for machine learning embeddings.

Featureform 1.2k Jan 01, 2023