Code for "The Intrinsic Dimension of Images and Its Impact on Learning" - ICLR 2021 Spotlight

Last update: Dec 10, 2022

Related tags

Overview

`dimensions`

Estimating the instrinsic dimensionality of image datasets

Code for: The Intrinsic Dimensionaity of Images and Its Impact On Learning - Phillip Pope and Chen Zhu, Ahmed Abdelkader, Micah Goldblum, Tom Goldstein (ICLR 2021, spotlight)

https://openreview.net/forum?id=XJk19XzGq2J

Environment

This code was developed in the following environment

conda create dimensions python=3.6 jupyter matplotlib scikit-learn pytorch==1.5.0 torchvision cudatoolkit=10.2 -c pytorch

To generate new data of controlled dimensionality with GANs, you must install:

pip install pytorch-pretrained-biggan

To use the shortest-path method (Granata and Carnevale 2016) you must also compile the fast graph shortest path code gsp (written by Jake VdP + Sci-Kit Learn)

cd estimators/gsp
python setup.py install

Generate data of controlled dimensionality

python generate_data/gen_images.py \
  --num_samples 1000 \
  --class_name basenji \
  --latent_dim 16 \
  --batch_size 100 \
  --save_dir samples/basenji_16

Estimate dimension of generated samples

To run the MLE (Levina and Bickel) estimator on the synthetic GAN data generated above:

python main.py \
    --estimator mle \
    --k1 25 \
    --single-k \
    --eval-every-k \
    --average-inverse \
    --dset  samples/basenji_16 \
    --max_num_samples 1000 \
    --save-path results/basenji_16.json

Use --estimators to try different estimators

Citation

If you find our paper or code useful, please cite our paper:

@inproceedings{DBLP:conf/iclr/PopeZAGG21,
  author    = {Phillip Pope and
               Chen Zhu and
               Ahmed Abdelkader and
               Micah Goldblum and
               Tom Goldstein},
  title     = {The Intrinsic Dimension of Images and Its Impact on Learning},
  booktitle = {9th International Conference on Learning Representations, {ICLR} 2021,
               Virtual Event, Austria, May 3-7, 2021},
  publisher = {OpenReview.net},
  year      = {2021},
  url       = {https://openreview.net/forum?id=XJk19XzGq2J},
  timestamp = {Wed, 23 Jun 2021 17:36:39 +0200},
  biburl    = {https://dblp.org/rec/conf/iclr/PopeZAGG21.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

Acknowledgements

We gratefully acknowledge use of the following codebases when developing our dimensionality estimators:

We also thank Prof. Vishnu Boddeti for clarifying comments on the graph-distance estimator.

Disclaimer

This code released as is. We will do our best to address questions/bugs, but cannot guarantee support.

Code for "The Intrinsic Dimension of Images and Its Impact on Learning" - ICLR 2021 Spotlight

Related tags

Overview

`dimensions`

Environment

Generate data of controlled dimensionality

Estimate dimension of generated samples

Citation

Acknowledgements

Disclaimer

Owner

Phil Pope

Bounding Wasserstein distance with couplings

Face Library is an open source package for accurate and real-time face detection and recognition

[NeurIPS2021] Code Release of K-Net: Towards Unified Image Segmentation

A series of Python scripts to access measurements from Fluke 28X meters. Fluke IR Remote Interface required.

UniFormer - official implementation of UniFormer

SpeechNAS Better Trade off between Latency and Accuracy for Large Scale Speaker Verification

Auxiliary Raw Net (ARawNet) is a ASVSpoof detection model taking both raw waveform and handcrafted features as inputs, to balance the trade-off between performance and model complexity.

Investigating automatic navigation towards standard US views integrating MARL with the virtual US environment developed in CT2US simulation

Pytorch-Swin-Unet-V2 - a modified version of Swin Unet based on Swin Transfomer V2

Unofficial implementation of Pix2SEQ

Highly comparative time-series analysis

This library is a location of the LegacyLogger for PyTorch Lightning.

PyTorch implementation of our method for adversarial attacks and defenses in hyperspectral image classification.

This repository contains the code for the paper "Hierarchical Motion Understanding via Motion Programs"

Synthetic structured data generators

This package proposes simplified exporting pytorch models to ONNX and TensorRT, and also gives some base interface for model inference.

Image Fusion Transformer

Simple embedding based text classifier inspired by fastText, implemented in tensorflow

Practical and Real-world applications of ML based on the homework of Hung-yi Lee Machine Learning Course 2021

Image-popularity-score - A novel deep regression method for image scoring.