Improving Generalization Bounds for VC Classes Using the Hypergeometric Tail Inversion

Overview

Improving Generalization Bounds for VC Classes Using the Hypergeometric Tail Inversion

Preface

This directory provides an implementation of the algorithms used to compute the hypergeometric tail pseudo-inverse, as well as the code used to produce all figures of the paper "Improving Generalization Bounds for VC Classes Using the Hypergeometric Tail Inversion" by Leboeuf, LeBlanc and Marchand.

Installation

To run the scripts, one must first install the package and its requirements. To do so, run the following command from the root directory:

pip install .

Doing so will also provide you with the package hypergeo, which implements an algorithm to compute the hypergeometric tail pseudo-inverses.

Requirements

The code was written to run on Python 3.8 or more recent version. The requirements are shown in the file requirements.txt and can be installed using the command:

pip install -r requirements.txt

The code

The code is split into 2 parts: the 'hypergeo' package and the 'scripts' directory.

The hypergeo package implements the utilities regarding the hypergeometric distribution (to compute the tail and its inverse), the binomial distribution (reimplementing the inverse as the scipy version suffered from numerical unstabilities) and some generalization bounds.

The scripts files produce the figures found in the paper using the hypergeo package. All figures are generated directly in LaTeX using the package python2latex. To run a script, navigate from the command line to the directory root directory of the project and run the command

/ .py" ">
python "./scripts/
     
      /
      
       .py"

      
     

The code does not provide command line control on the parameters of each script. However, each script is fairly simple, and parameters can be directly changed in the __main__ part of the script.

Scripts used in the body of the paper

  • Section 3.3: The ghost sample trade-off. In this section, we claim that optimizing m' gives relative gain between 8% and 10%. To obtain these number, you need to run the file mprime_tradeoff/generate_mprime_data.py to first generate the data, and then run mprime_tradeoff/stats.py.

  • Section 5: Numerical comparison. Figure 1a and 1b are obtain by executing the scripts bounds_comparison/bounds_comparison_risk.py and bounds_comparison/bounds_comparison_d.py respectively. Figure 2a and 2b are obtain by executing the scripts bounds_comparison/bounds_comparison_m.py, the first setting the variable risk to 0, the second by setting it equal to 0.1.

Scripts used in the appendices of the paper

  • Appendix B: Overview of the hypergeometric distribution. Figure 3 is generated from hypergeometric_tail/hyp_tail_plot.py. Figure 4 is generated from hypergeometric_tail/hyp_tail_inv_plot.py. Algorithm 1 is implemented in the hypergeo file hypergeo/hypergeometric_distribution.py as the function hypergeometric_tail_inverse. Algorithm 2 is implemented in the hypergeo file hypergeo/hypergeometric_distribution.py as the function berkopec_hypergeometric_tail_inverse.

  • Appendix D: In-depth analysis of the ghost sample trade-off. Figure 5 is generated from mprime_tradeoff/plot_epsilon_comp.py. Figure 6 is generated from mprime_tradeoff/plot_mprime_best.py.

  • Appendix E: The hypergeometric tail inversion relative deviation bound. To generate Figure 7 and 8, you must first run the file relative_deviation_mprime_tradeoff/mprime_tradeoff_relative_deviation.py to generate the data, then run the script relative_deviation_mprime_tradeoff/plot_epsilon_comp.py to produce Figure 7 and relative_deviation_comparison/plot_mprime_best.py to produce Figure 8.

  • Appendix G: The hypergeometric tail lower bound . Figure 9 is generated from lower_bound/lower_bound_comparison_risk.py.

  • Appendix F: Further numerical comparisons. Figure 10 and 12a are generated from bounds_comparison/bounds_comparison_risk.py by changing the parameters of the scripts. Figure 11 and 12b is generated from bounds_comparison/bounds_comparison_m.py by changing the parameters of the scripts. Figure 13a and 13b are generated from bounds_comparison/sample_compression_comparison_risk.py and bounds_comparison/sample_compression_comparison_m.py respectively.

Other

The script pseudo-inverse_benchmarking/pseudo-inverse_benchmarking.py benchmarks the various algorithms used to invert the hypergeometric tail. The 'tests' directory contains unit tests using the package pytest.

Owner
Jean-Samuel Leboeuf
PhD candidate in Computer Sciences (Machine Learning). MSc in Theoretical Physics.
Jean-Samuel Leboeuf
Implicit Graph Neural Networks

Implicit Graph Neural Networks This repository is the official PyTorch implementation of "Implicit Graph Neural Networks". Fangda Gu*, Heng Chang*, We

Heng Chang 48 Nov 29, 2022
Multi-layer convolutional LSTM with Pytorch

Convolution_LSTM_pytorch Thanks for your attention. I haven't got time to maintain this repo for a long time. I recommend this repo which provides an

Zijie Zhuang 733 Dec 30, 2022
Code for the ICCV2021 paper "Personalized Image Semantic Segmentation"

PSS: Personalized Image Semantic Segmentation Paper PSS: Personalized Image Semantic Segmentation Yu Zhang, Chang-Bin Zhang, Peng-Tao Jiang, Ming-Ming

张宇 15 Jul 09, 2022
Few-shot NLP benchmark for unified, rigorous eval

FLEX FLEX is a benchmark and framework for unified, rigorous few-shot NLP evaluation. FLEX enables: First-class NLP support Support for meta-training

AI2 85 Dec 03, 2022
Bayesian Deep Learning and Deep Reinforcement Learning for Object Shape Error Response and Correction of Manufacturing Systems

Bayesian Deep Learning for Manufacturing 2.0 (dlmfg) Object Shape Error Response (OSER) Digital Lifecycle Management - In Process Quality Improvement

Sumit Sinha 30 Oct 31, 2022
Official repository of the paper Learning to Regress 3D Face Shape and Expression from an Image without 3D Supervision

Official repository of the paper Learning to Regress 3D Face Shape and Expression from an Image without 3D Supervision

Soubhik Sanyal 689 Dec 25, 2022
Google Landmark Recogntion and Retrieval 2021 Solutions

Google Landmark Recogntion and Retrieval 2021 Solutions In this repository you can find solution and code for Google Landmark Recognition 2021 and Goo

Vadim Timakin 5 Nov 25, 2022
SASM - simple crossplatform IDE for NASM, MASM, GAS and FASM assembly languages

SASM (SimpleASM) - простая кроссплатформенная среда разработки для языков ассемблера NASM, MASM, GAS, FASM с подсветкой синтаксиса и отладчиком. В SA

Dmitriy Manushin 5.6k Jan 06, 2023
Prototypical Networks for Few shot Learning in PyTorch

Prototypical Networks for Few shot Learning in PyTorch Simple alternative Implementation of Prototypical Networks for Few Shot Learning (paper, code)

Orobix 835 Jan 08, 2023
Collection of common code that's shared among different research projects in FAIR computer vision team.

fvcore fvcore is a light-weight core library that provides the most common and essential functionality shared in various computer vision frameworks de

Meta Research 1.5k Jan 07, 2023
Fang Zhonghao 13 Nov 19, 2022
Serve TensorFlow ML models with TF-Serving and then create a Streamlit UI to use them

TensorFlow Serving + Streamlit! ✨ 🖼️ Serve TensorFlow ML models with TF-Serving and then create a Streamlit UI to use them! This is a pretty simple S

Álvaro Bartolomé 18 Jan 07, 2023
This repo implements several applications of the proposed generalized Bures-Wasserstein (GBW) geometry on symmetric positive definite matrices.

GBW This repo implements several applications of the proposed generalized Bures-Wasserstein (GBW) geometry on symmetric positive definite matrices. Ap

Andi Han 0 Oct 22, 2021
N-gram models- Unsmoothed, Laplace, Deleted Interpolation

N-gram models- Unsmoothed, Laplace, Deleted Interpolation

Ravika Nagpal 1 Jan 04, 2022
Library to enable Bayesian active learning in your research or labeling work.

Bayesian Active Learning (BaaL) BaaL is an active learning library developed at ElementAI. This repository contains techniques and reusable components

ElementAI 687 Dec 25, 2022
This is the official PyTorch implementation of our paper: "Artistic Style Transfer with Internal-external Learning and Contrastive Learning".

Artistic Style Transfer with Internal-external Learning and Contrastive Learning This is the official PyTorch implementation of our paper: "Artistic S

51 Dec 20, 2022
Learn about quantum computing and algorithm on quantum computing

quantum_computing this repo contains everything i learn about quantum computing and algorithm on quantum computing what is aquantum computing quantum

arfy slowy 8 Dec 25, 2022
EasyMocap is an open-source toolbox for markerless human motion capture from RGB videos.

EasyMocap is an open-source toolbox for markerless human motion capture from RGB videos. In this project, we provide the basic code for fitt

ZJU3DV 2.2k Jan 05, 2023
CBKH: The Cornell Biomedical Knowledge Hub

Cornell Biomedical Knowledge Hub (CBKH) CBKG integrates data from 18 publicly available biomedical databases. The current version of CBKG contains a t

44 Dec 21, 2022
Source code of AAAI 2022 paper "Towards End-to-End Image Compression and Analysis with Transformers".

Towards End-to-End Image Compression and Analysis with Transformers Source code of our AAAI 2022 paper "Towards End-to-End Image Compression and Analy

37 Dec 21, 2022