Official implementation for "Symbolic Learning to Optimize: Towards Interpretability and Scalability"

Overview

Symbolic Learning to Optimize

This is the official implementation for ICLR-2022 paper "Symbolic Learning to Optimize: Towards Interpretability and Scalability"

Introduction

Recent studies on Learning to Optimize (L2O) suggest a promising path to automating and accelerating the optimization procedure for complicated tasks. Existing L2O models parameterize optimization rules by neural networks, and learn those numerical rules via meta-training. However, they face two common pitfalls: (1) scalability: the numerical rules represented by neural networks create extra memory overhead for applying L2O models, and limits their applicability to optimizing larger tasks; (2) interpretability: it is unclear what each L2O model has learned in its black-box optimization rule, nor is it straightforward to compare different L2O models in an explainable way. To avoid both pitfalls, this paper proves the concept that we can ``kill two birds by one stone'', by introducing the powerful tool of symbolic regression to L2O. In this paper, we establish a holistic symbolic representation and analysis framework for L2O, which yields a series of insights for learnable optimizers. Leveraging our findings, we further propose a lightweight L2O model that can be meta-trained on large-scale problems and outperformed human-designed and tuned optimizers. Our work is set to supply a brand-new perspective to L2O research.

Our approach:

First train a neural network (LSTM) based optimizer, then leverage the symbolic regression tool to trouble shoot and analyze the neural network based optimizer. The yielded symbolic rule serve as a light weight light-weight surrogate of the original optimizer.

Our main findings:

Example of distilled equations from DM model:

Example of distilled equations from RP model (they are simpler than the DM surrogates, and yet more effective for the optimization task):

Distilled symbolic rules fit the optimizer quite well:

The distilled symbolic rule and underlying rules

Distilled symbolic rules perform same optimization task well, compared with the original numerical optimizer:

The light weight symbolic rules are able to be meta-tuned on large scale (ResNet-50) optimizee and get good performance:

ss large scale optimizee

The symbolic regression passed the sanity checks in the optimization tasks:

Installation Guide

The installation require no special packages. The tensorflow version we adoped is 1.14.0, and the PyTorch version we adopted is 1.7.1.

Training Guide

The three files:

torch-implementation/l2o_train_from_scratch.py

torch-implementation/l2o_symbolic_regression_stage_2_3.py

torch-implementation/l2o_evaluation.py

are pipline scripts, which integrate the multi-stage experiments. The detailed usages are specified within these files. We offer several examples below.

  • In order to train a rnn-prop model from scratch on mnist classification problem setting with 200 epochs, each epoch with length 200, unroll length 20, batch size 128, learning rate 0.001 on GPU-0, run:

    python l2o_train_from_scratch.py -m tras -p mni -n 200 -l 200 -r 20 -b 128 -lr 0.001 -d 0

  • In order to fine-tune an L2O model on the CNN optimizee with 200 epochs, each epoch length 1000, unroll length 20, batch size 64, learning rate 0.001 on GPU-0, first put the .pth model checkpoint file (the training script above will automatically save it in a new folder under current directory) under the first (0-th, in the python index) location in __WELL_TRAINED__ specified in torch-implementation/utils.py , then run the following script:

    python l2o_train_from_scratch.py -m tune -pr 0 -p cnn -n 200 -l 1000 -r 20 -b 64 -lr 0.001 -d 0

  • In order to generate data for symbolic regression, if desire to obtain 50000 samples evaluated on MNIST classification problem, with optimization trajectory length of 300 steps, using GPU-3, then run:

    python l2o_evaluation.py -m srgen -p mni -l 300 -s 50000 -d 3

  • In order to distill equation from the previously saved offline SR dataset, check and run: torch-implementation/sr_train.py

  • In order to fine-tune SR equation, check and run: torch-implementation/stage023_mid2021_update.py

  • In order to convert distilled symbolic equation into latex readable form, check and run: torch-implementation/sr_test_get_latex.py.py

  • In order to calculate how good the symbolic is fitting the original model, we use the R2-scores; to compute it, check and run: torch-implementation/sr_test_cal_r2.py

  • In order to train and run the resnet-class optimizees, check and run: torch-implementation/run_resnet.py

There are also optional tensorflow implementations of L2O, including meta-training the two benchmarks used in this paper: DM and Rnn-prop L2O. However, all steps before generating offline datasets in the pipline is only supportable with torch implementations. To do symbolic regression with tensorflow implementation, you need to manually generate records (an .npy file) of shape [N_sample, num_feature+1], which concatenate the num_feature dimensional x (symbolic regresison input) and 1 dimensional y (output), containing N_sample samples. Once behavior dataset is ready, the following steps can be shared with torch implementation.

  • In order to train the tensorflow implementation of L2O, check and run: tensorflow-implementation/train_rnnprop.py, tensorflow-implementation/train_dm.py

  • In order to evaluate the tensorflow implementation of L2O and generate offline dataset for symbolic regression, check and run: tensorflow-implementation/evaluate_rnnprop.py, tensorflow-implementation/evaluate_dm.py.

Other hints

Meta train the DM/RP/RP_si models

run the train_optimizer() functionin torch-implementation/meta.py

Evaluate the optimization performance:

run theeva_l2o_optimizer() function in torch-implementation/meta.py

RP model implementations:

TheRPOptimizer in torch-implementation/meta.py

RP_si model implementations:

same as RP, set magic=0; or more diverse input can be enabled by setting grad_features="mt+gt+mom5+mom99"

DM model implementations:

DMOptimizer in torch-implementation/utils.py

SR implementations:

torch-implementation/sr_train.py

torch-implementation/sr_test_cal_r2.py

torch-implementation/sr_test_get_latex.py

other SR options and the workflow:

srUtils.py

Citation

comming soon.

Owner
VITA
Visual Informatics Group @ University of Texas at Austin
VITA
Our solution for SSN Invente 2021's Hackathon

Our solution for SSN Invente 2021's Hackathon. To help maitain godowns in a pristine and safe condition using raspberry pi.

1 Jan 12, 2022
Code for paper entitled "Improving Novelty Detection using the Reconstructions of Nearest Neighbours"

NLN: Nearest-Latent-Neighbours A repository containing the implementation of the paper entitled Improving Novelty Detection using the Reconstructions

Michael (Misha) Mesarcik 4 Dec 14, 2022
Rapid experimentation and scaling of deep learning models on molecular and crystal graphs.

LitMatter A template for rapid experimentation and scaling deep learning models on molecular and crystal graphs. How to use Clone this repository and

Nathan Frey 32 Dec 06, 2022
Advances in Neural Information Processing Systems (NeurIPS), 2020.

What is being transferred in transfer learning? This repo contains the code for the following paper: Behnam Neyshabur*, Hanie Sedghi*, Chiyuan Zhang*.

Google Research 36 Aug 26, 2022
PyTorch code to run synthetic experiments.

Code repository for Invariant Risk Minimization Source code for the paper: @article{InvariantRiskMinimization, title={Invariant Risk Minimization}

Facebook Research 345 Dec 12, 2022
OpenMMLab Pose Estimation Toolbox and Benchmark.

Introduction English | 简体中文 MMPose is an open-source toolbox for pose estimation based on PyTorch. It is a part of the OpenMMLab project. The master b

OpenMMLab 2.8k Dec 31, 2022
Show Me the Whole World: Towards Entire Item Space Exploration for Interactive Personalized Recommendations

HierarchicyBandit Introduction This is the implementation of WSDM 2022 paper : Show Me the Whole World: Towards Entire Item Space Exploration for Inte

yu song 5 Sep 09, 2022
A fuzzing framework for SMT solvers

yinyang A fuzzing framework for SMT solvers. Given a set of seed SMT formulas, yinyang generates mutant formulas to stress-test SMT solvers. yinyang c

Project Yin-Yang for SMT Solver Testing 145 Jan 04, 2023
Stochastic Extragradient: General Analysis and Improved Rates

Stochastic Extragradient: General Analysis and Improved Rates This repository is the official implementation of the paper "Stochastic Extragradient: G

Hugo Berard 4 Nov 11, 2022
Capsule endoscopy detection DACON challenge

capsule_endoscopy_detection (DACON Challenge) Overview Yolov5, Yolor, mmdetection기반의 모델을 사용 (총 11개 모델 앙상블) 모든 모델은 학습 시 Pretrained Weight을 yolov5, yolo

MAILAB 11 Nov 25, 2022
Optimal Adaptive Allocation using Deep Reinforcement Learning in a Dose-Response Study

Optimal Adaptive Allocation using Deep Reinforcement Learning in a Dose-Response Study Supplementary Materials for Kentaro Matsuura, Junya Honda, Imad

Kentaro Matsuura 4 Nov 01, 2022
The source code for 'Noisy-Labeled NER with Confidence Estimation' accepted by NAACL 2021

Kun Liu*, Yao Fu*, Chuanqi Tan, Mosha Chen, Ningyu Zhang, Songfang Huang, Sheng Gao. Noisy-Labeled NER with Confidence Estimation. NAACL 2021. [arxiv]

30 Nov 12, 2022
GPU-accelerated Image Processing library using OpenCL

pyclesperanto pyclesperanto is a python package for clEsperanto - a multi-language framework for GPU-accelerated image processing. clEsperanto uses Op

17 Dec 25, 2022
SANet: A Slice-Aware Network for Pulmonary Nodule Detection

SANet: A Slice-Aware Network for Pulmonary Nodule Detection This paper (SANet) has been accepted and early accessed in IEEE TPAMI 2021. This code and

Jie Mei 39 Dec 17, 2022
In this project I played with mlflow, streamlit and fastapi to create a training and prediction app on digits

Fastapi + MLflow + streamlit Setup env. I hope I covered all. pip install -r requirements.txt Start app Go in the root dir and run these Streamlit str

76 Nov 23, 2022
code for TCL: Vision-Language Pre-Training with Triple Contrastive Learning, CVPR 2022

Vision-Language Pre-Training with Triple Contrastive Learning, CVPR 2022 News (03/16/2022) upload retrieval checkpoints finetuned on COCO and Flickr T

187 Jan 02, 2023
Progressive Image Deraining Networks: A Better and Simpler Baseline

Progressive Image Deraining Networks: A Better and Simpler Baseline [arxiv] [pdf] [supp] Introduction This paper provides a better and simpler baselin

190 Dec 01, 2022
Project page for End-to-end Recovery of Human Shape and Pose

End-to-end Recovery of Human Shape and Pose Angjoo Kanazawa, Michael J. Black, David W. Jacobs, Jitendra Malik CVPR 2018 Project Page Requirements Pyt

1.4k Dec 29, 2022
Code examples and benchmarks from the paper "Understanding Entropy Coding With Asymmetric Numeral Systems (ANS): a Statistician's Perspective"

Code For the Paper "Understanding Entropy Coding With Asymmetric Numeral Systems (ANS): a Statistician's Perspective" Author: Robert Bamler Date: 22 D

4 Nov 02, 2022
2020 CCF大数据与计算智能大赛-非结构化商业文本信息中隐私信息识别-第7名方案

2020CCF-NER 2020 CCF大数据与计算智能大赛-非结构化商业文本信息中隐私信息识别-第7名方案 bert base + flat + crf + fgm + swa + pu learning策略 + clue数据集 = test1单模0.906 词向量

67 Oct 19, 2022