Official implementation for "Symbolic Learning to Optimize: Towards Interpretability and Scalability"

Overview

Symbolic Learning to Optimize

This is the official implementation for ICLR-2022 paper "Symbolic Learning to Optimize: Towards Interpretability and Scalability"

Introduction

Recent studies on Learning to Optimize (L2O) suggest a promising path to automating and accelerating the optimization procedure for complicated tasks. Existing L2O models parameterize optimization rules by neural networks, and learn those numerical rules via meta-training. However, they face two common pitfalls: (1) scalability: the numerical rules represented by neural networks create extra memory overhead for applying L2O models, and limits their applicability to optimizing larger tasks; (2) interpretability: it is unclear what each L2O model has learned in its black-box optimization rule, nor is it straightforward to compare different L2O models in an explainable way. To avoid both pitfalls, this paper proves the concept that we can ``kill two birds by one stone'', by introducing the powerful tool of symbolic regression to L2O. In this paper, we establish a holistic symbolic representation and analysis framework for L2O, which yields a series of insights for learnable optimizers. Leveraging our findings, we further propose a lightweight L2O model that can be meta-trained on large-scale problems and outperformed human-designed and tuned optimizers. Our work is set to supply a brand-new perspective to L2O research.

Our approach:

First train a neural network (LSTM) based optimizer, then leverage the symbolic regression tool to trouble shoot and analyze the neural network based optimizer. The yielded symbolic rule serve as a light weight light-weight surrogate of the original optimizer.

Our main findings:

Example of distilled equations from DM model:

Example of distilled equations from RP model (they are simpler than the DM surrogates, and yet more effective for the optimization task):

Distilled symbolic rules fit the optimizer quite well:

The distilled symbolic rule and underlying rules

Distilled symbolic rules perform same optimization task well, compared with the original numerical optimizer:

The light weight symbolic rules are able to be meta-tuned on large scale (ResNet-50) optimizee and get good performance:

ss large scale optimizee

The symbolic regression passed the sanity checks in the optimization tasks:

Installation Guide

The installation require no special packages. The tensorflow version we adoped is 1.14.0, and the PyTorch version we adopted is 1.7.1.

Training Guide

The three files:

torch-implementation/l2o_train_from_scratch.py

torch-implementation/l2o_symbolic_regression_stage_2_3.py

torch-implementation/l2o_evaluation.py

are pipline scripts, which integrate the multi-stage experiments. The detailed usages are specified within these files. We offer several examples below.

  • In order to train a rnn-prop model from scratch on mnist classification problem setting with 200 epochs, each epoch with length 200, unroll length 20, batch size 128, learning rate 0.001 on GPU-0, run:

    python l2o_train_from_scratch.py -m tras -p mni -n 200 -l 200 -r 20 -b 128 -lr 0.001 -d 0

  • In order to fine-tune an L2O model on the CNN optimizee with 200 epochs, each epoch length 1000, unroll length 20, batch size 64, learning rate 0.001 on GPU-0, first put the .pth model checkpoint file (the training script above will automatically save it in a new folder under current directory) under the first (0-th, in the python index) location in __WELL_TRAINED__ specified in torch-implementation/utils.py , then run the following script:

    python l2o_train_from_scratch.py -m tune -pr 0 -p cnn -n 200 -l 1000 -r 20 -b 64 -lr 0.001 -d 0

  • In order to generate data for symbolic regression, if desire to obtain 50000 samples evaluated on MNIST classification problem, with optimization trajectory length of 300 steps, using GPU-3, then run:

    python l2o_evaluation.py -m srgen -p mni -l 300 -s 50000 -d 3

  • In order to distill equation from the previously saved offline SR dataset, check and run: torch-implementation/sr_train.py

  • In order to fine-tune SR equation, check and run: torch-implementation/stage023_mid2021_update.py

  • In order to convert distilled symbolic equation into latex readable form, check and run: torch-implementation/sr_test_get_latex.py.py

  • In order to calculate how good the symbolic is fitting the original model, we use the R2-scores; to compute it, check and run: torch-implementation/sr_test_cal_r2.py

  • In order to train and run the resnet-class optimizees, check and run: torch-implementation/run_resnet.py

There are also optional tensorflow implementations of L2O, including meta-training the two benchmarks used in this paper: DM and Rnn-prop L2O. However, all steps before generating offline datasets in the pipline is only supportable with torch implementations. To do symbolic regression with tensorflow implementation, you need to manually generate records (an .npy file) of shape [N_sample, num_feature+1], which concatenate the num_feature dimensional x (symbolic regresison input) and 1 dimensional y (output), containing N_sample samples. Once behavior dataset is ready, the following steps can be shared with torch implementation.

  • In order to train the tensorflow implementation of L2O, check and run: tensorflow-implementation/train_rnnprop.py, tensorflow-implementation/train_dm.py

  • In order to evaluate the tensorflow implementation of L2O and generate offline dataset for symbolic regression, check and run: tensorflow-implementation/evaluate_rnnprop.py, tensorflow-implementation/evaluate_dm.py.

Other hints

Meta train the DM/RP/RP_si models

run the train_optimizer() functionin torch-implementation/meta.py

Evaluate the optimization performance:

run theeva_l2o_optimizer() function in torch-implementation/meta.py

RP model implementations:

TheRPOptimizer in torch-implementation/meta.py

RP_si model implementations:

same as RP, set magic=0; or more diverse input can be enabled by setting grad_features="mt+gt+mom5+mom99"

DM model implementations:

DMOptimizer in torch-implementation/utils.py

SR implementations:

torch-implementation/sr_train.py

torch-implementation/sr_test_cal_r2.py

torch-implementation/sr_test_get_latex.py

other SR options and the workflow:

srUtils.py

Citation

comming soon.

Owner
VITA
Visual Informatics Group @ University of Texas at Austin
VITA
DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting

DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting Created by Yongming Rao*, Wenliang Zhao*, Guangyi Chen, Yansong Tang, Zheng Z

Yongming Rao 321 Dec 27, 2022
Convnet transfer - Code for paper How transferable are features in deep neural networks?

How transferable are features in deep neural networks? This repository contains source code necessary to reproduce the results presented in the follow

Jason Yosinski 143 Sep 13, 2022
Codes for TIM2021 paper "Anchor-Based Spatio-Temporal Attention 3-D Convolutional Networks for Dynamic 3-D Point Cloud Sequences"

Codes for TIM2021 paper "Anchor-Based Spatio-Temporal Attention 3-D Convolutional Networks for Dynamic 3-D Point Cloud Sequences"

Intelligent Robotics and Machine Vision Lab 4 Jul 19, 2022
[EMNLP 2020] Keep CALM and Explore: Language Models for Action Generation in Text-based Games

Contextual Action Language Model (CALM) and the ClubFloyd Dataset Code and data for paper Keep CALM and Explore: Language Models for Action Generation

Princeton Natural Language Processing 43 Dec 16, 2022
Solver for Large-Scale Rank-One Semidefinite Relaxations

STRIDE: spectrahedral proximal gradient descent along vertices A Solver for Large-Scale Rank-One Semidefinite Relaxations About STRIDE is designed for

48 Dec 20, 2022
The ARCA23K baseline system

ARCA23K Baseline System This is the source code for the baseline system associated with the ARCA23K dataset. Details about ARCA23K and the baseline sy

4 Jul 02, 2022
Python Fanduel API (2021) - Lineup Automation

Southpaw is a python package that provides access to the Fanduel API. Optimize your DFS experience by programmatically updating your lineups, analyzin

Brandin Canfield 13 Jan 04, 2023
Official repository for the paper "Going Beyond Linear Transformers with Recurrent Fast Weight Programmers"

Recurrent Fast Weight Programmers This is the official repository containing the code we used to produce the experimental results reported in the pape

IDSIA 36 Nov 15, 2022
Simple SN-GAN to generate CryptoPunks

CryptoPunks GAN Simple SN-GAN to generate CryptoPunks. Neural network architecture and training code has been modified from the PyTorch DCGAN example.

Teddy Koker 66 Dec 15, 2022
Spherical CNNs

Spherical CNNs Equivariant CNNs for the sphere and SO(3) implemented in PyTorch Overview This library contains a PyTorch implementation of the rotatio

Jonas Köhler 893 Dec 28, 2022
Minimal diffusion models - Minimal code and simple experiments to play with Denoising Diffusion Probabilistic Models (DDPMs)

Minimal code and simple experiments to play with Denoising Diffusion Probabilist

Rithesh Kumar 16 Oct 06, 2022
Tensorflow-seq2seq-tutorials - Dynamic seq2seq in TensorFlow, step by step

seq2seq with TensorFlow Collection of unfinished tutorials. May be good for educational purposes. 1 - simple sequence-to-sequence model with dynamic u

Matvey Ezhov 1k Dec 17, 2022
Weakly Supervised Dense Event Captioning in Videos, i.e. generating multiple sentence descriptions for a video in a weakly-supervised manner.

WSDEC This is the official repo for our NeurIPS paper Weakly Supervised Dense Event Captioning in Videos. Description Repo directories ./: global conf

Melon(Xuguang Duan) 96 Nov 01, 2022
Benchmarks for semi-supervised domain generalization.

Semi-Supervised Domain Generalization This code is the official implementation of the following paper: Semi-Supervised Domain Generalization with Stoc

Kaiyang 49 Dec 10, 2022
Impelmentation for paper Feature Generation and Hypothesis Verification for Reliable Face Anti-Spoofing

FGHV Impelmentation for paper Feature Generation and Hypothesis Verification for Reliable Face Anti-Spoofing Requirements Python 3.6 Pytorch 1.5.0 Cud

5 Jun 02, 2022
Galactic and gravitational dynamics in Python

Gala is a Python package for Galactic and gravitational dynamics. Documentation The documentation for Gala is hosted on Read the docs. Installation an

Adrian Price-Whelan 101 Dec 22, 2022
tinykernel - A minimal Python kernel so you can run Python in your Python

tinykernel - A minimal Python kernel so you can run Python in your Python

fast.ai 37 Dec 02, 2022
DL course co-developed by YSDA, HSE and Skoltech

Deep learning course This repo supplements Deep Learning course taught at YSDA and HSE @fall'21. For previous iteration visit the spring21 branch. Lec

Yandex School of Data Analysis 1.3k Dec 30, 2022
Perform Linear Classification with Multi-way Data

MultiwayClassification This is an R package to perform linear classification for data with multi-way structure. The distance-weighted discrimination (

Eric F. Lock 2 Dec 15, 2020
PaddleViT: State-of-the-art Visual Transformer and MLP Models for PaddlePaddle 2.0+

PaddlePaddle Vision Transformers State-of-the-art Visual Transformer and MLP Models for PaddlePaddle 🤖 PaddlePaddle Visual Transformers (PaddleViT or

1k Dec 28, 2022