Source code, datasets and trained models for the paper Learning Advanced Mathematical Computations from Examples (ICLR 2021), by François Charton, Amaury Hayat (ENPC-Rutgers) and Guillaume Lample

Overview

Maths from examples - Learning advanced mathematical computations from examples

This is the source code and data sets relevant to the paper Learning advanced mathematical computations from examples, by Amaury Hayat, François Charton and Guillaume Lample, published at ICLR 2021.

We provide code for

  • data generation
  • model training
  • model evaluation

We also provide

  • 7 datasets
  • 7 pretrained models

Dependencies

  • Python (3.8+)
  • Numpy (1.16.4+)
  • Sympy (1.4+)
  • Pytorch (1.7.1+)
  • Control library (0.8.4, from conda-forge)
  • CUDA (i.e. an NVIDIA GPU) if you intend to use a GPU
  • Apex for half-precision training
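
A possible environment setup, assuming conda and pip are available (versions as listed above; the control library is distributed on conda-forge):

pip install numpy sympy torch
conda install -c conda-forge control=0.8.4
# Apex (optional, for half-precision training) is built from source: https://github.com/NVIDIA/apex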

Important notes

Learning with and without GPU

All the code can run on CPU only (set parameter --cpu to true). Data generation is always done on CPU. Model training and evaluation can be run on CPU, but training will be extremely slow. To train or evaluate on a GPU, you need a CUDA-enabled (i.e. NVIDIA) GPU.

We support:

  • Half-precision training (with the NVIDIA Apex library): set --fp16 true --amp 2; to disable, set --fp16 false --amp -1
  • Multi-GPU training: to run an experiment on several GPUs of a single machine, use
export NGPU=8; python -m torch.distributed.launch --nproc_per_node=$NGPU train.py  # parameters for your experiment
  • Multi-node training: using GPUs on different machines is handled by SLURM (see the code)

On GPUs with limited video memory, you will need to reduce memory usage by adjusting --batch_size. Try to set it to the largest value that fits in your CUDA memory. Since the model is optimized at the end of each minibatch, smaller batch sizes will greatly slow learning. You can compensate for this by increasing --accumulate_gradients, which controls the number of minibatches the model sees before an optimization step.
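
For instance, the two commands below both process 256 examples per optimization step; the second uses much less memory (a sketch, batch sizes are illustrative):

python train.py --batch_size 256 --accumulate_gradients 1   # 256 examples per update
python train.py --batch_size 32 --accumulate_gradients 8    # same effective batch, smaller memory footprint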

Dump paths and experiment names

All paths should be absolute: --dump_path ./mydump might not work, while --dump_path c:/Users/me/mydump should be fine. The directories where your datasets, models, and logfiles are generated are built from the parameters --dump_path, --exp_name and --exp_id, as {dump_path}/{exp_name}/{exp_id}/. If you do not specify an exp_id, a random unique name is created for you. If you reuse the same dump_path/exp_name/exp_id, generation or training will resume there (adding new examples, or reloading the previous model for training).

All results are logged to the file train.log in the experiment directory.
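
For instance, with illustrative values:

python train.py --dump_path '/checkpoint/me/dumped' --exp_name ddss_ctrl --exp_id exp_1 # other parameters...
# -> models and logs are written to /checkpoint/me/dumped/ddss_ctrl/exp_1/ (log file: train.log)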

All models and datasets can be downloaded from https://dl.fbaipublicfiles.com/MathsFromExamples/. By convention, in all code examples, datasets and models use the path /checkpoint/fcharton/dumped/. You will need to adjust this to the correct path on your local machine.
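
For example, to fetch and unpack one of the dataset archives listed below (assuming wget is available):

wget https://dl.fbaipublicfiles.com/MathsFromExamples/data/ddss_stability_balanced.tar.gz
tar -xzf ddss_stability_balanced.tar.gz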

Data sets

We provide 7 datasets, all available at https://dl.fbaipublicfiles.com/MathsFromExamples/data/ as tar.gz archives.

Stability: balanced sample of systems of degree 2 to 5 (50% stable), predicting speed of convergence at 0.01 (largest real part of eigenvalue)

in archive https://dl.fbaipublicfiles.com/MathsFromExamples/data/ddss_stability_balanced.tar.gz

  • ddss_stability_balanced.prefix_counts.train : 25,544,975 systems
  • ddss_stability_balanced.prefix_counts.valid.final : 10,000 systems
  • ddss_stability_balanced.prefix_counts.test.final : 10,000 systems

Stability: random sample of systems of degree 2 to 6, predicting speed of convergence at 0.01

in archive https://dl.fbaipublicfiles.com/MathsFromExamples/data/ddss_stability.tar.gz

  • ddss_stability.prefix_counts.train : 92,994,423 systems
  • ddss_stability.prefix_counts.valid.final : 10,000 systems
  • ddss_stability.prefix_counts.test.final : 10,000 systems

Controllability: balanced sample of systems of degree 3 to 6 (50% controllable), predicting controllability (a binary value)

in archive https://dl.fbaipublicfiles.com/MathsFromExamples/data/ddss_control.tar.gz

  • ddss_control.prefix_counts.train : 26,577,934 systems
  • ddss_control.prefix_counts.valid.final : 10,000 systems
  • ddss_control.prefix_counts.test.final : 10,000 systems

Controllability: sample of controllable systems of degree 3 to 6, predicting a control matrix

in archive https://dl.fbaipublicfiles.com/MathsFromExamples/data/ddss_gram.tar.gz

  • ddss_gram.prefix_counts.train : 53,680,092 systems
  • ddss_gram.prefix_counts.valid.final : 10,000 systems
  • ddss_gram.prefix_counts.test.final : 10,000 systems

Non-autonomous controllability: random sample (82.4% controllable) of systems of degree 2 and 3, predicting controllability

in archive https://dl.fbaipublicfiles.com/MathsFromExamples/data/ddss_control_t.tar.gz

  • ddss_control_t.prefix_counts.train : 65,754,655 systems
  • ddss_control_t.prefix_counts.valid.final : 10,000 systems
  • ddss_control_t.prefix_counts.test.final : 10,000 systems

Non-autonomous controllability: balanced sample (50/50) of systems of degree 2 and 3, predicting controllability

in archive https://dl.fbaipublicfiles.com/MathsFromExamples/data/ddss_control_t_bal.tar.gz

  • ddss_control_t_bal.prefix_counts.train : 23,125,016 systems
  • ddss_control_t_bal.prefix_counts.valid.final : 10,000 systems
  • ddss_control_t_bal.prefix_counts.test.final : 10,000 systems

Partial differential equations with initial conditions, predicting existence of a solution and behavior at infinity

in archive https://dl.fbaipublicfiles.com/MathsFromExamples/data/ddss_fourier.tar.gz

  • ddss_fourier.prefix_counts.train : 52,285,760 systems
  • ddss_fourier.prefix_counts.valid.final : 10,000 systems
  • ddss_fourier.prefix_counts.test.final : 10,000 systems

Training a model from a dataset

python train.py 

# experiment parameters 
# the full path of this experiment will be /checkpoint/fcharton/dumped/ddss_ctrl/exp_1
--dump_path '/checkpoint/fcharton/dumped'   # path for log files and saved models; avoid ./ and other relative paths
--exp_name ddss_ctrl                        # name
--exp_id exp_1                              # id : randomly generated if absent

# dataset
--export_data false
--tasks ode_control         # set to `ode_convergence_speed`, `ode_control` or `fourier_cond_init`
# '{tasks},{train_file_path},{valid_file_path},{test_file_path}'
--reload_data 'ode_control,/checkpoint/fcharton/dumped/ddss_gen_ctrl/ddss_control.prefix_counts.train,/checkpoint/fcharton/dumped/ddss_gen_ctrl/ddss_control.prefix_counts.valid.final,/checkpoint/fcharton/dumped/ddss_gen_ctrl/ddss_control.prefix_counts.test.final' 
--reload_size 40000000      # number of records to load
--max_len 512               # max length of input or output

# model parameters
--emb_dim 512 
--n_enc_layers 6 
--n_dec_layers 6 
--n_heads 8 
--optimizer 'adam_inverse_sqrt,warmup_updates=10000,lr=0.0001,weight_decay=0.01'

# training parameters
--batch_size 256        # minibatch size, reduce to fit available GPU memory
--epoch_size 300000     # examples per epoch; the model is evaluated on the validation set at the end of each epoch
--beam_eval 0           # use beam search for evaluation (set to 1 for quantitative tasks)
--eval_size 10000       # size of the validation set
--batch_size_eval 256   # batch size for validation, reduce to adjust memory

# validation metrics
# valid_{task}_acc or valid_{task}_beam_acc depending on whether beam search is used  
--validation_metrics valid_ode_control_acc 
# stop when no improvement for 20 epochs
--stopping_criterion 'valid_ode_control_acc,20' 
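
For reference, the annotated walkthrough above assembles into a single runnable command:

python train.py --dump_path '/checkpoint/fcharton/dumped' --exp_name ddss_ctrl --exp_id exp_1 --export_data false --tasks ode_control --reload_data 'ode_control,/checkpoint/fcharton/dumped/ddss_gen_ctrl/ddss_control.prefix_counts.train,/checkpoint/fcharton/dumped/ddss_gen_ctrl/ddss_control.prefix_counts.valid.final,/checkpoint/fcharton/dumped/ddss_gen_ctrl/ddss_control.prefix_counts.test.final' --reload_size 40000000 --max_len 512 --emb_dim 512 --n_enc_layers 6 --n_dec_layers 6 --n_heads 8 --optimizer 'adam_inverse_sqrt,warmup_updates=10000,lr=0.0001,weight_decay=0.01' --batch_size 256 --epoch_size 300000 --beam_eval 0 --eval_size 10000 --batch_size_eval 256 --validation_metrics valid_ode_control_acc --stopping_criterion 'valid_ode_control_acc,20'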

Generating your own data sets

To generate a dataset, use the parameters

python train.py --cpu true --export_data true --reload_data '' --env_base_seed -1 --num_workers 20 --tasks ... # plus task-specific parameters

Generated data (exported as sequences of tokens) is written to the file data.prefix in the dump path of the experiment. To be used for training, these files must be post-processed as shown in the examples below: duplicate examples are counted and merged (producing lines of the form count|sequence), the data is split into train, validation and test sets, and any validation or test example that also appears in the training set is removed.

IMPORTANT NOTE: Data generation is very slow, and sometimes results in errors that cause the program to abort; when this happens, relaunch it. Typical generation speed is one or a few systems per second. While this code is fine for experimenting with data generation, creating datasets large enough to train our models (10 million examples or more) requires a lot of computing power (typically 200-300 experiments, with 20 CPUs each, running for several days).

Important parameters for data generation are:

  • --tasks : ode_convergence_speed, ode_control or fourier_cond_init
  • --cpu : always set to true
  • --num_workers : set to the number of cores you can use
  • --env_base_seed : set to -1
  • --min_degree and --max_degree : bounds on the size of the generated systems

For more details, see the file 'envs/ode.py' in the source code.

Predicting stability - balanced sample (50% stable), systems of degree 2 to 5

# Generation command
python train.py --dump_path '/checkpoint/fcharton/dumped' --save_periodic 0 --fp16 false --amp -1 --emb_dim 128 --n_enc_layers 2 --n_dec_layers 2 --n_heads 8 --dropout 0 --attention_dropout 0 --share_inout_emb true --sinusoidal_embeddings false --max_len 512 --batch_size 32 --optimizer 'adam,lr=0.0001' --clip_grad_norm 5 --epoch_size 300000 --max_epoch 100000 --accumulate_gradients 1 --env_name ode --max_int 10 --precision 2 --skip_zero_gradient true --positive false --nonnull true --prob_int 0.3 --min_degree 2 --max_degree 5 --eval_value 0.01 --prob_positive 0.5 --num_workers 20 --cpu true --stopping_criterion '' --validation_metrics '' --export_data true --reload_data '' --tasks ode_convergence_speed --env_base_seed -1 --exp_name ddss_gen_stab_bal

# Post-processing
# assemble the raw data file from the generated prefixes: merge duplicate lines into 'count|sequence' records
cat */data.prefix \
| awk 'BEGIN{PROCINFO["sorted_in"]="@val_num_desc"}{c[$0]++}END{for (i in c) printf("%i|%s\n",c[i],i)}' \
> ddss_stability_balanced.prefix_counts

# create train, valid and test samples
python ~/MathsFromExamples/split_data.py ddss_stability_balanced.prefix_counts 10000

# check valid and test for duplicates and remove them
awk -F"[|\t]" 'NR==FNR { lines[$2]=1; next } !($2 in lines)' ddss_stability_balanced.prefix_counts.train ddss_stability_balanced.prefix_counts.valid > ddss_stability_balanced.prefix_counts.valid.final
awk -F"[|\t]" 'NR==FNR { lines[$2]=1; next } !($2 in lines)' ddss_stability_balanced.prefix_counts.train ddss_stability_balanced.prefix_counts.test > ddss_stability_balanced.prefix_counts.test.final
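
To sanity-check the resulting files (each line is a 'count|sequence' record, as produced by the awk command above):

# inspect the first entries of a prefix_counts file
head -n 3 ddss_stability_balanced.prefix_counts.train
# count how many distinct examples were generated more than once
awk -F"|" '$1 > 1' ddss_stability_balanced.prefix_counts.train | wc -l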

Predicting stability - random sample, systems of degree 2 to 6

# Generation command
python train.py --dump_path '/checkpoint/fcharton/dumped' --save_periodic 0 --fp16 false --amp -1 --emb_dim 128 --n_enc_layers 2 --n_dec_layers 2 --n_heads 8 --dropout 0 --attention_dropout 0 --share_inout_emb true --sinusoidal_embeddings false --max_len 512 --batch_size 32 --optimizer 'adam,lr=0.0001' --clip_grad_norm 5 --epoch_size 300000 --max_epoch 100000 --accumulate_gradients 1 --env_name ode --max_int 10 --precision 2 --skip_zero_gradient true --positive false --nonnull true --prob_int 0.3 --min_degree 2 --max_degree 6 --eval_value 0.01 --num_workers 20 --cpu true --stopping_criterion '' --validation_metrics '' --export_data true --reload_data '' --tasks ode_convergence_speed --env_base_seed -1 --exp_name ddss_gen_stab

# assemble raw data file from prefixes
cat */data.prefix \
| awk 'BEGIN{PROCINFO["sorted_in"]="@val_num_desc"}{c[$0]++}END{for (i in c) printf("%i|%s\n",c[i],i)}' \
> ddss_stability.prefix_counts
 
# create train, valid and test samples 
python ~/MathsFromExamples/split_data.py ddss_stability.prefix_counts 10000

# check valid and test for duplicates and remove them
awk -F"[|\t]" 'NR==FNR { lines[$2]=1; next } !($2 in lines)' ddss_stability.prefix_counts.train ddss_stability.prefix_counts.valid > ddss_stability.prefix_counts.valid.final
awk -F"[|\t]" 'NR==FNR { lines[$2]=1; next } !($2 in lines)' ddss_stability.prefix_counts.train ddss_stability.prefix_counts.test > ddss_stability.prefix_counts.test.final

Predicting controllability - balanced sample, systems of degree 3 to 6

# generation command 
python train.py --dump_path '/checkpoint/fcharton/dumped' --save_periodic 0 --fp16 false --amp -1 --emb_dim 128 --n_enc_layers 2 --n_dec_layers 2 --n_heads 8 --dropout 0 --attention_dropout 0 --share_inout_emb true --sinusoidal_embeddings false --max_len 512 --batch_size 32 --optimizer 'adam,lr=0.0001' --clip_grad_norm 5 --epoch_size 300000 --max_epoch 100000 --accumulate_gradients 1 --env_name ode --max_int 10 --precision 3 --skip_zero_gradient true --positive false --nonnull true --prob_int 0.3 --min_degree 3 --max_degree 6 --eval_value 0.9 --allow_complex false --jacobian_precision 3 --qualitative true --num_workers 20 --cpu true --stopping_criterion '' --validation_metrics '' --export_data true --reload_data '' --tasks ode_control --env_base_seed -1 --exp_name ddss_gen_ctrl

# assemble non controllable cases from prefixes
cat */data.prefix \
| grep '0$' \
| awk 'BEGIN{PROCINFO["sorted_in"]="@val_num_desc"}{c[$0]++}END{for (i in c) printf("%i|%s\n",c[i],i)}' \
> ddss_control.prefix_counts.0

# count them
wc -l ddss_control.prefix_counts.0   # 13,298,967

# assemble controllable cases from prefixes
cat */data.prefix \
| grep '1$' \
| awk 'BEGIN{PROCINFO["sorted_in"]="@val_num_desc"}{c[$0]++}END{for (i in c) printf("%i|%s\n",c[i],i)}' \
| head -n 13298967 > ddss_control.prefix_counts.1

# assemble prefix_counts
cat ddss_control.prefix_counts.0 ddss_control.prefix_counts.1 | shuf > ddss_control.prefix_counts

# create train, valid and test samples
python ~/MathsFromExamples/split_data.py ddss_control.prefix_counts 10000

# check valid and test for duplicates and remove them
awk -F"[|\t]" 'NR==FNR { lines[$2]=1; next } !($2 in lines)' ddss_control.prefix_counts.train ddss_control.prefix_counts.valid > ddss_control.prefix_counts.valid.final
awk -F"[|\t]" 'NR==FNR { lines[$2]=1; next } !($2 in lines)' ddss_control.prefix_counts.train ddss_control.prefix_counts.test > ddss_control.prefix_counts.test.final

Predicting non-autonomous controllability: unbalanced sample, systems of 2 to 3 equations

# generation command 
python train.py --dump_path '/checkpoint/fcharton/dumped' --save_periodic 0 --fp16 false --amp -1 --emb_dim 128 --n_enc_layers 2 --n_dec_layers 2 --n_heads 8 --dropout 0 --attention_dropout 0 --share_inout_emb true --sinusoidal_embeddings false --max_len 512 --batch_size 32 --optimizer 'adam,lr=0.0001' --clip_grad_norm 5 --epoch_size 300000 --max_epoch 100000 --accumulate_gradients 1 --env_name ode --max_int 10 --precision 3 --skip_zero_gradient true --positive false --nonnull true --prob_int 0.3 --min_degree 2 --max_degree 3 --eval_value 0.5 --allow_complex false --jacobian_precision 3 --qualitative false --tau 1 --num_workers 20 --cpu true --stopping_criterion '' --validation_metrics '' --export_data true --reload_data '' --tasks ode_control --env_base_seed -1 --exp_name ddss_gen_ctrl_t

# assemble raw data file from prefixes
cat */data.prefix \
| awk 'BEGIN{PROCINFO["sorted_in"]="@val_num_desc"}{c[$0]++}END{for (i in c) printf("%i|%s\n",c[i],i)}' \
> ddss_control_t.prefix_counts

# create train, valid and test samples
python ~/MathsFromExamples/split_data.py ddss_control_t.prefix_counts 10000

# check valid and test for duplicates and remove them
awk -F"[|\t]" 'NR==FNR { lines[$2]=1; next } !($2 in lines)' ddss_control_t.prefix_counts.train ddss_control_t.prefix_counts.valid > ddss_control_t.prefix_counts.valid.final
awk -F"[|\t]" 'NR==FNR { lines[$2]=1; next } !($2 in lines)' ddss_control_t.prefix_counts.train ddss_control_t.prefix_counts.test > ddss_control_t.prefix_counts.test.final

Predicting non-autonomous controllability: balanced sample, systems of 2 to 3 equations

# generation command 
python train.py --dump_path '/checkpoint/fcharton/dumped' --save_periodic 0 --fp16 false --amp -1 --emb_dim 128 --n_enc_layers 2 --n_dec_layers 2 --n_heads 8 --dropout 0 --attention_dropout 0 --share_inout_emb true --sinusoidal_embeddings false --max_len 512 --batch_size 32 --optimizer 'adam,lr=0.0001' --clip_grad_norm 5 --epoch_size 300000 --max_epoch 100000 --accumulate_gradients 1 --env_name ode --max_int 10 --precision 3 --skip_zero_gradient true --positive false --nonnull true --prob_int 0.3 --min_degree 2 --max_degree 3 --eval_value 0.5 --allow_complex false --jacobian_precision 3 --qualitative false --tau 1 --num_workers 20 --cpu true --stopping_criterion '' --validation_metrics '' --export_data true --reload_data '' --tasks ode_control --env_base_seed -1 --exp_name ddss_gen_ctrl_t

# assemble non controllable cases from prefixes
cat */data.prefix \
| grep '0$' \
| awk 'BEGIN{PROCINFO["sorted_in"]="@val_num_desc"}{c[$0]++}END{for (i in c) printf("%i|%s\n",c[i],i)}' \
> ddss_control_t.prefix_counts.0

# count them
wc -l ddss_control_t.prefix_counts.0   # 11,572,508

# assemble controllable cases from prefixes
cat */data.prefix \
| grep '1$' \
| awk 'BEGIN{PROCINFO["sorted_in"]="@val_num_desc"}{c[$0]++}END{for (i in c) printf("%i|%s\n",c[i],i)}' \
| head -n 11572508 > ddss_control_t.prefix_counts.1

# assemble prefix_counts
cat ddss_control_t.prefix_counts.0 ddss_control_t.prefix_counts.1 | shuf > ddss_control_t_bal.prefix_counts

# create train, valid and test samples
python ~/MathsFromExamples/split_data.py ddss_control_t_bal.prefix_counts 10000

# check valid and test for duplicates and remove them
awk -F"[|\t]" 'NR==FNR { lines[$2]=1; next } !($2 in lines)' ddss_control_t_bal.prefix_counts.train ddss_control_t_bal.prefix_counts.valid > ddss_control_t_bal.prefix_counts.valid.final
awk -F"[|\t]" 'NR==FNR { lines[$2]=1; next } !($2 in lines)' ddss_control_t_bal.prefix_counts.train ddss_control_t_bal.prefix_counts.test > ddss_control_t_bal.prefix_counts.test.final

Predicting control matrices - sample of controllable systems of degree 3 to 6

# generation command
python train.py --dump_path '/checkpoint/fcharton/dumped' --save_periodic 0 --fp16 false --amp -1 --emb_dim 128 --n_enc_layers 2 --n_dec_layers 2 --n_heads 8 --dropout 0 --attention_dropout 0 --share_inout_emb true --sinusoidal_embeddings false --max_len 512 --batch_size 32 --optimizer 'adam,lr=0.0001' --clip_grad_norm 5 --epoch_size 300000 --max_epoch 100000 --accumulate_gradients 1 --env_name ode --max_int 10 --precision 3 --skip_zero_gradient true --positive false --nonnull true --prob_int 0.3 --min_degree 3 --max_degree 6 --eval_value 0.5 --allow_complex false --jacobian_precision 2 --qualitative false --predict_gramian true --prob_positive 1.0 --num_workers 20 --cpu true --stopping_criterion '' --validation_metrics '' --export_data true --reload_data '' --tasks ode_control --env_base_seed -1 --exp_name ddss_gen_gram

# assemble raw data file from prefixes
cat */data.prefix \
| awk 'BEGIN{PROCINFO["sorted_in"]="@val_num_desc"}{c[$0]++}END{for (i in c) printf("%i|%s\n",c[i],i)}' \
> ddss_gram.prefix_counts
 
# create train, valid and test samples 
python ~/MathsFromExamples/split_data.py ddss_gram.prefix_counts 10000

# check valid and test for duplicates and remove them
awk -F"[|\t]" 'NR==FNR { lines[$2]=1; next } !($2 in lines)' ddss_gram.prefix_counts.train ddss_gram.prefix_counts.valid > ddss_gram.prefix_counts.valid.final
awk -F"[|\t]" 'NR==FNR { lines[$2]=1; next } !($2 in lines)' ddss_gram.prefix_counts.train ddss_gram.prefix_counts.test > ddss_gram.prefix_counts.test.final

Predicting the existence of solutions of partial differential equations

# generation command
python train.py --dump_path '/checkpoint/fcharton/dumped' --save_periodic 0 --fp16 false --amp -1 --emb_dim 128 --n_enc_layers 2 --n_dec_layers 2 --n_heads 8 --dropout 0 --attention_dropout 0 --share_inout_emb true --sinusoidal_embeddings false --max_len 512 --batch_size 32 --optimizer 'adam,lr=0.0001' --clip_grad_norm 5 --epoch_size 300000 --max_epoch 100000 --accumulate_gradients 1 --env_name ode --max_int 10 --precision 2 --jacobian_precision 2 --positive false --nonnull true --allow_complex false --predict_bounds true --skip_zero_gradient true --prob_int 0.3 --min_degree 2 --max_degree 6 --eval_value 0.01 --prob_positive -1.0 --num_workers 20 --cpu true --stopping_criterion '' --validation_metrics '' --export_data true --reload_data '' --tasks fourier_cond_init --env_base_seed -1 --exp_name ddss_gen_fourier

# assemble raw data file from prefixes
cat */data.prefix \
| awk 'BEGIN{PROCINFO["sorted_in"]="@val_num_desc"}{c[$0]++}END{for (i in c) printf("%i|%s\n",c[i],i)}' \
> ddss_fourier.prefix_counts
 
# create train, valid and test samples 
python ~/MathsFromExamples/split_data.py ddss_fourier.prefix_counts 10000

# check valid and test for duplicates and remove them
awk -F"[|\t]" 'NR==FNR { lines[$2]=1; next } !($2 in lines)' ddss_fourier.prefix_counts.train ddss_fourier.prefix_counts.valid > ddss_fourier.prefix_counts.valid.final
awk -F"[|\t]" 'NR==FNR { lines[$2]=1; next } !($2 in lines)' ddss_fourier.prefix_counts.train ddss_fourier.prefix_counts.test > ddss_fourier.prefix_counts.test.final

Pre-trained models

We provide 7 pretrained models for the various problems. For each model, we give below the dataset it was trained on and the parameters used; performance is measured on the validation set (valid.final in the same directory, 10,000 held-out examples).

Predicting stability (qualitative)

python train.py --dump_path '/checkpoint/fcharton/dumped' --save_periodic 0 --fp16 true --amp 2 --accumulate_gradients 1 --emb_dim 512 --batch_size 128 --batch_size_eval 256 --n_enc_layers 6 --n_dec_layers 6 --n_heads 8 --dropout 0 --attention_dropout 0 --share_inout_emb true --sinusoidal_embeddings false --max_len 512 --optimizer 'adam,lr=0.0001' --clip_grad_norm 5 --epoch_size 300000 --max_epoch 100000 --num_workers 1 --export_data false --env_name ode --max_int 10 --positive false --nonnull true --qualitative true --skip_zero_gradient true --prob_int 0.3 --max_degree 5 --min_degree 2 --eval_verbose 0 --beam_eval 0 --eval_size 10000 --tasks ode_convergence_speed --reload_data 'ode_convergence_speed,/checkpoint/fcharton/dumped/ddss_gen_stab_bal/ddss_stability_balanced.prefix_counts.train,/checkpoint/fcharton/dumped/ddss_gen_stab_bal/ddss_stability_balanced.prefix_counts.valid.final,/checkpoint/fcharton/dumped/ddss_gen_stab_bal/ddss_stability_balanced.prefix_counts.test.final' --reload_size 40000000 --stopping_criterion 'valid_ode_convergence_speed_acc,40' --validation_metrics valid_ode_convergence_speed_acc --env_base_seed -1 --exp_name ddss_stab_quali

Stability: computing convergence speed

python train.py --dump_path '/checkpoint/fcharton/dumped' --save_periodic 0 --fp16 true --amp 2 --accumulate_gradients 1 --emb_dim 1024 --batch_size 128 --batch_size_eval 256 --n_enc_layers 8 --n_dec_layers 8 --n_heads 8 --dropout 0 --attention_dropout 0 --share_inout_emb true --sinusoidal_embeddings false --max_len 512 --optimizer 'adam_inverse_sqrt,warmup_updates=10000,lr=0.0001,weight_decay=0.01' --clip_grad_norm 5 --epoch_size 300000 --max_epoch 100000 --num_workers 1 --export_data false --env_name ode --max_int 10 --positive false --nonnull true --skip_zero_gradient true --prob_int 0.3 --max_degree 6 --min_degree 2 --eval_verbose 0 --beam_eval 1 --eval_size 10000 --tasks ode_convergence_speed --reload_data 'ode_convergence_speed,/checkpoint/fcharton/dumped/ddss_gen_stab/ddss_stability.prefix_counts.train,/checkpoint/fcharton/dumped/ddss_gen_stab/ddss_stability.prefix_counts.valid.final,/checkpoint/fcharton/dumped/ddss_gen_stab/ddss_stability.prefix_counts.test.final' --reload_size 40000000 --stopping_criterion 'valid_ode_convergence_speed_beam_acc,40' --validation_metrics valid_ode_convergence_speed_beam_acc --env_base_seed -1 --exp_name ddss_stab_quanti

Predicting autonomous controllability

 python train.py --dump_path '/checkpoint/fcharton/dumped' --save_periodic 0 --fp16 true --amp 2 --accumulate_gradients 1 --emb_dim 512 --batch_size 256 --batch_size_eval 256 --n_enc_layers 6 --n_dec_layers 6 --n_heads 8 --dropout 0 --attention_dropout 0 --share_inout_emb true --sinusoidal_embeddings false --max_len 512 --optimizer 'adam_inverse_sqrt,warmup_updates=10000,lr=0.0001,weight_decay=0.01' --clip_grad_norm 5 --epoch_size 300000 --max_epoch 100000 --num_workers 1 --export_data false --env_name ode --max_int 10 --positive false --nonnull true --skip_zero_gradient true --prob_int 0.3 --max_degree 6 --min_degree 3 --eval_value 0.9 --qualitative true --eval_verbose 0 --beam_eval 0 --eval_size 10000 --tasks ode_control --reload_data 'ode_control,/checkpoint/fcharton/dumped/ddss_gen_ctrl/ddss_control.prefix_counts.train,/checkpoint/fcharton/dumped/ddss_gen_ctrl/ddss_control.prefix_counts.valid.final,/checkpoint/fcharton/dumped/ddss_gen_ctrl/ddss_control.prefix_counts.test.final' --reload_size 40000000 --stopping_criterion 'valid_ode_control_acc,20' --validation_metrics valid_ode_control_acc --env_base_seed -1 --exp_name ddss_ctrl

Predicting non-autonomous controllability

python train.py --dump_path '/checkpoint/fcharton/dumped' --save_periodic 0 --fp16 true --amp 2 --accumulate_gradients 1 --emb_dim 512 --batch_size 256 --batch_size_eval 256 --n_enc_layers 6 --n_dec_layers 6 --n_heads 8 --dropout 0 --attention_dropout 0 --share_inout_emb true --sinusoidal_embeddings false --max_len 512 --optimizer 'adam_inverse_sqrt,warmup_updates=10000,lr=0.0001,weight_decay=0.01' --clip_grad_norm 5 --epoch_size 300000 --max_epoch 100000 --num_workers 1 --export_data false --env_name ode --max_int 10 --positive false --nonnull true --skip_zero_gradient true --prob_int 0.3 --max_degree 3 --min_degree 2 --eval_value 0.5 --qualitative false --tau 1 --eval_verbose 0 --beam_eval 0 --eval_size 10000 --tasks ode_control --reload_data 'ode_control,/checkpoint/fcharton/dumped/ddss_gen_ctrl_t/ddss_control_t.prefix_counts.train,/checkpoint/fcharton/dumped/ddss_gen_ctrl_t/ddss_control_t.prefix_counts.valid.final,/checkpoint/fcharton/dumped/ddss_gen_ctrl_t/ddss_control_t.prefix_counts.test.final' --reload_size 40000000 --stopping_criterion 'valid_ode_control_acc,60' --validation_metrics valid_ode_control_acc --env_base_seed -1 --exp_name ddss_ctrl_t

Computing control matrices: predicting a solution to within 10%

python train.py --dump_path '/checkpoint/fcharton/dumped' --save_periodic 0 --fp16 true --amp 2 --accumulate_gradients 1 --emb_dim 512 --batch_size 128 --batch_size_eval 128 --n_enc_layers 6 --n_dec_layers 6 --n_heads 8 --dropout 0 --attention_dropout 0 --share_inout_emb true --sinusoidal_embeddings false --max_len 512 --optimizer 'adam,lr=0.0001' --clip_grad_norm 5 --epoch_size 300000 --max_epoch 100000 --num_workers 1 --export_data false --env_name ode --max_int 10 --positive false --nonnull true --skip_zero_gradient true --prob_int 0.3 --max_degree 6 --min_degree 3 --eval_value 0.5 --predict_gramian true --euclidian_metric true --auxiliary_task false --eval_verbose 0 --beam_eval 1 --eval_size 10000 --tasks ode_control --reload_data 'ode_control,/checkpoint/fcharton/dumped/ddss_gen_gram/ddss_gram.prefix_counts.train,/checkpoint/fcharton/dumped/ddss_gen_gram/ddss_gram.prefix_counts.valid.final,/checkpoint/fcharton/dumped/ddss_gen_gram/ddss_gram.prefix_counts.test.final' --reload_size 50000000 --stopping_criterion 'valid_ode_control_beam_acc,40' --validation_metrics valid_ode_control_beam_acc --env_base_seed -1 --exp_name ddss_gram

Computing control matrices: predicting a correct mathematical solution

python train.py --dump_path '/checkpoint/fcharton/dumped' --save_periodic 0 --fp16 true --amp 2 --accumulate_gradients 1 --emb_dim 512 --batch_size 128 --batch_size_eval 128 --n_enc_layers 6 --n_dec_layers 6 --n_heads 8 --dropout 0 --attention_dropout 0 --share_inout_emb true --sinusoidal_embeddings false --max_len 512 --optimizer 'adam,lr=0.0001' --clip_grad_norm 5 --epoch_size 300000 --max_epoch 100000 --num_workers 1 --export_data false --env_name ode --max_int 10 --positive false --nonnull true --skip_zero_gradient true --prob_int 0.3 --max_degree 6 --min_degree 3 --eval_value 0.5 --predict_gramian true --euclidian_metric false --auxiliary_task false --eval_verbose 0 --beam_eval 1 --eval_size 10000 --tasks ode_control --reload_data 'ode_control,/checkpoint/fcharton/dumped/ddss_gen_gram/ddss_gram.prefix_counts.train,/checkpoint/fcharton/dumped/ddss_gen_gram/ddss_gram.prefix_counts.valid.final,/checkpoint/fcharton/dumped/ddss_gen_gram/ddss_gram.prefix_counts.test.final' --reload_size 40000000 --stopping_criterion 'valid_ode_control_beam_acc,20' --validation_metrics valid_ode_control_beam_acc --env_base_seed -1 --exp_name ddss_gram

Predicting the existence of solutions of partial differential equations

python train.py --dump_path '/checkpoint/fcharton/dumped' --save_periodic 0 --fp16 false --amp -1 --accumulate_gradients 1 --emb_dim 512 --n_enc_layers 8 --n_dec_layers 8 --batch_size 64 --batch_size_eval 64 --eval_size 10000 --predict_jacobian false --n_heads 8 --dropout 0 --attention_dropout 0 --share_inout_emb true --sinusoidal_embeddings false --max_len 1024 --optimizer 'adam_inverse_sqrt,warmup_updates=10000,lr=0.0001,weight_decay=0.01' --clip_grad_norm 5 --epoch_size 300000 --max_epoch 100000 --num_workers 1 --export_data false --env_name ode --max_int 10 --precision 3 --jacobian_precision 1 --positive false --nonnull true --prob_int '0.3' --max_degree 6 --eval_value 0.5 --allow_complex false --predict_bounds true --skip_zero_gradient true --eval_verbose 0 --beam_eval 0 --tasks fourier_cond_init --reload_data 'fourier_cond_init,/checkpoint/fcharton/dumped/ddss_gen_fourier/ddss_fourier.prefix_counts.train,/checkpoint/fcharton/dumped/ddss_gen_fourier/ddss_fourier.prefix_counts.valid,/checkpoint/fcharton/dumped/ddss_gen_fourier/ddss_fourier.prefix_counts.test' --reload_size 40000000 --stopping_criterion 'valid_fourier_cond_init_acc,20' --validation_metrics valid_fourier_cond_init_acc --env_base_seed -1 --exp_name ddss_fourier

Evaluating trained models

To evaluate a trained model model.pth on a specific test set test.data, run the model with the same parameters as for training, setting --eval_only true, pointing --reload_model to your model (e.g. --reload_model /model_path/model.pth), and replacing the second file in --reload_data with your test data (e.g. --reload_data 'ode_control,/checkpoint/fcharton/dumped/ddss_gen_gram/ddss_gram.prefix_counts.train,/MYPATH/test.data,/checkpoint/fcharton/dumped/ddss_gen_gram/ddss_gram.prefix_counts.test.final'). Set --eval_size to the size of your dataset. At present, only the validation dataset is used for evaluation, but you can change this by toggling the comments on lines 367 and 368 of file evaluator.py.
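
For instance, evaluating the control-matrix model might look like the following sketch (model path, test file and experiment name are placeholders to adapt):

python train.py --eval_only true --reload_model '/model_path/model.pth' --dump_path '/checkpoint/fcharton/dumped' --exp_name ddss_gram_eval --env_name ode --tasks ode_control --reload_data 'ode_control,/checkpoint/fcharton/dumped/ddss_gen_gram/ddss_gram.prefix_counts.train,/MYPATH/test.data,/checkpoint/fcharton/dumped/ddss_gen_gram/ddss_gram.prefix_counts.test.final' --eval_size 10000 --batch_size_eval 128 --beam_eval 1 --emb_dim 512 --n_enc_layers 6 --n_dec_layers 6 --n_heads 8 --max_len 512 --min_degree 3 --max_degree 6 --predict_gramian true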

Citation

This code is released under a Creative Commons license; see the LICENCE file for more details. If you use this code, consider citing:

@misc{charton2021learning,
    title={Learning advanced mathematical computations from examples},
    author={François Charton and Amaury Hayat and Guillaume Lample},
    year={2021},
    eprint={2006.06462},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
