Simple, fast, and parallelized symbolic regression in Python/Julia via regularized evolution and simulated annealing

Last update: Jan 03, 2023

Overview

PySR

(pronounced like py as in python, and then sur as in surface)

Parallelized symbolic regression built on Julia, and interfaced by Python. Uses regularized evolution, simulated annealing, and gradient-free optimization.

Cite this software

Documentation

Check out SymbolicRegression.jl for the pure-Julia backend of this package.

Symbolic regression is a very interpretable machine learning algorithm for low-dimensional problems: these tools search equation space to find algebraic relations that approximate a dataset.

One can also extend these approaches to higher-dimensional spaces by using a neural network as proxy, as explained in 2006.11287, where we apply it to N-body problems. Here, one essentially uses symbolic regression to convert a neural net to an analytic equation. Thus, these tools simultaneously present an explicit and powerful way to interpret deep models.

Backstory:

Previously, we used Eureqa, which is a very efficient and user-friendly tool. However, Eureqa is GUI-only, doesn't allow for user-defined operators, has no distributed capabilities, and has become proprietary (and was recently merged into an online service). Thus, the goal of this package is to have an open-source symbolic regression tool as efficient as Eureqa, while also exposing a configurable Python interface.

Installation

PySR uses both Julia and Python, so you need to have both installed.

Install Julia - see downloads, and then instructions for mac and linux. (Don't use the conda-forge version; it doesn't seem to work properly.)

You can install PySR with:

pip install pysr

The first launch will automatically install the Julia packages required.

Quickstart

Here is some demo code (also found in example.py):

import numpy as np
from pysr import pysr, best

# Dataset
X = 2*np.random.randn(100, 5)
y = 2*np.cos(X[:, 3]) + X[:, 0]**2 - 2

# Learn equations
equations = pysr(X, y, niterations=5,
    binary_operators=["plus", "mult"],
    unary_operators=[
      "cos", "exp", "sin", #Pre-defined library of operators (see https://pysr.readthedocs.io/en/latest/docs/operators/)
      "inv(x) = 1/x"]) # Define your own operator! (Julia syntax)

# ... (you can press Ctrl-C to exit the search early)

print(best(equations))

which gives:

x0**2 + 2.000016*cos(x3) - 1.9999845

One can also use best_tex to get the LaTeX form, or best_callable to get a function you can call. This uses a score which balances complexity and error; however, one can see the full list of equations with:

print(equations)

This is a pandas table, with additional columns:

MSE - the mean square error of the formula
score - a metric akin to Occam's razor; you should use this to help select the "true" equation.
sympy_format - sympy equation.
lambda_format - a lambda function for that equation, that you can pass values through.

Comments

Add Support for Arbitrary Precision Arithmetic with BigFloat
Is your feature request related to a problem? Please describe.

I tried running 'pysr' on a 1,000-row array with 4 integer input variables and one integer output variable: a Goedel number.

From Mathematica:

GoedelNumber[l_List] := Times @@ MapIndexed[Prime[First[#2]]^#1 &, l]

E.g.

Data file:

# 7 1 5 8 6917761200000

julia> 2^7*3^1*5^5*7^8
6917761200000
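For readers who do not use Mathematica, the definition above can be sketched in Python. The helper names `primes` and `goedel_number` are made up for this illustration; the prime generator uses plain trial division, which is only reasonable for short exponent lists like the four-variable rows described here.

from itertools import count

def primes():
    """Yield 2, 3, 5, 7, ... using trial division against primes found so far."""
    found = []
    for n in count(2):
        if all(n % p for p in found):
            found.append(n)
            yield n

def goedel_number(exponents):
    """Multiply the i-th prime raised to the i-th exponent, as in the Mathematica definition."""
    result = 1
    for p, e in zip(primes(), exponents):
        result *= p ** e
    return result

# Same row as the data-file example: 2^7 * 3^1 * 5^5 * 7^8
print(goedel_number([7, 1, 5, 8]))  # 6917761200000

Python integers have arbitrary precision, so this reference computation never overflows, unlike a floating-point evaluation.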

The model returned:

Complexity  Loss  Score  Equation
1           Inf   NaN    0.22984365

I am just learning 'pysr' and maybe it's just 'user error'. However, Inf and NaN suggest that Goedel numbers may exceed Float64.

Describe the solution you'd like

Not sure what happened, because the largest Goedel number in the input is 1.6679880978201e+23.

Additional context

I didn't see any parameters to set 'verbose' mode or 'debugging' information.
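One possible explanation, offered only as a sketch: a value of that size is far below the Float64 limit, but if the search ran in 32-bit floats and used a squared-error loss (both are assumptions here, not confirmed by the report), then squaring a residual of that magnitude would overflow. The check below uses only the standard library; `fits` is a hypothetical helper.

import sys

FLOAT32_MAX = (2 - 2**-23) * 2.0**127   # largest finite IEEE 754 single, about 3.4e38
FLOAT64_MAX = sys.float_info.max        # largest finite double, about 1.8e308

largest_target = 1.6679880978201e23     # largest Goedel number quoted in the report

def fits(value, limit):
    """Return True if |value| is representable below the given finite limit."""
    return abs(value) <= limit

print(fits(largest_target, FLOAT32_MAX))      # the target itself fits in Float32
print(fits(largest_target**2, FLOAT32_MAX))   # a squared error of that size does not
print(fits(largest_target**2, FLOAT64_MAX))   # but it still fits in Float64

If this is what happened, rescaling the target (for example, fitting its logarithm) or running at higher precision would be natural things to try before reaching for BigFloat.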

GoedelTableFourParameters.txt
enhancement good first issue
opened by dbl001 35
[Windows] : Couldn't find equation file!

Hi Miles,

I installed PySR alongside Julia on Windows 10. It runs until it crashes with the following message:

File "C:\Users\Matthieu\anaconda3\lib\site-packages\pysr\sr.py", line 774, in get_hof
    raise RuntimeError("Couldn't find equation file! The equation search likely exited before a single iteration completed.")

RuntimeError: Couldn't find equation file! The equation search likely exited before a single iteration completed.

In the most recent run, it had reached 38% progress.

I should add that sometimes (not often) the process does complete.

What is the reason for this?

Also, is there a forum, or is this the right place to post?

I thank you for your help.

Regards

Magaud
bug

opened by Magaud59 27

JuliaError: Exception 'Unsatisfiable requirements detected for package DynamicExpressions / prior installation with conda

Describe the bug

JuliaError: Exception 'Unsatisfiable requirements detected for package DynamicExpressions [a40a106e]:
 DynamicExpressions [a40a106e] log:
 ├─DynamicExpressions [a40a106e] has no known versions!
 └─restricted to versions 0.4 by SymbolicRegression [8254be44] — no versions left
   └─SymbolicRegression [8254be44] log:
     ├─possible versions are: 0.14.4 or uninstalled
     └─SymbolicRegression [8254be44] is fixed to version 0.14.4' occurred while calling julia code:
Pkg.add([sr_spec, clustermanagers_spec], io=stderr)

Version (please include the following information): MacOS Ventura 13.0.1 (22A400)

Julia version [Run julia --version in the terminal]
julia --version
julia version 1.8.3
Python version [Run python --version in the terminal]
Python 3.8.13
Did you install with pip or conda?
pip

$ conda list pysr
# packages in environment at /Users/davidlaxer/anaconda3/envs/ai:
#
# Name                    Version                   Build  Channel
pysr                      0.11.11                  pypi_0    pypi

% pip show pysr
Name: pysr
Version: 0.11.11
Summary: Simple and efficient symbolic regression
Home-page: https://github.com/MilesCranmer/pysr
Author: Miles Cranmer
Author-email: [email protected]
License: 
Location: /Users/davidlaxer/anaconda3/envs/ai/lib/python3.8/site-packages
Requires: julia, numpy, pandas, scikit-learn, sympy
Required-by:

PySR version [Run python -c 'import pysr; print(pysr.__version__)']
0.9.1
Does the bug still appear with the latest version of PySR?

Configuration

What are your PySR settings?
What dataset are you running on?
If possible, please share a minimal code example that produces the error.

Error message Add the error message here, or whatever other information would be useful for debugging.

If the error is "Couldn't find equation file...", this error indicates something went wrong with the backend. Please scroll up and copy the output of Julia, rather than the output of python.

Additional context Add any other context about the problem here.

Julia Version 1.8.3
Commit 0434deb161e (2022-11-14 20:14 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin21.4.0)
  uname: Darwin 22.1.0 Darwin Kernel Version 22.1.0: Sun Oct  9 20:14:54 PDT 2022; root:xnu-8792.41.9~2/RELEASE_X86_64 x86_64 i386
  CPU: Intel(R) Core(TM) i7-10700K CPU @ 3.80GHz: 
                 speed         user         nice          sys         idle          irq
       #1-16  3800 MHz    7543546 s          0 s    3955434 s   72076495 s          0 s
  Memory: 128.0 GB (32470.4921875 MB free)
  Uptime: 951050.0 sec
  Load Avg:  8.20068359375  5.13525390625  4.3212890625
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, skylake)
  Threads: 1 on 16 virtual cores
Environment:
  JULIA_DEPOT_PATH_BACKUP = 
  JULIA_PROJECT_BACKUP = 
  JULIA_LOAD_PATH_BACKUP = 
  JULIA_DEPOT_PATH = /Users/davidlaxer/anaconda3/envs/ai/share/julia:
  JULIA_SSL_CA_ROOTS_PATH_BACKUP = 
  JULIA_SSL_CA_ROOTS_PATH = 
  JULIA_PROJECT = @pysr-0.11.11
  TERM = xterm-color
  PATH = /Users/davidlaxer/.opam/_coq-platform_.2021.02.1/bin:/Users/davidlaxer/.juliaup/bin:/Users/davidlaxer/.cabal/bin:/Users/davidlaxer/.ghcup/bin:/Users/davidlaxer/anaconda3/envs/ai/bin:/Users/davidlaxer/anaconda3/condabin:/opt/local/bin:/opt/local/sbin:/usr/local/bin:/System/Cryptexes/App/usr/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Library/Apple/usr/bin:/Users/davidlaxer/.cargo/bin:/Users/jetbrains/.local/bin
  XPC_FLAGS = 0x0
  HOME = /Users/davidlaxer
  JAVA_HOME = :-
  JAVA_LD_LIBRARY_PATH = :-
  CAML_LD_LIBRARY_PATH = /Users/davidlaxer/.opam/_coq-platform_.2021.02.1/lib/stublibs:/Users/davidlaxer/.opam/_coq-platform_.2021.02.1/lib/ocaml/stublibs:/Users/davidlaxer/.opam/_coq-platform_.2021.02.1/lib/ocaml
  OCAML_TOPLEVEL_PATH = /Users/davidlaxer/.opam/_coq-platform_.2021.02.1/lib/toplevel
  PKG_CONFIG_PATH = /Users/davidlaxer/.opam/_coq-platform_.2021.02.1/lib/pkgconfig:
  CONDA_BACKUP_FFLAGS = -march=nocona -mtune=core2 -ftree-vectorize -fPIC -fstack-protector -O2 -pipe
  CONDA_BACKUP_FORTRANFLAGS = -march=nocona -mtune=core2 -ftree-vectorize -fPIC -fstack-protector -O2 -pipe
  CONDA_BACKUP_DEBUG_FFLAGS = -march=nocona -mtune=core2 -ftree-vectorize -fPIC -fstack-protector -O2 -pipe
  CONDA_BACKUP_DEBUG_FORTRANFLAGS = -march=nocona -mtune=core2 -ftree-vectorize -fPIC -fstack-protector -O2 -pipe
[ Info: Julia version info
[ Info: Julia executable: /Users/davidlaxer/anaconda3/envs/ai/share/julia/juliaup/julia-1.8.3+0.x64/bin/julia
[ Info: Trying to import PyCall...
┌ Info: PyCall is already installed and compatible with Python executable.
│ 
│ PyCall:
│     python: /Users/davidlaxer/anaconda3/envs/ai/bin/python
│     libpython: /Users/davidlaxer/anaconda3/envs/ai/lib/libpython3.8.dylib
│ Python:
│     python: /Users/davidlaxer/anaconda3/envs/ai/bin/python
└     libpython: 
   Resolving package versions...
---------------------------------------------------------------------------
JuliaError                                Traceback (most recent call last)
Input In [5], in <cell line: 4>()
      1 get_ipython().system('export JULIA_SSL_CA_ROOTS_PATH=""')
      2 import pysr
----> 4 pysr.install()

File ~/anaconda3/envs/ai/lib/python3.8/site-packages/pysr/julia_helpers.py:87, in install(julia_project, quiet)
     83 io_arg = _get_io_arg(quiet)
     85 if is_shared:
     86     # Install SymbolicRegression.jl:
---> 87     _add_sr_to_julia_project(Main, io_arg)
     89 Main.eval("using Pkg")
     90 Main.eval(f"Pkg.instantiate({io_arg})")

File ~/anaconda3/envs/ai/lib/python3.8/site-packages/pysr/julia_helpers.py:240, in _add_sr_to_julia_project(Main, io_arg)
    230 Main.sr_spec = Main.PackageSpec(
    231     name="SymbolicRegression",
    232     url="https://github.com/MilesCranmer/SymbolicRegression.jl",
    233     rev="v" + __symbolic_regression_jl_version__,
    234 )
    235 Main.clustermanagers_spec = Main.PackageSpec(
    236     name="ClusterManagers",
    237     url="https://github.com/JuliaParallel/ClusterManagers.jl",
    238     rev="14e7302f068794099344d5d93f71979aaf4fbeb3",
    239 )
--> 240 Main.eval(f"Pkg.add([sr_spec, clustermanagers_spec], {io_arg})")

File ~/anaconda3/envs/ai/lib/python3.8/site-packages/julia/core.py:627, in Julia.eval(self, src)
    625 if src is None:
    626     return None
--> 627 ans = self._call(src)
    628 if not ans:
    629     return None

File ~/anaconda3/envs/ai/lib/python3.8/site-packages/julia/core.py:555, in Julia._call(self, src)
    553 # logger.debug("_call(%s)", src)
    554 ans = self.api.jl_eval_string(src.encode('utf-8'))
--> 555 self.check_exception(src)
    557 return ans

File ~/anaconda3/envs/ai/lib/python3.8/site-packages/julia/core.py:609, in Julia.check_exception(self, src)
    607 else:
    608     exception = sprint(showerror, self._as_pyobj(res))
--> 609 raise JuliaError(u'Exception \'{}\' occurred while calling julia code:\n{}'
    610                  .format(exception, src))

JuliaError: Exception 'Unsatisfiable requirements detected for package DynamicExpressions [a40a106e]:
 DynamicExpressions [a40a106e] log:
 ├─DynamicExpressions [a40a106e] has no known versions!
 └─restricted to versions 0.4 by SymbolicRegression [8254be44] — no versions left
   └─SymbolicRegression [8254be44] log:
     ├─possible versions are: 0.14.4 or uninstalled
     └─SymbolicRegression [8254be44] is fixed to version 0.14.4' occurred while calling julia code:
Pkg.add([sr_spec, clustermanagers_spec], io=stderr)

 % julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.8.3 (2022-11-14)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> using Pkg

julia> Pkg.add("DynamicExpressions")
ERROR: The following package names could not be resolved:
 * DynamicExpressions (not found in project, manifest or registry)
Stacktrace:
  [1] pkgerror(msg::String)
    @ Pkg.Types ~/anaconda3/envs/ai/share/julia/juliaup/julia-1.8.3+0.x64/share/julia/stdlib/v1.8/Pkg/src/Types.jl:67
  [2] ensure_resolved(ctx::Pkg.Types.Context, manifest::Pkg.Types.Manifest, pkgs::Vector{Pkg.Types.PackageSpec}; registry::Bool)
    @ Pkg.Types ~/anaconda3/envs/ai/share/julia/juliaup/julia-1.8.3+0.x64/share/julia/stdlib/v1.8/Pkg/src/Types.jl:952
  [3] add(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}; preserve::Pkg.Types.PreserveLevel, platform::Base.BinaryPlatforms.Platform, kwargs::Base.Pairs{Symbol, Base.TTY, Tuple{Symbol}, NamedTuple{(:io,), Tuple{Base.TTY}}})
    @ Pkg.API ~/anaconda3/envs/ai/share/julia/juliaup/julia-1.8.3+0.x64/share/julia/stdlib/v1.8/Pkg/src/API.jl:264
  [4] add(pkgs::Vector{Pkg.Types.PackageSpec}; io::Base.TTY, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ Pkg.API ~/anaconda3/envs/ai/share/julia/juliaup/julia-1.8.3+0.x64/share/julia/stdlib/v1.8/Pkg/src/API.jl:156
  [5] add(pkgs::Vector{Pkg.Types.PackageSpec})
    @ Pkg.API ~/anaconda3/envs/ai/share/julia/juliaup/julia-1.8.3+0.x64/share/julia/stdlib/v1.8/Pkg/src/API.jl:145
  [6] #add#27
    @ ~/anaconda3/envs/ai/share/julia/juliaup/julia-1.8.3+0.x64/share/julia/stdlib/v1.8/Pkg/src/API.jl:144 [inlined]
  [7] add
    @ ~/anaconda3/envs/ai/share/julia/juliaup/julia-1.8.3+0.x64/share/julia/stdlib/v1.8/Pkg/src/API.jl:144 [inlined]
  [8] #add#26
    @ ~/anaconda3/envs/ai/share/julia/juliaup/julia-1.8.3+0.x64/share/julia/stdlib/v1.8/Pkg/src/API.jl:143 [inlined]
  [9] add(pkg::String)
    @ Pkg.API ~/anaconda3/envs/ai/share/julia/juliaup/julia-1.8.3+0.x64/share/julia/stdlib/v1.8/Pkg/src/API.jl:143
 [10] top-level scope
    @ REPL[2]:1

julia>

The code works properly on Google CoLab.

bug

opened by dbl001 26

Refactor of PySRRegressor

Re Issue #143

Compatibility with scikit-learn should be improved.

Notable breaking changes for users: PySRRegressor.equations is now called PySRRegressor.equations_

Tests have been updated to allow compatibility with the refactored code but still assess the same functionality. All tests should pass.

Please let me know if there are any concerns or if you would like me to document/explain any of the changes in detail.

opened by tttc3 24
[BUG] conda version breaking
Edit: If you are seeing issues with the conda version, try updating PySR with conda update pysr. The new version fixes an issue related to automatic updating of Julia packages.

The conda-forge jobs which test conda install -c conda-forge pysr are currently breaking, even on repeated attempts: https://github.com/MilesCranmer/PySR/actions/workflows/CI_conda_forge.yml. The error:

ImportError: Required dependencies are not installed or built. Run the following code in the Python REPL:

I find this strange, since the underlying feedstock has not changed in the meantime, and it seems like the julia feedstock hasn't been updated recently either.

FYI @mkitti @ngam. I will try to look into this a bit later today.
bug
opened by MilesCranmer 23
[Errno 2] No such file or directory

I have installed pysr-0.6.12.post1 and have been trying to run example.py, but after working through the fixes from some previously closed bug reports, a FileNotFoundError occurs. I'm using Windows 10 and Python 3.7; the version of Julia is 1.6.2. The error message is the following.

FileNotFoundError: [Errno 2] No such file or directory: 'hall_of_fame_2021-08-04_230410.180.csv.bkup'
bug

opened by jzsmoreno 21
Performance speed-up options?
Hello Miles! Thank you for open-sourcing this powerful tool! I am working on including PySR in my own research, and running into some performance bottlenecks.

I found that regressing a simple equation (e.g. the quick-start example) takes roughly 2 minutes. Ideally, I am aiming to reduce that time to ~30 seconds. Would you give me some pointers on this? Meanwhile, I will try to break down the challenge into several pieces:

Activating a new environment at each API call: I noticed that a new Julia (?) environment is created each time I call the pysr() API (see terminal output below). Could we keep the environment up so we can skip this process for subsequent calls?

Running on julia -O3 /tmp/tmpe5qmgemh/runfile.jl
Activating environment at `~/anaconda3/envs/rw/lib/python3.7/site-packages/Project.toml`
Updating registry at `~/.julia/registries/General`
No Changes to `~/anaconda3/envs/rw/lib/python3.7/site-packages/Project.toml`
No Changes to `~/anaconda3/envs/rw/lib/python3.7/site-packages/Manifest.toml`
Activating environment on workers.
Activating environment at `~/anaconda3/envs/rw/lib/python3.7/site-packages/Project.toml`
Activating environment at `~/anaconda3/envs/rw/lib/python3.7/site-packages/Project.toml`
Activating Activating environment at `~/anaconda3/envs/rw/lib/python3.7/site-packages/Project.toml`
environment at `~/anaconda3/envs/rw/lib/python3.7/site-packages/Project.toml`
Activating Activating environment at `~/anaconda3/envs/rw/lib/python3.7/site-packages/Project.toml`
environment at `~/anaconda3/envs/rw/lib/python3.7/site-packages/Project.toml`
Importing installed module on workers...Finished!
Started!

If the above wouldn't work, then allowing y to be vector-valued (as mentioned in #35) would be a second-best option! Even better, if we could create a "batched" version of pysr(X, y) api pysr_batched(X, y), such that X and y are python lists, and we return the results in a list as well, so that we only generate one Julia script, and call os.system() once to keep the Julia environment up.

Multi-threading: I noticed that increasing procs from 4 to 8 resulted in a slightly longer running time. I am running on an 8-core, 16-thread CPU. Did I do something dumb?

I went into pysr/sr.py and added runtests=false flag in line 438 and 440. That saved ~20 seconds.
opened by yxie20 20

[Feature] LaTeX table generator

This generates a booktabs-style LaTeX table for a subset of equations. Here is an example:

import numpy as np
from pysr import PySRRegressor

X = 2 * np.random.randn(100, 5)
y = 2.5382 * np.cos(X[:, 3]) + X[:, 0] ** 2 - 0.5

model = PySRRegressor(
    niterations=80,
    binary_operators=["+", "*"],
    unary_operators=["cos"],
    model_selection="best",
    loss="loss(x, y) = (x - y)^2",  # Custom loss function (julia syntax)
    maxsize=11,
)

model.fit(X, y)

print(model.latex_table(precision=3, include_score=True))

The output of this is:

\begin{table}[h]
\begin{center}
\begin{tabular}{@{}lccc@{}}
\toprule
Equation & Complexity & Loss & Score \\
\midrule
$3.9$ & 1 & 38.9 & 0 \\
$x_{0}^{2}$ & 3 & 3.16 & 1.26 \\
$x_{0}^{2} - 0.257$ & 5 & 3.09 & 0.0105 \\
$x_{0}^{2} + \cos{\left(x_{3} \right)}$ & 6 & 1.26 & 0.898 \\
$x_{0}^{2} + 2.44 \cos{\left(x_{3} \right)}$ & 8 & 0.245 & 0.818 \\
$x_{0}^{2} + 2.54 \cos{\left(x_{3} \right)} - 0.5$ & 10 & 2.28e-13 & 13.9 \\
\bottomrule
\end{tabular}
\end{center}
\end{table}

Leaving include_score set to False will leave out the Score column. Precision can be adjusted to have more or less precise constants.
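The Score column in the table above appears consistent with the negative change in log-loss per unit of added complexity between consecutive rows. The sketch below recomputes it from the rounded losses printed in the table. The function name `occam_scores` is made up for this illustration, and the formula is an assumption inferred from the printed values rather than taken from PySR's source.

import math

def occam_scores(complexities, losses):
    """Score of row i: -(log(loss_i) - log(loss_{i-1})) / (complexity_i - complexity_{i-1})."""
    scores = [0.0]  # the first row has no predecessor, so its score is 0
    for i in range(1, len(losses)):
        d_complexity = complexities[i] - complexities[i - 1]
        d_log_loss = math.log(losses[i]) - math.log(losses[i - 1])
        scores.append(-d_log_loss / d_complexity)
    return scores

# Rounded values copied from the table above
complexities = [1, 3, 5, 6, 8, 10]
losses = [38.9, 3.16, 3.09, 1.26, 0.245, 2.28e-13]

for c, s in zip(complexities, occam_scores(complexities, losses)):
    print(c, round(s, 3))

Because the inputs are already rounded to three significant figures, the recomputed values only approximately match the printed column. A large score marks a row where a small increase in complexity bought a large drop in loss, which is why the final equation stands out.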

One can render only a subset of equations by using latex_table([1, 4]) which only includes the 1st and 4th equation in model.equations_.

Edit: it now renders the e-13 as \cdot 10^{-13}
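As a rough illustration of that conversion (not PySR's actual implementation; `to_latex_sci` is a hypothetical helper), a float can be formatted to a fixed number of significant digits and any exponent part rewritten as a power of ten:

def to_latex_sci(value: float, precision: int = 3) -> str:
    """Format value with `precision` significant digits, using \\cdot 10^{n} for exponents."""
    text = f"{value:.{precision}g}"
    if "e" not in text:
        return text  # plain decimal notation needs no rewriting
    mantissa, exponent = text.split("e")
    return f"{mantissa} \\cdot 10^{{{int(exponent)}}}"

print(to_latex_sci(2.28e-13))  # 2.28 \cdot 10^{-13}
print(to_latex_sci(38.9))      # 38.9

Converting the exponent with int() drops the leading zeros and the plus sign that Python's "g" format emits for large values.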

opened by MilesCranmer 19

Set JULIA_PROJECT, use Pkg.add once
Sets JULIA_PROJECT before loading pyjulia so that PyCall.jl can be contained within the pysr environment

Also use Pkg.add in a single step to add both SymbolicRegression.jl and ClusterManagers.jl to the environment at the same time

I likely advised against using the environment variable JULIA_PROJECT in the past. However, I think this may be necessary to avoid interference from other projects if installed within the same environment.
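The ordering constraint described here can be sketched as follows. The function name and the project name "@pysr-example" are placeholders chosen for this illustration; the only firm point is that the environment variable has to be set in the process before the Julia runtime is initialised.

import os

def configure_julia_project(project: str) -> None:
    """Set JULIA_PROJECT so that a Julia runtime started later in this process picks it up."""
    os.environ["JULIA_PROJECT"] = project

# Hypothetical shared-environment name; a real one would be derived from the package version.
configure_julia_project("@pysr-example")

# Only after this point would the Julia bridge be imported and initialised,
# so that packages such as PyCall.jl resolve inside that environment.
print(os.environ["JULIA_PROJECT"])

Setting the variable after the runtime has started would have no effect on the already-active project, which is why it needs to happen before the bridge is loaded.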
opened by mkitti 15
Windows support

Hi Miles,

first of all, this is awesome. Thanks so much for making this.

A student I'm working with is trying to run PySR under Windows. Is that in principle supported?

PySR's dependencies don't seem to have any issues with Windows, but pysr.pysr throws a FileNotFoundError when accessing /tmp/.hyperparams_{rand_string}.hl. This seems to be because of the different file system structure under Windows. If this is the only issue, how would you feel about using something like tempfile to generate temporary files in a more OS-independent way?
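A minimal sketch of the tempfile idea, assuming the goal is simply to write a generated file somewhere that exists on every platform. The helper name, the "pysr_" prefix, and the file name are hypothetical and not taken from the package:

import tempfile
from pathlib import Path

def write_temp_file(contents: str, filename: str = "hyperparams.jl") -> Path:
    """Create a fresh, uniquely named directory under the OS temp location and write a file in it."""
    tmpdir = Path(tempfile.mkdtemp(prefix="pysr_"))  # random suffix avoids name clashes
    path = tmpdir / filename
    path.write_text(contents)
    return path

path = write_temp_file("# generated settings\n")
print(path.exists())

tempfile.mkdtemp chooses the platform's temporary directory (honouring TMPDIR, TEMP, or TMP where set), so nothing depends on /tmp existing.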

I am happy to try this and open a PR once it works.

Cheers, Johann
implemented

opened by johannbrehmer 15
[Windows] Always returning the same equation?

I don't know if this is a Windows issue or what (I work on a Linux partition, but I just wanted to play around with this - I haven't actually done serious work on Windows for 7 years or so, so I'm at a loss), but after fitting one equation, it's always returning that equation. Even with different data, in a different notebook.

I've looked to see if I could find the julia file it creates - nope. And they're different files every time.

Any ideas?

opened by JQVeenstra 14
[BUG] Pickling error on use of ReLU
I see this error when I try to use the ReLU operator:

PicklingError: Can't pickle relu: attribute lookup relu on __main__ failed

It seems to be implemented in a way that can't be pickled. Should be an easy fix.
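For context, pickle stores functions by name and re-imports them on load, so any callable that cannot be found under its own name in its module fails with this kind of attribute-lookup error. The sketch below reproduces the general failure with a lambda and shows one common workaround, which is to pickle the operator's name and resolve it through a registry. The `OPERATORS` table and `can_pickle` helper are invented for this illustration and are not PySR's actual fix.

import pickle

# Hypothetical registry: operators are looked up by name when needed
OPERATORS = {"relu": lambda x: x if x > 0 else 0.0}

def can_pickle(obj) -> bool:
    """Return True if obj survives pickle.dumps, False if the name lookup fails."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        return False

print(can_pickle(OPERATORS["relu"]))  # the function object itself cannot be pickled
print(can_pickle("relu"))             # its name can

# Round-trip the name, then resolve it back to the callable
name = pickle.loads(pickle.dumps("relu"))
print(OPERATORS[name](-2.0), OPERATORS[name](3.0))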
bug
opened by MilesCranmer 1
[BUG] *Windows SystemError:
I have done a fresh installation on windows (with pip) and I am running the basic example provided in the Introduction. I am getting a JULIA error. Thanks in advance for any help!

Version:

Julia version [1.8.3]

Python version [3.10.6]

PySR version [0.11.11]

Error message

C:\tools\Anaconda3\envs\env_ai\lib\site-packages\pysr\sr.py:1257: UserWarning: Note: it looks like you are running in Jupyter. The progress bar will be turned off.
  warnings.warn(
Traceback (most recent call last):

  File "C:\tools\Anaconda3\envs\env_ai\lib\site-packages\spyder_kernels\py3compat.py", line 356, in compat_exec
    exec(code, globals, locals)

  File "c:\users\gorth\untitled0.py", line 25, in <module>
    model.fit(X, y)

  File "C:\tools\Anaconda3\envs\env_ai\lib\site-packages\pysr\sr.py", line 1792, in fit
    self._run(X, y, mutated_params, weights=weights, seed=seed)

  File "C:\tools\Anaconda3\envs\env_ai\lib\site-packages\pysr\sr.py", line 1652, in run
    self.raw_julia_state = SymbolicRegression.EquationSearch(

SystemError: <PyCall.jlwrap (in a Julia function called from Python)
JULIA: SystemError: opening file "hall_of_fame_2022-12-17_011150.694.csv": Invalid argument
Stacktrace:
 [1] systemerror(p::String, errno::Int32; extrainfo::Nothing) @ Base .\error.jl:176
 [2] #systemerror#80 @ .\error.jl:175 [inlined]
 [3] systemerror @ .\error.jl:175 [inlined]
 [4] open(fname::String; lock::Bool, read::Nothing, write::Nothing, create::Nothing, truncate::Bool, append::Nothing) @ Base .\iostream.jl:293
 [5] open(fname::String, mode::String; lock::Bool) @ Base .\iostream.jl:356
 [6] open(fname::String, mode::String) @ Base .\iostream.jl:355
 [7] open(::SymbolicRegression.var"#48#77"{Options{typeof(loss), Int64, 0.86, 10}, Vector{PopMember{Float32}}, SymbolicRegression.CoreModule.DatasetModule.Dataset{Float32}}, ::String, ::Vararg{String}; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}) @ Base .\io.jl:382
 [8] open @ .\io.jl:381 [inlined]
 [9] EquationSearch(::SymbolicRegression.CoreModule.ProgramConstantsModule.SRThreaded, datasets::Vector{SymbolicRegression.CoreModule.DatasetModule.Dataset{Float32}}; niterations::Int64, options::Options{typeof(loss), Int64, 0.86, 10}, numprocs::Nothing, procs::Nothing, addprocs_function::Nothing, runtests::Bool, saved_state::Nothing) @ SymbolicRegression C:\Users\gorth.julia\packages\SymbolicRegression\37l4B\src\SymbolicRegression.jl:751
 [10] EquationSearch(datasets::Vector{SymbolicRegression.CoreModule.DatasetModule.Dataset{Float32}}; niterations::Int64, options::Options{typeof(loss), Int64, 0.86, 10}, parallelism::String, numprocs::Nothing, procs::Nothing, addprocs_function::Nothing, runtests::Bool, saved_state::Nothing) @ SymbolicRegression C:\Users\gorth.julia\packages\SymbolicRegression\37l4B\src\SymbolicRegression.jl:383
 [11] EquationSearch(X::Matrix{Float32}, y::Matrix{Float32}; niterations::Int64, weights::Nothing, varMap::Vector{String}, options::Options{typeof(loss), Int64, 0.86, 10}, parallelism::String, numprocs::Nothing, procs::Nothing, addprocs_function::Nothing, runtests::Bool, saved_state::Nothing, multithreaded::Nothing) @ SymbolicRegression C:\Users\gorth.julia\packages\SymbolicRegression\37l4B\src\SymbolicRegression.jl:320
 [12] #EquationSearch#21 @ C:\Users\gorth.julia\packages\SymbolicRegression\37l4B\src\SymbolicRegression.jl:345 [inlined]
 [13] invokelatest(::Any, ::Any, ::Vararg{Any}; kwargs::Base.Pairs{Symbol, Any, NTuple{8, Symbol}, NamedTuple{(:weights, :niterations, :varMap, :options, :numprocs, :parallelism, :saved_state, :addprocs_function), Tuple{Nothing, Int64, Vector{String}, Options{typeof(loss), Int64, 0.86, 10}, Nothing, String, Nothing, Nothing}}}) @ Base .\essentials.jl:731
 [14] pyjlwrap_call(f::Function, args::Ptr{PyCall.PyObject_struct}, kw::Ptr{PyCall.PyObject_struct}) @ PyCall C:\Users\gorth.julia\packages\PyCall\ygXW2\src\callback.jl:32
 [15] pyjlwrap_call(self_::Ptr{PyCall.PyObject_struct}, args_::Ptr{PyCall.PyObject_struct}, kw_::Ptr{PyCall.PyObject_struct}) @ PyCall C:\Users\gorth.julia\packages\PyCall\ygXW2\src\callback.jl:44>
bug
opened by trifinos 13

Repeated CI failures on Windows

Many of the Windows tests are now failing with various segmentation faults, which appear to be randomly triggered:

Nightly action: https://github.com/MilesCranmer/PySR/actions/workflows/CI_large_nightly.yml

PR action: https://github.com/MilesCranmer/PySR/pull/237

They seem to occur more frequently on older versions of Julia, and rarely on Julia 1.8.3. Regardless, a segfault anywhere is cause for concern and should be tracked down.

The errors include:

Early segmentation fault (Julia 1.6.7) at first run, segfault during noise test (Julia 1.6.7 and others), as well as segfaults during warm start test.

e.g., Windows:

D:\a\_temp\221410f9-8bf7-4099-901d-eb9813d86c45.sh: line 1: 1098 Segmentation fault python -m pysr.test main Started!

also occurs on Ubuntu sometimes:

signal (11): Segmentation fault in expression starting at none:0 unknown function (ip: 0x7fd6a19bc215) unknown function (ip: 0x7fd6a19947ff) macro expansion at /home/runner/.julia/packages/PyCall/ygXW2/src/exception.jl:95 [inlined] convert at /home/runner/.julia/packages/PyCall/ygXW2/src/conversions.jl:94 pyjlwrap_getattr at /home/runner/.julia/packages/PyCall/ygXW2/src/pytype.jl:378 unknown function (ip: 0x7fd68d30b1bd) unknown function (ip: 0x7fd6a19babda) unknown function (ip: 0x7fd6a198e9d4) pyisinstance at /home/runner/.julia/packages/PyCall/ygXW2/src/PyCall.jl:170 [inlined] pysequence_query at /home/runner/.julia/packages/PyCall/ygXW2/src/conversions.jl:752 pytype_query at /home/runner/.julia/packages/PyCall/ygXW2/src/conversions.jl:773 pytype_query at /home/runner/.julia/packages/PyCall/ygXW2/src/conversions.jl:806 [inlined] convert at /home/runner/.julia/packages/PyCall/ygXW2/src/conversions.jl:831 julia_kwarg at /home/runner/.julia/packages/PyCall/ygXW2/src/callback.jl:19 [inlined] #57 at ./none:0 [inlined] iterate at ./generator.jl:47 [inlined] collect_to! at ./array.jl:728 unknown function (ip: 0x7fd68d341d9a) _jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2237 [inlined] jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2419 collect_to! at ./array.jl:736 unknown function (ip: 0x7fd68d33e35a) _jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2237 [inlined] jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2419 collect_to! at ./array.jl:736 collect_to_with_first! 
at ./array.jl:706 unknown function (ip: 0x7fd68d33d775) _jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2237 [inlined] jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2419 collect at ./array.jl:687 unknown function (ip: 0x7fd68d33afb4) _jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2237 [inlined] jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2419 _pyjlwrap_call at /home/runner/.julia/packages/PyCall/ygXW2/src/callback.jl:31 unknown function (ip: 0x7fd68d3348d5) _jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2237 [inlined] jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2419 pyjlwrap_call at /home/runner/.julia/packages/PyCall/ygXW2/src/callback.jl:44 unknown function (ip: 0x7fd68d30aeee) unknown function (ip: 0x7fd6a19980c7) _PyObject_VectorcallTstate at /home/runner/work/_temp/SourceCode/./Include/cpython/abstract.h:116 [inlined] _PyObject_VectorcallTstate at /home/runner/work/_temp/SourceCode/./Include/cpython/abstract.h:103 [inlined] PyObject_Vectorcall at /home/runner/work/_temp/SourceCode/./Include/cpython/abstract.h:127 [inlined] call_function at /home/runner/work/_temp/SourceCode/Python/ceval.c:5077 [inlined] _PyEval_EvalFrameDefault at /home/runner/work/_temp/SourceCode/Python/ceval.c:3537 unknown function (ip: 0x7fd6a19ebbb7) _PyFunction_Vectorcall at /home/runner/work/_temp/SourceCode/Objects/call.c:396 unknown function (ip: 0x7fd6a199a1e0) unknown function (ip: 0x7fd6a19ed97b) unknown function (ip: 0x7fd6a19ebbb7) _PyFunction_Vectorcall at /home/runner/work/_temp/SourceCode/Objects/call.c:396 unknown function (ip: 0x7fd6a19ecdf6) unknown function (ip: 0x7fd6a1998972) unknown function (ip: 0x7fd6a199a1e0) unknown function (ip: 0x7fd6a19ecb12) unknown function (ip: 0x7fd6a1998972) unknown function (ip: 0x7fd6a19ecdf6) unknown function (ip: 0x7fd6a19ebbb7) _PyFunction_Vectorcall at /home/runner/work/_temp/SourceCode/Objects/call.c:396 
unknown function (ip: 0x7fd6a199a28d) unknown function (ip: 0x7fd6a19ef9b1) unknown function (ip: 0x7fd6a19ebbb7) unknown function (ip: 0x7fd6a1997d4c) unknown function (ip: 0x7fd6a1998f2b) unknown function (ip: 0x7fd6a1a46421) unknown function (ip: 0x7fd6a199802f) _PyObject_VectorcallTstate at /home/runner/work/_temp/SourceCode/./Include/cpython/abstract.h:116 [inlined] _PyObject_VectorcallTstate at /home/runner/work/_temp/SourceCode/./Include/cpython/abstract.h:103 [inlined] PyObject_Vectorcall at /home/runner/work/_temp/SourceCode/./Include/cpython/abstract.h:127 [inlined] call_function at /home/runner/work/_temp/SourceCode/Python/ceval.c:5077 [inlined] _PyEval_EvalFrameDefault at /home/runner/work/_temp/SourceCode/Python/ceval.c:3520 unknown function (ip: 0x7fd6a19ebbb7) _PyFunction_Vectorcall at /home/runner/work/_temp/SourceCode/Objects/call.c:396 unknown function (ip: 0x7fd6a199a28d) unknown function (ip: 0x7fd6a19ef9b1) unknown function (ip: 0x7fd6a19ebbb7) unknown function (ip: 0x7fd6a1997d4c) unknown function (ip: 0x7fd6a1998f2b) unknown function (ip: 0x7fd6a1a46421) unknown function (ip: 0x7fd6a199802f) _PyObject_VectorcallTstate at /home/runner/work/_temp/SourceCode/./Include/cpython/abstract.h:116 [inlined] _PyObject_VectorcallTstate at /home/runner/work/_temp/SourceCode/./Include/cpython/abstract.h:103 [inlined] PyObject_Vectorcall at /home/runner/work/_temp/SourceCode/./Include/cpython/abstract.h:127 [inlined] call_function at /home/runner/work/_temp/SourceCode/Python/ceval.c:5077 [inlined] _PyEval_EvalFrameDefault at /home/runner/work/_temp/SourceCode/Python/ceval.c:3520 unknown function (ip: 0x7fd6a1998972) unknown function (ip: 0x7fd6a19ecdf6) unknown function (ip: 0x7fd6a1998972) unknown function (ip: 0x7fd6a19ecb12) unknown function (ip: 0x7fd6a19ebbb7) _PyEval_EvalCodeWithName at /home/runner/work/_temp/SourceCode/Python/ceval.c:4361 unknown function (ip: 0x7fd6a19eb876) PyEval_EvalCode at /home/runner/work/_temp/SourceCode/Python/ceval.c:828 
unknown function (ip: 0x7fd6a1a6399f) cfunction_vectorcall_FASTCALL at /home/runner/work/_temp/SourceCode/Objects/methodobject.c:430 unknown function (ip: 0x7fd6a19ecb12) unknown function (ip: 0x7fd6a19ebbb7) _PyFunction_Vectorcall at /home/runner/work/_temp/SourceCode/Objects/call.c:396 unknown function (ip: 0x7fd6a19ecb12) unknown function (ip: 0x7fd6a19ebbb7) _PyFunction_Vectorcall at /home/runner/work/_temp/SourceCode/Objects/call.c:396 unknown function (ip: 0x7fd6a1a7fdd6) unknown function (ip: 0x7fd6a1a7faae) Py_BytesMain at /home/runner/work/_temp/SourceCode/Modules/main.c:731 unknown function (ip: 0x7fd6a1642d8f) __libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line) _start at python (unknown line) Allocations: 185387713 (Pool: 185351460; Big: 36253); GC: 470 /home/runner/work/_temp/bdd49862-48fd-4e82-bed8-685329606248.sh: line 1: 2324 Segmentation fault (core dumped) python -m pysr.test main

Git errors: (Julia 1.8.2)

PyCall is installed and built successfully.
Cloning git-repo `https://github.com/MilesCranmer/SymbolicRegression.jl`
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/runner/work/PySR/PySR/pysr/julia_helpers.py", line 87, in install
    _add_sr_to_julia_project(Main, io_arg)
  File "/Users/runner/work/PySR/PySR/pysr/julia_helpers.py", line 240, in _add_sr_to_julia_project
    Main.eval(f"Pkg.add([sr_spec, clustermanagers_spec], {io_arg})")
  File "/Users/runner/hostedtoolcache/Python/3.9.14/x64/lib/python3.9/site-packages/julia/core.py", line 627, in eval
    ans = self._call(src)
  File "/Users/runner/hostedtoolcache/Python/3.9.14/x64/lib/python3.9/site-packages/julia/core.py", line 555, in _call
    self.check_exception(src)
  File "/Users/runner/hostedtoolcache/Python/3.9.14/x64/lib/python3.9/site-packages/julia/core.py", line 609, in check_exception
    raise JuliaError(u'Exception \'{}\' occurred while calling julia code:\n{}'
julia.core.JuliaError: Exception 'failed to clone from https://github.com/MilesCranmer/SymbolicRegression.jl, error: GitError(Code:ERROR, Class:Net, SecureTransport error: connection closed via error)' occurred while calling julia code:
Pkg.add([sr_spec, clustermanagers_spec], io=stderr)

Access errors during scikit-learn tests (these ones don't even fail the CI, which is a bit worrisome)

e.g.,

Failed check_fit2d_predict1d with:
Traceback (most recent call last):
  File "D:\a\PySR\PySR\pysr\test\test.py", line 671, in test_scikit_learn_compatibility
    check(model)
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\sklearn\utils\_testing.py", line 188, in wrapper
    return fn(*args, **kwargs)
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\sklearn\utils\estimator_checks.py", line 1300, in check_fit2d_predict1d
    estimator.fit(X, y)
  File "D:\a\PySR\PySR\pysr\sr.py", line 1792, in fit
    self._run(X, y, mutated_params, weights=weights, seed=seed)
  File "D:\a\PySR\PySR\pysr\sr.py", line 1493, in _run
    Main = init_julia(self.julia_project, julia_kwargs=julia_kwargs)
  File "D:\a\PySR\PySR\pysr\julia_helpers.py", line 180, in init_julia
    Julia(**julia_kwargs)
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\julia\core.py", line 519, in __init__
    self._call("const PyCall = Base.require({0})".format(PYCALL_PKGID))
  File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\julia\core.py", line 554, in _call
    ans = self.api.jl_eval_string(src.encode('utf-8'))
OSError: exception: access violation reading 0x000001BC1C501000

Torch errors.

One other curious thing is that this error is raised on some Windows tests (https://github.com/MilesCranmer/PySR/actions/runs/3664894286/jobs/6195713513). But, this should not take place...

Run python -m pysr.test torch
D:\a\PySR\PySR\pysr\julia_helpers.py:139: UserWarning: `torch` was loaded before the Julia instance started. This may cause a segfault when running `PySRRegressor.fit`. To avoid this, please run `pysr.julia_helpers.init_julia()` *before* importing `torch`. For updates, see https://github.com/pytorch/pytorch/issues/78829
  warnings.warn(
D:\a\_temp\8727c9f4-d0f6-4345-84e6-e774762771ab.sh: line 1: 258 Segmentation fault python -m pysr.test torch
Started!

opened by MilesCranmer 11
Raise warning on statically-linked Python binaries

Time-to-first-search is very slow on statically-linked versions of Python (such as packaged with conda), as precompiled code cannot be used, so things are compiled from scratch. I think this adds some friction to the user experience, so this PR introduces a warning that recommends the user try pyenv if startup time is important.

When https://github.com/JuliaPy/pyjulia/issues/496 is solved, this warning is no longer needed.

See https://github.com/conda-forge/python-feedstock/issues/222 for the discussion on the conda page.
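
A minimal sketch of what such a check could look like. It assumes that a value of 0 for CPython's Py_ENABLE_SHARED build variable indicates a statically-linked interpreter on POSIX platforms. The function name warn_if_static_python and the message wording are hypothetical, and this is not necessarily how the PR implements it:

```python
import sysconfig
import warnings


def warn_if_static_python(enable_shared=None):
    """Warn when the interpreter appears to be statically linked to libpython.

    `enable_shared` defaults to this interpreter's Py_ENABLE_SHARED build
    variable; passing it explicitly makes the check easy to test.
    Returns True if a warning was issued.
    """
    if enable_shared is None:
        enable_shared = sysconfig.get_config_var("Py_ENABLE_SHARED")
    # The variable is typically unset (None) on Windows, so only an
    # explicit 0 is treated as "statically linked" in this sketch.
    if enable_shared == 0:
        warnings.warn(
            "This Python binary appears to be statically linked, so "
            "precompiled Julia code cannot be reused and startup will be "
            "slow. Consider a pyenv-built Python if startup time matters.",
            UserWarning,
        )
        return True
    return False
```

Calling warn_if_static_python() with no arguments inspects the running interpreter; the explicit argument exists only so the behavior can be exercised on any machine.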

opened by MilesCranmer 3
[Feature] Install with CLI
Right now you install SymbolicRegression.jl using python -c 'import pysr; pysr.install()'. However, this is a bit of spooky action at a distance, because you can't quite be sure which pysr is actually being called. Thus, it would be great if there was a CLI, similar to how testing is done with python -m pysr.test main. For example:

python -m pysr.install

If anybody wants to add this, I'd be more than happy to accept a PR!
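
One possible shape for such an entry point, sketched with argparse. The main function, its two flags, and the injected install_fn are hypothetical; in a real pysr/install.py the callable would be pysr.install, and the keyword arguments forwarded to it here are an assumption:

```python
import argparse


def main(argv=None, install_fn=None):
    """Parse CLI flags and forward them to an installer callable."""
    parser = argparse.ArgumentParser(
        prog="python -m pysr.install",
        description="Install the Julia backend for PySR.",
    )
    parser.add_argument("--project", default=None,
                        help="Julia project to install into (hypothetical flag)")
    parser.add_argument("--quiet", action="store_true",
                        help="suppress installer output (hypothetical flag)")
    args = parser.parse_args(argv)

    if install_fn is None:
        # Imported lazily so the parser can be exercised without PySR installed.
        from pysr import install as install_fn

    # Assumption: the installer accepts these two keyword arguments.
    return install_fn(julia_project=args.project, quiet=args.quiet)
```

Placed in a module with a standard __main__ guard that calls main(), this would make the install unambiguous about which pysr is used, since python -m resolves the module with the same interpreter that later imports it.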
enhancement
opened by MilesCranmer 0

Releases(v0.11.11)

v0.11.11(Nov 22, 2022)
What's Changed

Make Julia startup options configurable; set optimize=3 by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/228

Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.11.10...v0.11.11
v0.11.10(Nov 21, 2022)
What's Changed

Clean up dockerfile by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/223

Update backend version with improved resource monitoring by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/227

Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.11.9...v0.11.10
v0.11.9(Nov 5, 2022)
What's Changed

Refactor testing suite to have CLI by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/221

Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.11.8...v0.11.9
v0.11.8(Nov 4, 2022)
What's Changed

Fix PyCall not giving traceback by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/218

Fixed safe operators; make progress bar print to stderr by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/219

Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.11.7...v0.11.8
v0.11.7(Nov 4, 2022)
What's Changed

Expand nightly conda-forge tests to other Python versions by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/212

Clean up parameter groupings in docs by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/214

Add optimization-as-mutation, and adaptive parsimony by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/217

Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.11.6...v0.11.7
v0.11.6(Oct 31, 2022)
What's Changed

Speed up evaluation with turbo parameter by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/208

https://user-images.githubusercontent.com/7593028/199054602-7ad19e87-19ff-4440-aa09-da6d7b6175d5.mp4

Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.11.5...v0.11.6
v0.11.5(Oct 24, 2022)
What's Changed

30-50% Faster evaluation, and perform explicit version assertion for backend by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/205

Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.11.4...v0.11.5
v0.11.4(Oct 10, 2022)
What's Changed

Fix conda forge installs by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/202

Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.11.3...v0.11.4
v0.11.3(Oct 6, 2022)
What's Changed

Faster evaluation for constant sub-expressions (SymbolicRegression.jl#129)

Will now check variable names for spaces and other non-alphanumeric characters, aside from underscores. Before this would only raise an issue after a search, when trying to pickle the saved data.
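
A rough illustration of this kind of up-front check, assuming "alphanumeric" means ASCII letters and digits. The function name check_variable_names and the error message are hypothetical and are not taken from the PySR source:

```python
import re

# Letters, digits, and underscores only (ASCII is an assumption of this sketch).
_VALID_NAME = re.compile(r"[A-Za-z0-9_]+")


def check_variable_names(variable_names):
    """Raise ValueError on the first name containing a disallowed character."""
    for name in variable_names:
        if not _VALID_NAME.fullmatch(name):
            raise ValueError(
                f"Invalid variable name {name!r}: only alphanumeric "
                "characters and underscores are allowed."
            )
    return list(variable_names)
```

Running the check before the search starts means a bad name such as "x 0" fails immediately rather than after a long run, when the results are pickled.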

Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.11.2...v0.11.3
v0.11.2(Sep 28, 2022)

(Fix for conda-forge build)

Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.11.1...v0.11.2
v0.11.1-1(Sep 26, 2022)
What's Changed

Added Customization page in the docs for tweaking the backend's loss function and constraints.

Adding two entries to papers.yml by @JayWadekar in https://github.com/MilesCranmer/PySR/pull/192

Explicitly deprecate Julia <= 1.5 by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/194

Allow custom shared projects for julia_project by @MilesCranmer @mkitti in https://github.com/MilesCranmer/PySR/pull/197

e.g., this would allow you to run with @my-project and it will set up a shared Julia project under my-project (in the environments dir)

New Contributors

@JayWadekar made their first contribution in https://github.com/MilesCranmer/PySR/pull/192

Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.11.0...v0.11.1-1
v0.11.0(Sep 11, 2022)
What's Changed

Update backend https://github.com/MilesCranmer/PySR/pull/191

Includes high-precision constants when precision=64

Enables datasets with zero variance (to allow fitting a constant)

Changes, e.g., abs(x)^y to x^y, with expressions avoided altogether for invalid input. This is because the former would sometimes give weird functional forms by exploiting the cusp at x=0. Thanks to @johanbluecreek.

Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.10.4...v0.11.0
v0.10.4-1(Sep 8, 2022)
What's Changed

Fix install for Julia <=1.6 by @MilesCranmer @mkitti in https://github.com/MilesCranmer/PySR/pull/188

PyJulia will now launch directly into the shared pysr-{version} environment, rather than activating it later.

Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.10.3...v0.10.4
v0.10.3(Sep 6, 2022)
What's Changed

Displays a warning message when PyTorch is imported before PyJulia starts. See https://github.com/pytorch/pytorch/issues/78829. The only current solution is to start Julia beforehand.
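
A sketch of how such an import-order check can be written, assuming it is enough to look for torch among the already-loaded modules. The function name warn_if_torch_loaded and its parameters are hypothetical:

```python
import sys
import warnings


def warn_if_torch_loaded(julia_started, loaded_modules=None):
    """Warn if `torch` was imported before the Julia runtime was started.

    `loaded_modules` defaults to sys.modules; it is a parameter only so the
    check can be tested without importing torch. Returns True if it warned.
    """
    if loaded_modules is None:
        loaded_modules = sys.modules
    if not julia_started and "torch" in loaded_modules:
        warnings.warn(
            "`torch` was loaded before the Julia instance started. "
            "This may cause a segfault; start Julia before importing `torch`.",
            UserWarning,
        )
        return True
    return False
```

The check only reports the problem. As noted above, the actual workaround is to start Julia first and import torch afterwards.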

New docs, built with Material-Mkdocs!

v0.10.2(Sep 6, 2022)
What's Changed

Set JULIA_PROJECT, use Pkg.add once by @mkitti in https://github.com/MilesCranmer/PySR/pull/186

Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.10.1...v0.10.2
v0.10.1(Sep 6, 2022)

Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.10.0...v0.10.1
v0.10.0(Aug 14, 2022)
What's Changed

Easy loading from auto-generated checkpoint files by @MilesCranmer w/ review @tttc3 @Pablo-Lemos in https://github.com/MilesCranmer/PySR/pull/167

Use .from_file to load from the auto-generated .pkl file.

LaTeX table generator by @MilesCranmer w/ review @tttc3 @kazewong in https://github.com/MilesCranmer/PySR/pull/156

Generate a LaTeX table of discovered equations with .latex_table()

Improved default model selection strategy by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/177

Old strategy is available as model_selection="score"

Add opencontainers image-spec to Dockerfile by @SauravMaheshkar w/ review @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/166

Switch to comma-based csv format by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/176

Bug fixes

Fixed conversions to torch and JAX when a rational number appears in the sympy expression (https://github.com/MilesCranmer/PySR/commit/17c9b1a1762efbd8e021d275491f75cc6dcea8f1, https://github.com/MilesCranmer/PySR/commit/f119733698e4517e34cc902c78dcb95d450c0c80)

Fixed pickle saving when trained with multi-output (https://github.com/MilesCranmer/PySR/commit/3da0df512ee295f446ceb0ae6e2c39fb0e380618)

Fixed pickle saving when using custom operators with defined sympy -> jax/torch/numpy mappings

Backend fix avoids use of Julia's cp which is buggy for some file systems (e.g., EOS)

New Contributors

@SauravMaheshkar made their first contribution in https://github.com/MilesCranmer/PySR/pull/166

Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.9.0...v0.10.0
v0.9.0(Jun 4, 2022)
What's Changed

Refactor of PySRRegressor by @tttc3 in https://github.com/MilesCranmer/PySR/pull/146

PySRRegressor is now completely compatible with scikit-learn.

PySRRegressor can be stored in a pickle file, even after fitting, and then be reloaded and used with .predict()

PySRRegressor.equations -> PySRRegressor.equations_

New Contributors

@tttc3 made their first contribution in https://github.com/MilesCranmer/PySR/pull/146

Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.8.7...v0.9.0
v0.8.5(May 20, 2022)
What's Changed

Custom complexities for operators, constants, and variables (https://github.com/MilesCranmer/PySR/pull/138)

Early stopping conditions (https://github.com/MilesCranmer/PySR/pull/134)

Based on a certain loss value being achieved

Max number of evaluations (for theoretical studies of genetic algorithms, rather than anything practical).

Work with a specified expression rather than the one given by model_selection, by passing index to the function you wish to use (e.g., model.predict(X, index=5) would use the equation at index 5).

Full Changelog since v0.8.1: https://github.com/MilesCranmer/PySR/compare/v0.8.1...v0.8.5
v0.8.1(May 8, 2022)
What's Changed

Enable distributed processing with ClusterManagers.jl from https://github.com/MilesCranmer/PySR/pull/133

Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.8.0...v0.8.1
v0.8.0(May 8, 2022)
This new release updates the entire set of default PySR parameters according to the ones presented in https://github.com/MilesCranmer/PySR/discussions/115. These parameters have been tuned over nearly 71,000 trials. See the discussion for further info.

Additional changes:

Nested constraints implemented. For example, you can now prevent sin and cos from being repeatedly nested, by using the argument: nested_constraints={"sin": {"sin": 0, "cos": 0}, "cos": {"sin": 0, "cos": 0}}. This argument states that within a sin operator, you can only have a max depth of 0 for other sin or cos. The same is done for cos. The argument nested_constraints={"^": {"+": 2, "*": 1, "^": 0}} states that within a pow operator, you can only have 2 things added, or 1 use of multiplication (i.e., no double products), and zero other pow operators. This helps a lot with finding interpretable expressions!

New parsimony algorithm (backend change). This seems to help searches quite a bit, especially when one is searching for more complex expressions. This is turned on by use_frequency_in_tournament which is now the default.

Many backend improvements: speed, bug fixes, etc.

Improved stability of multi-processing (backend change). Thanks to @CharFox1.

Auto-differentiation implemented (backend change). This isn't used by default in any instances right now, but could be used by optimization later. Thanks to @kazewong.

Improved testing coverage of weird edge cases.

All parameters to PySRRegressor have been cleaned up to be in snake_case rather than CamelCase. The backend is also now almost entirely snake_case for internal functions, along with other readability improvements. Thanks to @bstollnitz and @patrick-kidger for the suggestions.
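
To make the nested_constraints semantics described above concrete, here is a small self-contained checker. It is only an illustration, not the backend's implementation: expressions are modeled as nested tuples, and the limit is interpreted as the maximum number of the inner operator found along any path beneath the outer operator. Both the representation and the helper names are assumptions of this sketch.

```python
# Expressions are nested tuples: (operator, child, ...).
# Leaves are variable names or numbers.

def max_nesting(node, op):
    """Largest number of `op` nodes on any path from `node` down to a leaf."""
    if not isinstance(node, tuple):
        return 0
    here = 1 if node[0] == op else 0
    return here + max((max_nesting(c, op) for c in node[1:]), default=0)


def satisfies(node, nested_constraints):
    """True if every operator in the tree respects its nesting limits."""
    if not isinstance(node, tuple):
        return True
    limits = nested_constraints.get(node[0], {})
    for inner, limit in limits.items():
        # Only the children count: the outer operator itself is excluded.
        depth = max((max_nesting(c, inner) for c in node[1:]), default=0)
        if depth > limit:
            return False
    return all(satisfies(c, nested_constraints) for c in node[1:])


trig = {"sin": {"sin": 0, "cos": 0}, "cos": {"sin": 0, "cos": 0}}
allowed = ("+", ("sin", "x0"), ("cos", "x1"))    # sin(x0) + cos(x1)
rejected = ("sin", ("+", ("cos", "x0"), "x1"))   # sin(cos(x0) + x1)

power = {"^": {"+": 2, "*": 1, "^": 0}}
double_product = ("^", ("*", ("*", "x0", "x1"), "x2"), 2)  # (x0*x1*x2)^2
```

Under these assumptions, satisfies(allowed, trig) holds, while rejected fails because a cos sits inside a sin, and double_product fails the power constraint because two multiplications are nested inside the pow.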

v0.6.0(Jun 1, 2021)
PySR Version 0.6.0

Large changes:

Exports to JAX, PyTorch, NumPy. All exports have a similar interface. JAX and PyTorch allow the equation parameters to be trained (e.g., as part of some differentiable model). Read https://pysr.readthedocs.io/en/latest/docs/options/#callable-exports-numpy-pytorch-jax for details. Thanks Patrick Kidger for the PyTorch export.

Multi-output y input is allowed, and the backend will efficiently batch over each output. A list of dataframes is returned by pysr for these cases. All best_* functions return a list as well.

BFGS optimizer introduced + more stable parameter search due to backtracking line search.

Smaller changes since 0.5.16:

Expanded tests, coverage calculation for PySR

Improved (pre-processing) feature selection with random forest

New default parameters for search:

annealing=False (no annealing works better with the new code. This is equivalent to alpha=infinity)

useFrequency=True (deals with complexity in a smarter way)

npopulations=20 (previously procs*4)

progress=True (show a progress bar)

optimizer_algorithm="BFGS"

optimizer_iterations=10

optimize_probability=1

binary_operators default = ["+", "-", "/", "*"]

unary_operators default = []

Warnings:

Using maxsize > 40 will trigger a warning that the search will be slow and use a lot of memory. The warning suggests turning off useFrequency, and perhaps also using warmupMaxsizeBy.

Deprecated nrestarts -> optimizer_nrestarts

Printing fixed in Jupyter

v0.4.0(Feb 1, 2021)

With versions v0.4.0/v0.4.0, SymbolicRegression.jl and PySR have now been completely disentangled: PySR is 100% Python code (with some Julia meta-programming), and SymbolicRegression.jl is 100% Julia code.

PySR now works by activating a Julia env that has SymbolicRegression.jl as a dependency, and making calls to it! By default it will set up a Julia project inside the pip install location, and install requirements at the user's confirmation, though you can pass an arbitrary project directory as well (e.g., if you want to use PySR but also tweak the backend). The nice thing about this is that Python users only need to install a Julia binary somewhere, and they should be good to go. And Julia users never need to touch the Python side.

The SymbolicRegression.jl backend also sets up workers automatically & internally now, so one never needs to call @everywhere when setting things up. The same is true even with locally-defined functions - these get passed to workers!

With PySR importing the latest Julia code, this also means it gets new simplification routines powered by SymbolicUtils.jl, which seem to help improve the equations discovered.
v0.3.8(Sep 27, 2020)

Populations don't block each other, which gives a large speedup especially for large numbers of populations. This was fixed by using RemoteChannel() in Julia.

Some populations happen to take longer than others - perhaps they have very complex equations - and can therefore block others that have finished early. This lets the processor work on the next population to be finished.
v0.3.5(Sep 27, 2020)

Uses equation from Cranmer et al. (2020) https://arxiv.org/abs/2006.11287 to score equations, and prints this alongside MSE. This makes symbolic regression more robust to noise.
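
A short sketch of such a score, under the assumption that it is the negative change in log(MSE) per unit of added complexity between consecutive equations on the accuracy/complexity frontier. The function name scores and the convention of giving the first equation a score of 0 are choices made for this example:

```python
import math


def scores(complexities, mses):
    """Score each equation against the previous, simpler one.

    Inputs are parallel lists sorted by strictly increasing complexity,
    with every MSE strictly positive. The first equation has no
    predecessor and is assigned 0.0 here.
    """
    out = [0.0]
    for i in range(1, len(mses)):
        d_log_mse = math.log(mses[i]) - math.log(mses[i - 1])
        d_complexity = complexities[i] - complexities[i - 1]
        out.append(-d_log_mse / d_complexity)
    return out
```

A high score marks an equation that cuts the error sharply for little extra complexity. Extra terms that only fit noise lower the MSE slightly and so earn a score near zero, which is why the score is less sensitive to noise than raw MSE.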
v0.2(Sep 21, 2020)


Owner

Miles Cranmer
