Python wrapper of LSODA (solving ODEs) which can be called from within numba functions.

Overview

numbalsoda

numbalsoda is a python wrapper to the LSODA method in ODEPACK, which is for solving ordinary differential equation initial value problems. LSODA was originally written in Fortran. numbalsoda is a wrapper to a C++ re-write of the original code: https://github.com/dilawar/libsoda

numbalsoda also wraps the dop853 explicit Runge-Kutta method from this repository: https://github.com/jacobwilliams/dop853

This package is very similar to scipy.integrate.solve_ivp (see here), when you set method = 'LSODA' or method = DOP853. But, scipy.integrate.solve_ivp invokes the python interpreter every time step which can be slow. Also, scipy.integrate.solve_ivp can not be used within numba jit-compiled python functions. In contrast, numbalsoda never invokes the python interpreter during integration and can be used within a numba compiled function which makes numbalsoda a lot faster than scipy for most problems, and achieves similar performance to Julia's DifferentialEquations.jl in some cases (see benchmark folder).

Installation

Conda:

conda install -c conda-forge numbalsoda

Pip:

python -m pip install numbalsoda

Basic usage

from numbalsoda import lsoda_sig, lsoda, dop853
from numba import njit, cfunc
import numpy as np

@cfunc(lsoda_sig)
def rhs(t, u, du, p):
    du[0] = u[0]-u[0]*u[1]
    du[1] = u[0]*u[1]-u[1]*p[0]

funcptr = rhs.address # address to ODE function
u0 = np.array([5.,0.8]) # Initial conditions
data = np.array([1.0]) # data you want to pass to rhs (data == p in the rhs).
t_eval = np.linspace(0.0,50.0,1000) # times to evaluate solution

# integrate with lsoda method
usol, success = lsoda(funcptr, u0, t_eval, data = data)

# integrate with dop853 method
usol1, success1 = dop853(funcptr, u0, t_eval, data = data)

# usol = solution
# success = True/False

The variables u, du and p in the rhs function are pointers to an array of floats. Therefore, operations like np.sum(u) or len(u) will not work. However, you can use the function nb.carray() to make a numpy array out of the pointers. For example:

import numba as nb

@cfunc(lsoda_sig)
def rhs(t, u, du, p):
    u_ = nb.carray(u, (2,))
    p_ = nb.carray(p, (1,))
    # ... rest of rhs goes here using u_ and p_

Above, u_ and p_ are numpy arrays build out of u and p, and so functions like np.sum(u_) will work.

Also, note lsoda can be called within a jit-compiled numba function (see below). This makes it much faster than scipy if a program involves many integrations in a row.

@njit
def test():
    usol, success = lsoda(funcptr, u0, t_eval, data = data)
    return usol
usol = test() # this works!

@njit
def test_sp():
    sol = solve_ivp(f_scipy, t_span, u0, t_eval = t_eval, method='LSODA')
    return sol
sol = test_sp() # this does not work :(

Passing data to the right-hand-side function

In the examples shown above, we passed a an single array of floats to the right-hand-side function:

# ...
data = np.array([1.0])
usol, success = lsoda(funcptr, u0, t_eval, data = data)

However, sometimes you might want to pass more data types than just floats. For example, you might want to pass several integers, an array of floats, and an array of integers. One way to achieve this is with generating the cfunc using a function like this:

def make_lsoda_func(param1, param2, param3):
    @cfunc(lsoda_sig)
    def rhs(t, x, du, p):
        # Here param1, param2, and param3
        # can be accessed.
        du[0] = param1*t
        # etc...
    return rhs
    
rhs = make_lsoda_func(10.0, 5, 10000)
funcptr = rhs.address
# etc...

The only drawback of this approach is if you want to do many successive integrations where the parameters change because it would required re-compiling the cfunc between each integration. This could be slow.

But! It is possible to pass arbitrary parameters without re-compiling the cfunc, but it is a little tricky. The notebook passing_data_to_rhs_function.ipynb gives an example that explains how.

Comments
  • Installation with pip fails

    Installation with pip fails

    Good evening,

    I am working on a project involving the long-time integration of an ODE, and the results you show definitely look very promising, but I cannot install the package ...

    I am working on mac os X v12.4 with a M1 chip. Python v3.10.8, pip v22.3.1, setuptools v65.5.1.

    To really test only the install of numbalsoda and remove all possible side effects from pre-installed packages, I put myself in a new virtualenv. πŸ”΄ I am getting the following error when doing pip install numbalsoda:

    pip install numbalsoda
    Collecting numbalsoda
      Downloading numbalsoda-0.3.4.tar.gz (241 kB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 241.3/241.3 kB 2.8 MB/s eta 0:00:00
      Installing build dependencies ... done
      Getting requirements to build wheel ... done
      Preparing metadata (pyproject.toml) ... done
    Collecting numpy
      Downloading numpy-1.23.5-cp310-cp310-macosx_11_0_arm64.whl (13.4 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 13.4/13.4 MB 9.5 MB/s eta 0:00:00
    Collecting numba
      Downloading numba-0.56.4-cp310-cp310-macosx_11_0_arm64.whl (2.4 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.4/2.4 MB 9.4 MB/s eta 0:00:00
    Requirement already satisfied: setuptools in ./py3env/lib/python3.10/site-packages (from numba->numbalsoda) (65.5.1)
    Collecting llvmlite<0.40,>=0.39.0dev0
      Downloading llvmlite-0.39.1-cp310-cp310-macosx_11_0_arm64.whl (23.1 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 23.1/23.1 MB 10.3 MB/s eta 0:00:00
    Building wheels for collected packages: numbalsoda
      Building wheel for numbalsoda (pyproject.toml) ... error
      error: subprocess-exited-with-error
      
      Γ— Building wheel for numbalsoda (pyproject.toml) did not run successfully.
      β”‚ exit code: 1
      ╰─> [73 lines of output]
          /private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-build-env-tb3aezox/overlay/lib/python3.10/site-packages/setuptools/dist.py:770: UserWarning: Usage of dash-separated 'description-file' will not be supported in future versions. Please use the underscore name 'description_file' instead
            warnings.warn(
          /private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-build-env-tb3aezox/overlay/lib/python3.10/site-packages/setuptools/config/setupcfg.py:508: SetuptoolsDeprecationWarning: The license_file parameter is deprecated, use license_files instead.
            warnings.warn(msg, warning_class)
          
          
          --------------------------------------------------------------------------------
          -- Trying "Ninja" generator
          --------------------------------
          ---------------------------
          ----------------------
          -----------------
          ------------
          -------
          --
          Not searching for unused variables given on the command line.
          -- The C compiler identification is AppleClang 13.1.6.13160021
          -- Detecting C compiler ABI info
          -- Detecting C compiler ABI info - done
          -- Check for working C compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang - skipped
          -- Detecting C compile features
          -- Detecting C compile features - done
          -- The CXX compiler identification is AppleClang 13.1.6.13160021
          -- Detecting CXX compiler ABI info
          -- Detecting CXX compiler ABI info - done
          -- Check for working CXX compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang++ - skipped
          -- Detecting CXX compile features
          -- Detecting CXX compile features - done
          -- Configuring done
          -- Generating done
          -- Build files have been written to: /private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-install-s6p_6_f6/numbalsoda_aa6a60982d73407590bdf13f5c02eb57/_cmake_test_compile/build
          --
          -------
          ------------
          -----------------
          ----------------------
          ---------------------------
          --------------------------------
          -- Trying "Ninja" generator - success
          --------------------------------------------------------------------------------
          
          Configuring Project
            Working directory:
              /private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-install-s6p_6_f6/numbalsoda_aa6a60982d73407590bdf13f5c02eb57/_skbuild/macosx-12.0-arm64-3.10/cmake-build
            Command:
              cmake /private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-install-s6p_6_f6/numbalsoda_aa6a60982d73407590bdf13f5c02eb57 -G Ninja -DCMAKE_INSTALL_PREFIX:PATH=/private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-install-s6p_6_f6/numbalsoda_aa6a60982d73407590bdf13f5c02eb57/_skbuild/macosx-12.0-arm64-3.10/cmake-install -DPYTHON_VERSION_STRING:STRING=3.10.8 -DSKBUILD:INTERNAL=TRUE -DCMAKE_MODULE_PATH:PATH=/private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-build-env-tb3aezox/overlay/lib/python3.10/site-packages/skbuild/resources/cmake -DPYTHON_EXECUTABLE:PATH=/Users/tbridel/Documents/1-CODES/py3env/bin/python -DPYTHON_INCLUDE_DIR:PATH=/opt/homebrew/opt/[email protected]/Frameworks/Python.framework/Versions/3.10/include/python3.10 -DPYTHON_LIBRARY:PATH=/opt/homebrew/opt/[email protected]/Frameworks/Python.framework/Versions/3.10/lib/libpython3.10.dylib -DPython_EXECUTABLE:PATH=/Users/tbridel/Documents/1-CODES/py3env/bin/python -DPython_ROOT_DIR:PATH=/Users/tbridel/Documents/1-CODES/py3env -DPython_INCLUDE_DIR:PATH=/opt/homebrew/opt/[email protected]/Frameworks/Python.framework/Versions/3.10/include/python3.10 -DPython_FIND_REGISTRY:STRING=NEVER -DPython3_EXECUTABLE:PATH=/Users/tbridel/Documents/1-CODES/py3env/bin/python -DPython3_ROOT_DIR:PATH=/Users/tbridel/Documents/1-CODES/py3env -DPython3_INCLUDE_DIR:PATH=/opt/homebrew/opt/[email protected]/Frameworks/Python.framework/Versions/3.10/include/python3.10 -DPython3_FIND_REGISTRY:STRING=NEVER -DCMAKE_MAKE_PROGRAM:FILEPATH=/private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-build-env-tb3aezox/overlay/lib/python3.10/site-packages/ninja/data/bin/ninja -DSKBUILD=ON -DCMAKE_BUILD_TYPE:STRING=Release -DCMAKE_OSX_DEPLOYMENT_TARGET:STRING=12.0 -DCMAKE_OSX_ARCHITECTURES:STRING=arm64
          
          CMake Error at /private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-build-env-tb3aezox/overlay/lib/python3.10/site-packages/cmake/data/share/cmake-3.25/Modules/CMakeDetermineFortranCompiler.cmake:33 (message):
            Could not find compiler set in environment variable FC:
          
            /usr/local/bin/gfortran.
          Call Stack (most recent call first):
            CMakeLists.txt:3 (project)
          
          
          CMake Error: CMAKE_Fortran_COMPILER not set, after EnableLanguage
          CMake Error: CMAKE_CXX_COMPILER not set, after EnableLanguage
          -- Configuring incomplete, errors occurred!
          See also "/private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-install-s6p_6_f6/numbalsoda_aa6a60982d73407590bdf13f5c02eb57/_skbuild/macosx-12.0-arm64-3.10/cmake-build/CMakeFiles/CMakeOutput.log".
          Traceback (most recent call last):
            File "/private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-build-env-tb3aezox/overlay/lib/python3.10/site-packages/skbuild/setuptools_wrap.py", line 632, in setup
              env = cmkr.configure(
            File "/private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-build-env-tb3aezox/overlay/lib/python3.10/site-packages/skbuild/cmaker.py", line 334, in configure
              raise SKBuildError(
          
          An error occurred while configuring with CMake.
            Command:
              cmake /private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-install-s6p_6_f6/numbalsoda_aa6a60982d73407590bdf13f5c02eb57 -G Ninja -DCMAKE_INSTALL_PREFIX:PATH=/private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-install-s6p_6_f6/numbalsoda_aa6a60982d73407590bdf13f5c02eb57/_skbuild/macosx-12.0-arm64-3.10/cmake-install -DPYTHON_VERSION_STRING:STRING=3.10.8 -DSKBUILD:INTERNAL=TRUE -DCMAKE_MODULE_PATH:PATH=/private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-build-env-tb3aezox/overlay/lib/python3.10/site-packages/skbuild/resources/cmake -DPYTHON_EXECUTABLE:PATH=/Users/tbridel/Documents/1-CODES/py3env/bin/python -DPYTHON_INCLUDE_DIR:PATH=/opt/homebrew/opt/[email protected]/Frameworks/Python.framework/Versions/3.10/include/python3.10 -DPYTHON_LIBRARY:PATH=/opt/homebrew/opt/[email protected]/Frameworks/Python.framework/Versions/3.10/lib/libpython3.10.dylib -DPython_EXECUTABLE:PATH=/Users/tbridel/Documents/1-CODES/py3env/bin/python -DPython_ROOT_DIR:PATH=/Users/tbridel/Documents/1-CODES/py3env -DPython_INCLUDE_DIR:PATH=/opt/homebrew/opt/[email protected]/Frameworks/Python.framework/Versions/3.10/include/python3.10 -DPython_FIND_REGISTRY:STRING=NEVER -DPython3_EXECUTABLE:PATH=/Users/tbridel/Documents/1-CODES/py3env/bin/python -DPython3_ROOT_DIR:PATH=/Users/tbridel/Documents/1-CODES/py3env -DPython3_INCLUDE_DIR:PATH=/opt/homebrew/opt/[email protected]/Frameworks/Python.framework/Versions/3.10/include/python3.10 -DPython3_FIND_REGISTRY:STRING=NEVER -DCMAKE_MAKE_PROGRAM:FILEPATH=/private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-build-env-tb3aezox/overlay/lib/python3.10/site-packages/ninja/data/bin/ninja -DSKBUILD=ON -DCMAKE_BUILD_TYPE:STRING=Release -DCMAKE_OSX_DEPLOYMENT_TARGET:STRING=12.0 -DCMAKE_OSX_ARCHITECTURES:STRING=arm64
            Source directory:
              /private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-install-s6p_6_f6/numbalsoda_aa6a60982d73407590bdf13f5c02eb57
            Working directory:
              /private/var/folders/8_/vb42yyg57lj90txl41w_m4ph0000gn/T/pip-install-s6p_6_f6/numbalsoda_aa6a60982d73407590bdf13f5c02eb57/_skbuild/macosx-12.0-arm64-3.10/cmake-build
          Please see CMake's output for more information.
          [end of output]
      
      note: This error originates from a subprocess, and is likely not a problem with pip.
      ERROR: Failed building wheel for numbalsoda
    Failed to build numbalsoda
    ERROR: Could not build wheels for numbalsoda, which is required to install pyproject.toml-based projects
    

    πŸ”΄ This error also occurs with numbalsoda v0.3.3.

    🟒 The pip install runs fine for numbalsoda v0.2.1 however.

    Could you please advise @Nicholaswogan ?

    Thank you very much in advance ! Best, T. Bridel-Bertomeu

    opened by tbridel 5
  • [lsoda] 500 steps taken before reaching tout

    [lsoda] 500 steps taken before reaching tout

    I am running a simulation and if I understand correctly the maximum number of steps (500) is performed before the integration finishes. Is there a way to change this parameter?

    Thanks

    opened by nmourdo 5
  • Can I use 2d-array of 'u0'?

    Can I use 2d-array of 'u0'?

    I want to use the 2d-shape initial values of the multiple-particles system. But, I think I cannot use the slicing or indexing.

    def swing_lsoda(network):
        """
        rhs = swing_cover(network)
        funcptr = rhs.address
        ...
        data = np.array([1.0])
        usol, success = lsoda(funcptr, u0, t_eval, data = data)
        """
        @nb.cfunc(lsoda_sig) # network
        def rhs(t, u, du, p): # p0=m, p1=gamma, p2=P, p3=K
            # u is 2d-array 
            Interaction = p[3] * SinIntE(network, u[0])
            du[0] = u[1] #Thetas of nodes
            du[1] = 1/p[0]*(p[2] - p[1]*u[1] - Interaction) #Omega 
    

    How can I try it?

    opened by kimyoungjin06 4
  • Support for other integrators

    Support for other integrators

    Hi, thanks for this great package!

    I know this package is only supposed to deal with the lsoda integrator, but I was wondering if you could at least add support for an explicit integrator as well, maybe rk23. The reason I ask is that I want to solve a very large system of ODEs (>10^6), and because of that, implicit integrators are not really suitable (since they require a matrix solve Ax=b at each step), so I'm stuck with scipy's rk23 or rk45, which are fine, except that because the integration involves too many time steps, I think solve_ivp's explicit while loop is becoming the bottle-neck.

    opened by ma-sadeghi 4
  • Numba cache and iterative solving

    Numba cache and iterative solving

    Hi,

    I am trying to use NumbaLSODA to solve many systems iteratively. However, it seems that I can't out the function in the numba cache and so I have a big overhead at each iteration. I was wondering if you have an advice about it.

    This is my code. First I put all numba related function inside a file

    numba_func.py

    NOPYTHON = True
    CACHE = True
    PARALLEL = True
    
    @nb.cfunc(lsoda_sig, nopython=NOPYTHON, cache=CACHE)
    def rhs(x, i, di, p):
        di[0] = ...
        di[1] = ...
        di[2] = ...
    
    funcptr = rhs.address
    
    @nb.njit('(float64)(float64, float64, float64, int16)', nopython=NOPYTHON, parallel=PARALLEL, cache=CACHE)
    def solve(a, b, c, funcptr):
    
        w0 = np.array([a, b, ...], dtype=np.float64)
        p  = np.array([c], dtype=np.float64)
    
        x = np.linspace(0, 100, 500)
    
        usol, success = lsoda(funcptr, w0, x, data=p)
        
        return usol[-1][1]
    
    

    Then I use another file to solve the systems one after the others

    
    from numba_func import solve, funcptr
    
    gs = []
    for a, b, c in zip(as, bs, cs):
        gs = np.append(gs, solve(a, b, c, funcptr))
    

    I got the following warning:

    NumbaWarning: Cannot cache compiled function "solve" as it uses dynamic globals (such as ctypes pointers and large global arrays)
    

    I guess the idea is to pass the variable funcptr correctly so that numba is happy but so far I failed...

    opened by edumur 4
  • Strange results when t_eval has repeat values

    Strange results when t_eval has repeat values

    Thanks for the great package!

    Summary: I'm trying to switch over from using scipy's odeint to NumbaLSODA as it seems much better! Sometimes the t_eval array I supply has repeated times at the end and this produces some strange output. This was okay with odeint though, would it be possible to allow it for NumbaLSODA too? I think it is trying to integrate an infinitely small timestep rather than just evaluating the same timestep multiple times.

    Details: I'm trying to add this to LEGWORK so that we can speed up how we evolve the orbits of binary stars due to gravitational wave emission. However, if the evolution gets too close to the merger then LSODA produces a bunch of warnings about the timesteps being too small (since the derivatives get so large). To avoid this, I set any timesteps that are too close to the merger equal to the last allowable timestep - this leads to repeated values and the weird results that I see. (I don't just trim the array since I do this in a 2D array with a collection of binaries and so each individual binary might have a different length array which would wouldn't work in a 2D array I think.)

    MWE: Evolve binary evolution due to gravitational wave emission using Peters (1964)

    from numba import cfunc
    import numba as nb
    import numpy as np
    from NumbaLSODA import lsoda_sig, lsoda
    
    @cfunc(lsoda_sig)
    def de_dt(t, e, de, data):
        e_ = nb.carray(e, (1,))
        beta, c_0 = nb.carray(data, (2,))
        de[0] = -19 / 12 * beta / c_0**4 * (e_[0]**(-29.0 / 19.0) * (1 - e_[0]**2)**(3.0/2.0)) \
            / (1 + (121./304.) * e_[0]**2)**(1181./2299.)
    
    ecc = 0.5
    beta = 2.47e22
    c_0 = 1.677e10
    t_merge = 1.81e17
    
    funcptr = de_dt.address
    u0 = np.array([ecc])
    data = np.array([beta, c_0])
    t_eval = np.array([0.0, 0.1, 0.2, 0.5, 0.8, 0.9, 0.9, 0.9]) * t_merge
    
    ecc_evol, _ = lsoda(funcptr, u0, t_eval=t_eval, data=data)
    print(ecc_evol)
    

    This outputs

    [[ 5.00000000e-01]
     [ 4.83244343e-01]
     [ 4.64840518e-01]
     [ 3.95592532e-01]
     [ 2.82404708e-01]
     [ 2.15245963e-01]
     [-1.36311572e+57]
     [-1.36311572e+57]]
    

    So you can see that as soon as the timesteps repeat, you get this hugely small number (and eccentricity should be between 0 and 1)

    opened by TomWagg 3
  • Cannot import dop853 from numbalsoda

    Cannot import dop853 from numbalsoda

    Hi! I've just installed your library and got this error when running the code shown in the readme. ImportError: cannot import name 'dop853' from 'numbalsoda' Could I be missing a library? Thanks in advance

    opened by samirop 2
  • do you have a tip how to workaround the global variable?

    do you have a tip how to workaround the global variable?

    Hi,

    thanks for the great code, it's very fast.

    I'm struggling though with a need for a global variable, or maybe an argument that is updated by the ODE itself

    @nb.cfunc(lsoda_sig)
    def particle_ode(t, y_, dydt, p_):
        y = nb.carray(y_,(3,))
        p = nb.carray(p_,(11,))
        rho1, rho, nu , Cd, g, rhop, d, zu, zl, Vc0, trec = p
       
        global tzl # <---- here 
    
                               
        zp   = y[0]                # particle position      [m]
        V    = y[1]                # particle velocity      [m/s]
        tzl =  y[2] 
        Cam = 0.5                  # added mass coefficient [-]
    
        # determine Vc
        if (y[0]<= zl):
            Vc = Vc0
            tzl = t     # < ----- here I need to adjust this as we move through with y[0]
        else:
            Vc = Vc0 * np.exp(-1*((t-tzl)/trec))
        FS = (rho(zp) - rho1)*Vc*g
    
        # force balance on the particle 
        dzpdt = V
        dVdt  = ( (rhop-rho(zp)) * Vp(d) * g \
                - 0.5 *Cd(V*d/nu(zp))* rho(zp) * Ap(d) * np.abs(V) * V \
                - FS \
                ) / (rhop*Vp(d) + Cam*rho(zp)*Vp(d))
    
        dydt = [dzpdt, dVdt, 0.0 ]
    funcptr = particle_ode.address
    
    opened by alexlib 2
  • Unexpected behavior of NumbaLSODA and nb.cufnc in parallel mode

    Unexpected behavior of NumbaLSODA and nb.cufnc in parallel mode

    Hi There, Thanks for sharing this package. I am seeing up to 20x speedup in my code. As I am simulating the evolution of multiple initial conditions, I want to speed up the code further by using parallel=True flag in Numba. I adapted the example you provided in Stackoverflow for my code to use NumbaLSODA with parallel=True flag in Numba. However, I observed that the speedup was marginal even with 8 cores . In some cases, my parallel code was actually slower (!).

    After debugging on the 2 state example using the code below, I narrowed the reason down to the way the parameters in the rhs function are defined. if the parameters are either global variables or passed using the data argument in lsoda, the parallel = True speeds up the code almost linearly versus the number of cores. However, if the parameters are defined locally as in f_local below, the speed up is marginal. In fact, the parallel version using f_local is slower than the series version using f_global or f_param.

    I thought that local declarations of variables is a good coding practice and it speeds up code execution, but this does not seem to be the case with Numba and NumbaLSODA. I do not know if this behavior is caused by cfunc in Numba or NumbaLSODA. It was definitely unexpected and I thought I will bring it to your notice. Do you know the reason for this behavior? Are local constants not compiled just as global constants? I am unable to dig deeper into the reason and was wondering if you could help. Best Amar

    from NumbaLSODA import lsoda_sig, lsoda
    from matplotlib import pyplot as plt
    import numpy as np
    import numba as nb
    import time
    
    a_glob=np.array([1.5,1.5])
    
    @nb.cfunc(lsoda_sig)
    def f_global(t, u_, du, p): # variable a is global
        u = nb.carray(u_, (2,))
        du[0] = u[0]-u[0]*u[1]*a_glob[0]
        du[1] = u[0]*u[1]-u[1]*a_glob[1]
    
        
    @nb.cfunc(lsoda_sig)
    def f_local(t, u_, du, p): # variable a is local
        u = nb.carray(u_, (2,))
        a = np.array([1.5,1.5]) 
        du[0] = u[0]-u[0]*u[1]*a[0]
        du[1] = u[0]*u[1]-u[1]*a[1]
    
        
    @nb.cfunc(lsoda_sig)
    def f_param(t, u_, du, p): # pass in a as a paameter
        u = nb.carray(u_, (2,))
        du[0] = u[0]-u[0]*u[1]*p[0]
        du[1] = u[0]*u[1]-u[1]*p[1]
    
    funcptr_glob = f_global.address
    funcptr_local = f_local.address
    funcptr_param = f_param.address
    t_eval = np.linspace(0.0,20.0,201)
    np.random.seed(0)
    a = np.array([1.5,1.5])
    
    @nb.njit(parallel=True)
    def main(n, flag):
    #     a = np.array([1.5,1.5])
        u1 = np.empty((n,len(t_eval)), np.float64)
        u2 = np.empty((n,len(t_eval)), np.float64)
        for i in nb.prange(n):
            u0 = np.empty((2,), np.float64)
            u0[0] = np.random.uniform(4.5,5.5)
            u0[1] = np.random.uniform(0.7,0.9)
            if flag ==1: # global
                usol, success = lsoda(funcptr_glob, u0, t_eval, rtol = 1e-8, atol = 1e-8)
            if flag ==2: # local
                usol, success = lsoda(funcptr_local, u0, t_eval, rtol = 1e-8, atol = 1e-8)
            if flag ==3: # param
                usol, success = lsoda(funcptr_param, u0, t_eval, data = a, rtol = 1e-8, atol = 1e-8)
            u1[i] = usol[:,0]
            u2[i] = usol[:,1]
        return u1, u2
    
    @nb.njit(parallel=False)
    def main_series(n, flag): # same function as above but with parallel flag = False
    #     a = np.array([1.5,1.5])
    u1 = np.empty((n,len(t_eval)), np.float64)
    u2 = np.empty((n,len(t_eval)), np.float64)
    for i in nb.prange(n):
        u0 = np.empty((2,), np.float64)
        u0[0] = np.random.uniform(4.5,5.5)
        u0[1] = np.random.uniform(0.7,0.9)
        if flag ==1: # global
            usol, success = lsoda(funcptr_glob, u0, t_eval, rtol = 1e-8, atol = 1e-8)
        if flag ==2: # local
            usol, success = lsoda(funcptr_local, u0, t_eval, rtol = 1e-8, atol = 1e-8)
        if flag ==3: # param
            usol, success = lsoda(funcptr_param, u0, t_eval, data = a, rtol = 1e-8, atol = 1e-8)
        u1[i] = usol[:,0]
        u2[i] = usol[:,1]
    return u1, u2
    
    n = 100
    # calling first time for JIT compiling
    u1, u2 = main(n,1)
    u1, u2 = main(n,2)
    u1, u2 = main(n,3)
    
    u1, u2 = main_series(n,1)
    u1, u2 = main_series(n,1)
    u1, u2 = main_series(n,1)
    
    # Running code for large number 
    n = 10000
    a1 = time.time()
    u1, u2 = main(n,1) # global
    a2 = time.time()
    print(a2 - a1) # this is fast
    
    
    a1 = time.time()
    u1, u2 = main(n,2) # local
    a2 = time.time()
    print(a2 - a1) # this is slow
    
    a1 = time.time()
    u1, u2 = main(n,3) # param
    a2 = time.time()
    print(a2 - a1) # this is fast and almost identical performance as global
    
    a1 = time.time()
    u1, u2 = main_series(n,1) # global
    a2 = time.time()
    print(a2 - a1) # this is faster than local + parallel
    
    a1 = time.time()
    u1, u2 = main_series(n,2) # local
    a2 = time.time()
    print(a2 - a1) # this is slow
    
    a1 = time.time()
    u1, u2 = main_series(n,3) # param
    a2 = time.time()
    print(a2 - a1) # this is fast and almost identical performance as global
    
    opened by amar-iastate 2
  • Easier passing data to rhs

    Easier passing data to rhs

    Thanks for the package, some 400x faster than plain Python for the system I'm looking at. I just wanted to point out that Numba functions close over their local scope so passing arbitrary (constant?) data can be much easier than in your example. Mine looks like this:

    def make_lsoda_func(c,R,dstar,E,F,N):
        @cfunc(lsoda_sig)
        def rhs(t, x, du, p):
            codim3(t,x,c,R,dstar,E,F,N,du)
        return rhs
    
    opened by maedoc 2
  • Potential interpolant issue

    Potential interpolant issue

    Hi @Nicholaswogan,

    Great work on this repository, it looks like you managed a consistent and quite high speed-up compared to SciPy. I have been working with LSODA recently using solve_ivp, and I have encountered an issue relative to the interpolant (PR at SciPy). I am wondering if this implementation of LSODA also has a similar issue, as it looks to me that the root of the problem lies within the FORTRAN implementation. In particular, LSODA::intdy seems very similar to me to DINTDY in ODEPACK, which could well mean that the issue is present. Let me know if you can have a look.

    opened by Patol75 2
  • Addition of a Runge-Kutta version

    Addition of a Runge-Kutta version

    Good afternoon @Nicholaswogan.

    As discussed, here is a PR for the addition of the RK45 explicit solver. I still have a few things to add like the comparison to Scipy and the performance tests, but you can already have a look.

    Note that I only implemented Fehlberg RK45 here, but I wrote it in a sufficient general manner to implement any explicit RK described by its Butcher tableau.

    To-Do list

    • [x] It seems there is a bug with machine-epsilon requested absolute tolerance (atol = 1e-30): the timesteps get infinitesimally smaller and the solver never finishes.
    opened by tbridel 1
  • Adding other Runge-Kutta time integrators

    Adding other Runge-Kutta time integrators

    Hye again !

    Could you please share the steps one would have to take to add a new Runge-Kutta integrator to the project ?

    I'm happy to implement it push a PR at some point, but I wanted to know if you had maybe a process that would make it easier to add a new Butcher-tableau based RK integrator ?

    Thanks ! T. Bridel-Bertomeu

    opened by tbridel 2
  • Adaptive timestepping and hijacking the

    Adaptive timestepping and hijacking the "step" method

    Hi again @Nicholaswogan !

    I was wondering if you ever thought about the adaptive timestepping capability that exists in Scipy LSODA ? Is anything like this possible with your implementation ?

    Thanks ! Best, T. Bridel-Bertomeu

    opened by tbridel 12
  • Installation via pip produces error when importing numbalsoda

    Installation via pip produces error when importing numbalsoda

    Hi Nick, thank you very much for building this Python package, it definitely appears very promising!

    I installed the latest version of numbalsoda (0.3.4) via pip and the installation was successful. I am using Python 3.9.7 and conda version 22.9.0.

    However, when trying to import it, I get the following error:

    'The procedure entry point ZSt28​__throw​_bad_array_new_lengthv could not be located in the dynamic link library C:\Users...\numbalsoda\liblsoda.dll'

    Installing numbalsoda via conda (conda install -c conda-forge numbalsoda) solves the issue and the package is successfully imported into the Python script. Conda installs version 0.3.2, and thus I was wondering if the difference in the versions may play a role in this issue.

    Thank you very much!

    opened by kf120 0
  • Backward in time using numbalsoda

    Backward in time using numbalsoda

    Hi

    I am using numblsoda to solve couples of first order differential equations. I need to solve those equations backward in time for some particular situations. I could not figure out how I can do that. I tried to use decreasing time step but it does not work as it works for solve_ivp and odeint.

    for instance in this code how can I go backward in time?

    from numbalsoda import lsoda_sig, lsoda, dop853
    from numba import njit, cfunc
    import numpy as np
    
    @cfunc(lsoda_sig)
    def rhs(t, u, du, p):
        du[0] = u[0]-u[0]*u[1]
        du[1] = u[0]*u[1]-u[1]*p[0]
    
    funcptr = rhs.address # address to ODE function
    u0 = np.array([5.,0.8]) # Initial conditions
    data = np.array([1.0]) # data you want to pass to rhs (data == p in the rhs).
    t_eval = np.arrange(0.0,50.0,0.1) # times to evaluate solution
    
    
    usol, success = lsoda(funcptr, u0, t_eval, data = data)
    

    I tried t_eval = np.arrange(50, 0, -0.1) but it does not work. It would be great if someone could help me to fix this issue.

    opened by meysam-motaharfar 1
  • Make NumPy types for u0 and usol configurable

    Make NumPy types for u0 and usol configurable

    First of all, awesome code! I got a speedup of an order of magnitude basically for free!

    But, as far as I can tell, u0 and usol are required to be of type np.float64. However, often times np.float32 is more than enough.

    Maybe I just don't see the option to change it, but if it doesn't exist, it'd be nice if we could change it.

    Thank you!

    opened by juhannc 1
Releases(v0.3.5)
Owner
Nick Wogan
Planetary Scientist, PhD Student, developing models of atmospheric evolution.
Nick Wogan
Official Implementation of CoSMo: Content-Style Modulation for Image Retrieval with Text Feedback

CoSMo.pytorch Official Implementation of CoSMo: Content-Style Modulation for Image Retrieval with Text Feedback, Seungmin Lee*, Dongwan Kim*, Bohyung

Seung Min Lee 54 Dec 08, 2022
Unofficial PyTorch Implementation for HifiFace (https://arxiv.org/abs/2106.09965)

HifiFace β€” Unofficial Pytorch Implementation Image source: HifiFace: 3D Shape and Semantic Prior Guided High Fidelity Face Swapping (figure 1, pg. 1)

MINDs Lab 218 Jan 04, 2023
Codes for TS-CAM: Token Semantic Coupled Attention Map for Weakly Supervised Object Localization.

TS-CAM: Token Semantic Coupled Attention Map for Weakly SupervisedObject Localization This is the official implementaion of paper TS-CAM: Token Semant

vasgaowei 112 Jan 02, 2023
Public repository of the 3DV 2021 paper "Generative Zero-Shot Learning for Semantic Segmentation of 3D Point Clouds"

Generative Zero-Shot Learning for Semantic Segmentation of 3D Point Clouds BjΓΆrn Michele1), Alexandre Boulch1), Gilles Puy1), Maxime Bucher1) and Rena

valeo.ai 15 Dec 22, 2022
PyTorch Implementation of Region Similarity Representation Learning (ReSim)

ReSim This repository provides the PyTorch implementation of Region Similarity Representation Learning (ReSim) described in this paper: @Article{xiao2

Tete Xiao 74 Jan 03, 2023
BADet: Boundary-Aware 3D Object Detection from Point Clouds (Pattern Recognition 2022)

BADet: Boundary-Aware 3D Object Detection from Point Clouds (Pattern Recognition

Rui Qian 17 Dec 12, 2022
This is a TensorFlow implementation for C2-Rec

This is a TensorFlow implementation for C2-Rec We refer to the repo SASRec. Requirements requirement.txt Datasets This repo includes Amazon Beauty dat

7 Nov 14, 2022
Demystifying How Self-Supervised Features Improve Training from Noisy Labels

Demystifying How Self-Supervised Features Improve Training from Noisy Labels This code is a PyTorch implementation of the paper "[Demystifying How Sel

<a href=[email protected]"> 4 Oct 14, 2022
Code of the paper "Part Detector Discovery in Deep Convolutional Neural Networks" by Marcel Simon, Erik Rodner and Joachim Denzler

Part Detector Discovery This is the code used in our paper "Part Detector Discovery in Deep Convolutional Neural Networks" by Marcel Simon, Erik Rodne

Computer Vision Group Jena 17 Feb 22, 2022
The repository offers the official implementation of our paper in PyTorch.

Cloth Interactive Transformer (CIT) Cloth Interactive Transformer for Virtual Try-On Bin Ren1, Hao Tang1, Fanyang Meng2, Runwei Ding3, Ling Shao4, Phi

Bingoren 49 Dec 01, 2022
Parallel Latent Tree-Induction for Faster Sequence Encoding

FastTrees This repository contains the experimental code supporting the FastTrees paper by Bill Pung. Software Requirements Python 3.6, NLTK and PyTor

Bill Pung 4 Mar 29, 2022
PyTorch implementation of CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition

PyTorch implementation of CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition The unofficial code of CDistNet. Now, we ha

25 Jul 20, 2022
Coded illumination for improved lensless imaging

CodedCam Coded Illumination for Improved Lensless Imaging Paper | Supplementary results | Data and Code are available. Coded illumination for improved

Computational Sensing and Information Processing Lab 1 Nov 29, 2021
A curated list of awesome Deep Learning tutorials, projects and communities.

Awesome Deep Learning Table of Contents Books Courses Videos and Lectures Papers Tutorials Researchers Websites Datasets Conferences Frameworks Tools

Christos 20k Jan 05, 2023
Official PyTorch Implementation of HELP: Hardware-adaptive Efficient Latency Prediction for NAS via Meta-Learning (NeurIPS 2021 Spotlight)

[NeurIPS 2021 Spotlight] HELP: Hardware-adaptive Efficient Latency Prediction for NAS via Meta-Learning [Paper] This is Official PyTorch implementatio

42 Nov 01, 2022
ECLARE: Extreme Classification with Label Graph Correlations

ECLARE ECLARE: Extreme Classification with Label Graph Correlations @InProceedings{Mittal21b, author = "Mittal, A. and Sachdeva, N. and Agrawal

Extreme Classification 35 Nov 06, 2022
Script that receives an Image (original) and a set of images to be used as "pixels" in reconstruction of the Original image using the set of images as "pixels"

picinpics Script that receives an Image (original) and a set of images to be used as "pixels" in reconstruction of the Original image using the set of

RodrigoCMoraes 1 Oct 24, 2021
A pre-trained model with multi-exit transformer architecture.

ElasticBERT This repository contains finetuning code and checkpoints for ElasticBERT. Towards Efficient NLP: A Standard Evaluation and A Strong Baseli

fastNLP 48 Dec 14, 2022
Authors implementation of LieTransformer: Equivariant Self-Attention for Lie Groups

LieTransformer This repository contains the implementation of the LieTransformer used for experiments in the paper LieTransformer: Equivariant self-at

35 Oct 18, 2022
Patch-Based Deep Autoencoder for Point Cloud Geometry Compression

Patch-Based Deep Autoencoder for Point Cloud Geometry Compression Overview The ever-increasing 3D application makes the point cloud compression unprec

17 Dec 05, 2022