Fitting thermodynamic models with pycalphad

Last update: Sep 12, 2022

Overview

ESPEI

ESPEI, or Extensible Self-optimizing Phase Equilibria Infrastructure, is a tool for thermodynamic database development within the CALPHAD method. It uses pycalphad for calculating Gibbs free energies of thermodynamic models.

Read the documentation at espei.org.

Installation

Anaconda (recommended)

ESPEI does not require any special compiler, but several dependencies do. Therefore it is suggested to install ESPEI from conda-forge.

conda install -c conda-forge espei

What is ESPEI?

ESPEI parameterizes CALPHAD models with enthalpy, entropy, and heat capacity data using the corrected Akaike Information Criterion (AICc). This parameter generation step augments the CALPHAD modeler by providing tools for data-driven model selection, rather than relying on a modeler's intuition alone.
ESPEI optimizes CALPHAD model parameters to thermochemical and phase boundary data and quantifies the uncertainty of the model parameters using Markov Chain Monte Carlo (MCMC). This is similar to the PARROT module of Thermo-Calc, but goes beyond by adjusting all parameters simultaneously and evaluating parameter uncertainty.
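
The AICc-based model selection described above can be illustrated with a minimal sketch. It uses the standard least-squares form of the criterion; the function name `aicc`, the candidate models, and their residuals are hypothetical and are not taken from ESPEI's source.

```python
import math


def aicc(rss, n_points, n_params):
    """Corrected Akaike Information Criterion for a least-squares fit.

    AIC = 2k + n*ln(RSS/n), plus the small-sample correction
    2k(k+1)/(n-k-1). Lower scores are better.
    """
    if n_points - n_params - 1 <= 0:
        raise ValueError("AICc needs n_points > n_params + 1")
    aic = 2 * n_params + n_points * math.log(rss / n_points)
    return aic + (2 * n_params * (n_params + 1)) / (n_points - n_params - 1)


# Hypothetical candidates: (number of parameters, residual sum of squares)
candidates = {
    "a": (1, 12.0),
    "a + b*T": (2, 4.0),
    "a + b*T + c*T*ln(T)": (3, 3.9),
}
n = 10  # number of data points (made up for the example)
scores = {name: aicc(rss, n, k) for name, (k, rss) in candidates.items()}
best = min(scores, key=scores.get)
print(best)
```

In this made-up case the three-parameter model fits slightly better than the two-parameter one, but the penalty on the extra parameter outweighs the gain, so the two-parameter model is selected.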

Details on the implementation of ESPEI can be found in the publication: B. Bocklund et al., MRS Communications 9(2) (2019) 1–10. doi:10.1557/mrc.2019.59.

What can ESPEI do?

ESPEI can be used to generate model parameters for CALPHAD models of the Gibbs energy that follow the temperature-dependent polynomial by Dinsdale (CALPHAD 15(4) 1991 317-425) within the compound energy formalism (CEF) for endmembers and Redlich-Kister-Muggianu excess mixing parameters in unary, binary and ternary systems.
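
As a rough illustration of the excess mixing term, the following sketch evaluates a binary Redlich-Kister polynomial for a single-sublattice solution. The function name and the parameter values are hypothetical; the ternary Muggianu extrapolation and the endmember terms are left out.

```python
def redlich_kister_excess(x_b, interaction_params):
    """Binary excess Gibbs energy: x_A * x_B * sum_k L_k * (x_A - x_B)**k."""
    x_a = 1.0 - x_b
    series = sum(L_k * (x_a - x_b) ** k for k, L_k in enumerate(interaction_params))
    return x_a * x_b * series


T = 1000.0  # K
# Hypothetical parameters with a linear temperature dependence, L = a + b*T
L0 = -10000.0 + 2.0 * T
L1 = 1000.0
g_excess = redlich_kister_excess(0.25, [L0, L1])  # J/mol
print(g_excess)
```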

All thermodynamic quantities are computed by pycalphad. The MCMC-based Bayesian parameter estimation can optimize data for any model that is supported by pycalphad, including models beyond endmember Gibbs energies and Redlich-Kister-Muggianu excess terms, such as parameters in the ionic liquid, magnetic, or two-state models. Performing Bayesian parameter estimation for arbitrary multicomponent thermodynamic data is supported.

Goals

Offer a free and open-source tool for users to develop multicomponent databases with quantified uncertainty
Enable development of CALPHAD-type models for Gibbs energy, thermodynamic or kinetic properties
Provide a platform to build and apply novel model selection, optimization, and uncertainty quantification methods

The implementation for ESPEI involves first performing parameter generation by calculating parameters in thermodynamic models that are linearly described by non-equilibrium thermochemical data. Then Markov Chain Monte Carlo (MCMC) is used to optimize the candidate models from the parameter generation to phase boundary data.
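
The second stage can be pictured with a minimal random-walk Metropolis sketch. This is a generic, single-parameter illustration with made-up data and a Gaussian likelihood; it is not the ensemble sampler or the error functions that ESPEI actually uses, and all names are hypothetical.

```python
import math
import random


def log_probability(param, data, sigma):
    """Gaussian log-likelihood (flat prior) of a constant model against data."""
    return -0.5 * sum(((y - param) / sigma) ** 2 for y in data)


def metropolis(log_prob, start, n_steps, step_size, rng):
    """Random-walk Metropolis sampler for a single parameter."""
    chain = [start]
    current_lp = log_prob(start)
    for _ in range(n_steps):
        proposal = chain[-1] + rng.gauss(0.0, step_size)
        proposal_lp = log_prob(proposal)
        # Accept with probability min(1, exp(proposal_lp - current_lp))
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            chain.append(proposal)
            current_lp = proposal_lp
        else:
            chain.append(chain[-1])
    return chain


data = [1.9, 2.1, 2.0, 2.05, 1.95] * 4  # made-up "measurements"
rng = random.Random(0)
chain = metropolis(lambda p: log_probability(p, data, 0.1), 0.0, 5000, 0.05, rng)
burn_in = 1000  # discard the steps spent walking in from the poor start
posterior_mean = sum(chain[burn_in:]) / len(chain[burn_in:])
print(round(posterior_mean, 2))
```

The spread of the retained samples around their mean is what provides the parameter uncertainty.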

Cu-Mg phase diagram from a database created with and optimized by ESPEI. See the Cu-Mg Example.

History

The ESPEI package is based on a fork of pycalphad-fitting. The name and idea of ESPEI are originally based on Shang, Wang, and Liu, "ESPEI: Extensible, Self-optimizing Phase Equilibrium Infrastructure for Magnesium Alloys", Magnes. Technol. 2010 617-622 (2010).

Implementation details for ESPEI have been described in the following publications:

1. Bocklund et al., MRS Communications 9(2) (2019) 1–10. doi:10.1557/mrc.2019.59
2. Otis et al., JOM 69 (2017) doi:10.1007/s11837-017-2318-6
3. Richard Otis's thesis

Getting Help

For help on installing and using ESPEI, please join the PhasesResearchLab/ESPEI Gitter room.

Bugs and software issues should be reported on GitHub.

License

ESPEI is MIT licensed.

The MIT License (MIT)

Copyright (c) 2015-2018 Richard Otis
Copyright (c) 2017-2018 Brandon Bocklund
Copyright (c) 2018-2019 Materials Genome Foundation

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

Citing ESPEI

If you use ESPEI for work presented in a publication, we ask that you cite the following publication:

B. Bocklund, R. Otis, A. Egorov, A. Obaied, I. Roslyakova, Z.-K. Liu, ESPEI for efficient thermodynamic database development, modification, and uncertainty quantification: application to Cu–Mg, MRS Commun. (2019) 1–10. doi:10.1557/mrc.2019.59.

@article{Bocklund2019ESPEI,
         archivePrefix = {arXiv},
         arxivId = {1902.01269},
         author = {Bocklund, Brandon and Otis, Richard and Egorov, Aleksei and Obaied, Abdulmonem and Roslyakova, Irina and Liu, Zi-Kui},
         doi = {10.1557/mrc.2019.59},
         eprint = {1902.01269},
         issn = {2159-6859},
         journal = {MRS Communications},
         month = {jun},
         pages = {1--10},
         title = {{ESPEI for efficient thermodynamic database development, modification, and uncertainty quantification: application to Cu–Mg}},
         year = {2019}
}

Comments

Compute metastable/unstable single phase driving forces in ZPF error
Thanks to Tobias Spitaler for suggesting this and to @richardotis for brainstorming this solution concept.

This PR introduces two new functions in ZPF error, _solve_sitefracs_composition and _sample_solution_constitution. Their purpose is to facilitate computing metastable or unstable single phase driving forces when a phase has a miscibility gap. This should improve the convergence for any phase that has a stable or metastable miscibility gap.

Rationale

ESPEI currently computes the "single-phase hyperplane" at a vertex by performing an equilibrium calculation at a black point and then subtracting that from the target hyperplane energy at that composition. As illustrated in the figure Tobias constructed (below), this is problematic for phases with a miscibility gap because a "single-phase" equilibrium calculation in pycalphad will always compute the global minimum energy and give two composition sets.

What ESPEI should do is what Tobias illustrates by the orange x and the green driving force line. This solution ensures that minimizing the driving force will force the Gibbs energy curve to match the energy of the black points on the multi-phase target hyperplane.
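
The difference between the two energies can be shown numerically. The sketch below assumes a symmetric regular solution model with made-up values for the temperature and the interaction parameter; it does not call pycalphad. For this symmetric model the common tangent is horizontal, so the global minimum energy at any overall composition inside the gap equals the energy of the two minima.

```python
import math

R = 8.314462618  # gas constant, J/(mol K)


def gibbs_regular_solution(x, T, L):
    """Molar Gibbs energy of mixing of a binary regular solution."""
    ideal = R * T * (x * math.log(x) + (1 - x) * math.log(1 - x))
    return ideal + L * x * (1 - x)


T, L = 500.0, 12000.0  # hypothetical values; L > 2RT gives a miscibility gap
grid = [i / 1000 for i in range(1, 1000)]

# What a global minimization reports at x = 0.5: the energy on the common
# tangent, which touches the two symmetric minima of the curve.
global_min_energy = min(gibbs_regular_solution(x, T, L) for x in grid)

# What a constrained single-phase calculation reports at x = 0.5:
# the energy of the curve itself at that composition.
single_phase_energy = gibbs_regular_solution(0.5, T, L)

difference = single_phase_energy - global_min_energy  # J/mol
print(difference > 0)
```

A driving force measured against the global minimum would miss this difference, while one measured against the single-phase energy would not.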

Historically, we didn't implement this because one would like to use equilibrium to minimize the internal degrees of freedom, but pycalphad always computes the global minimum energy, so it was not possible to do via equilibrium. More recently, ESPEI introduced the idea of approximate_equilibrium, which uses starting_point to more quickly determine a minimum energy solution from a discrete point sampling grid. The approximate_equilibrium method we use still has the same problem as pycalphad's equilibrium because starting_point will still give the global minimum solution for the discrete sampling.

Solution

In an ideal world, pycalphad should be able to turn off global minimization (automatically introducing new composition sets) and enable a condition to be set for the composition of a phase, i.e. X(BCC,B). In practice, being able to turn off global minimization and provide a valid starting point for only one composition set that has a global composition condition would simulate a phase composition condition. Unfortunately, neither turning off global minimization nor phase composition conditions is currently implemented, so we need a workaround.

The two functions introduced here consider each single-phase composition at a tie-vertex and construct a point grid that contains only points satisfying the prescribed overall composition (and the internal phase constraints). In either approximate or exact equilibrium mode, this grid is used to find the lowest-energy starting point, which is then passed to equilibrium together with the constrained point grid so that the global minimization step has no new composition sets to introduce (i.e. it cannot detect a miscibility gap).
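
The idea of a composition-constrained point grid can be sketched for a hypothetical two-sublattice phase (A,B)1(A,B)1. The helper below is an illustration only and is not the implementation of _solve_sitefracs_composition or _sample_solution_constitution; the phase, grid density, and tolerance are assumptions.

```python
from itertools import product


def constrained_point_grid(target_x_b, n_points=21, tol=1e-9):
    """Site-fraction points (y1_B, y2_B) of a hypothetical (A,B)1(A,B)1 phase
    whose overall mole fraction of B equals target_x_b."""
    fractions = [i / (n_points - 1) for i in range(n_points)]
    grid = []
    for y1_b, y2_b in product(fractions, repeat=2):
        x_b = 0.5 * (y1_b + y2_b)  # both sublattices have equal site ratios
        if abs(x_b - target_x_b) <= tol:
            grid.append((y1_b, y2_b))
    return grid


points = constrained_point_grid(0.3)
print(len(points))
```

Because every retained point has the same overall composition, a lowest-energy search over this grid only varies the internal degrees of freedom and cannot mix points of different compositions.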

For performance, we pre-compute the grid of points for every phase composition in the ZPF datasets and re-use them to compute the grid, starting point and equilibrium at every parameter iteration (note that this would be invalid if a parameter changes the number of moles, like varying coordination number in the MQMQA).

To summarize the impact:

This method will be entirely backwards compatible for phases without a miscibility gap.

For cases where a miscibility gap is present in the parameters, but a single phase is prescribed, there will be a driving force to eliminate the miscibility gap, so the single phase compositions are more meaningful too. This is significant because you can prescribe single phase regions in ZPF datasets and it will enforce that no miscibility gap occurs, which is not true today.

For phase compositions inside a miscibility gap, the Gibbs energy curve will match the multi-phase global minimum hyperplane at the phase compositions (at convergence).
opened by bocklund 20
ERROR occurred using the new development version

Dear Administrator, some tests failed when I ran pytest after installing the new development version (2021/4/21, Beijing time). In addition, errors occurred when I ran some example cases that had run successfully with earlier versions. errorlog.txt pytestfail.txt condalist.txt

opened by duxiaoxian 12

Error releasing un-acquired lock in dask

I was using distributed (1.18.0) when this error occurred. Changed to distributed (1.16.3).

  File "/Applications/anaconda/envs/my_pycalphad/bin/espei", line 11, in <module>
    sys.exit(main())
  File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/espei/run_espei.py", line 135, in main
    mcmc_steps=args.mcmc_steps, save_interval=args.save_interval)
  File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/espei/paramselect.py", line 754, in fit
    for i, result in enumerate(sampler.sample(walkers, iterations=mcmc_steps)):
  File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/emcee/ensemble.py", line 259, in sample
    lnprob[S0])
  File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/emcee/ensemble.py", line 332, in _propose_stretch
    newlnprob, blob = self._get_lnprob(q)
  File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/emcee/ensemble.py", line 382, in _get_lnprob
    results = list(M(self.lnprobfn, [p[i] for i in range(len(p))]))
  File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/espei/utils.py", line 39, in map
    result = [x.result() for x in result]
  File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/espei/utils.py", line 39, in <listcomp>
    result = [x.result() for x in result]
  File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/distributed/client.py", line 155, in result
    six.reraise(*result)
  File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/six.py", line 685, in reraise
    raise value.with_traceback(tb)
  File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/distributed/protocol/pickle.py", line 59, in loads
    return pickle.loads(x)
RuntimeError: cannot release un-acquired lock

bug

opened by ghost 10

dask workers can sometimes die without warning
I haven't been able to reproduce it consistently, but dask workers sometimes die with the dask scheduler.

To debug this, I turned on debugging output by scheduler = LocalCluster(n_workers=cores, threads_per_worker=1, processes=True, silence_logs=verbosity[output_settings['verbosity']]).

I am still waiting for that job to have workers die to see the output, but for now as iterations in emcee complete the results are processed in Python (it is known that this is happening because of the progress bar output). During this time, the LocalCluster debugging gives output

distributed.core - WARNING - Event loop was unresponsive for 1.69s. This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.

Usually I get two similar messages in a row.

As another possibility, the most recent time I was able to reproduce this was when I had two instances of ESPEI running at the same time. I wouldn't think that the different client instances would interact, but maybe it should be investigated.
opened by bocklund 6
Issues reproducing Cu-Mg example

I had several issues running the Cu-Mg example from the ESPEI website. I installed ESPEI using the conda command, and took the Cu-Mg data directory from the ESPEI-datasets repository.

I first tried reproducing the diagram from the section titled "First-principles phase diagram". The code ran successfully, but the returned phase diagram didn't match the example well:

I then tried reproducing the results in the MCMC optimization section. I wasn't able to successfully perform the MCMC optimization. The code returned numerous errors over the course of several minutes and eventually hung with no further output.

This file contains the full python output when I ran the optimization: espei_mcmc_error.txt

Here is my python version and installed packages/versions: python_info.txt

opened by npaulson 6
The latest version of ESPEI (0.7.2) gives an error when plotting

I recently started using the latest version of ESPEI (0.7.2) and I always get an error, whereas ESPEI 0.6 worked fine.

I can no longer use ESPEI 0.6 on my current computer because I always get MPI errors with it, so I don't know which version to use or what went wrong.

AG_CU_1214.zip

opened by duxiaoxian 5

Run ESPEI via input files, rather than command line arguments

A first draft and feedback were written in this gist.

The current iteration is:

Header area.
Include any metadata above the `---`.
---
# core run settings
run_type: full # choose full | dft | mcmc
phase_models: input.json
datasets: input-datasets # path to datasets. Defaults to current directory.
scheduler: dask # can be dask | MPIPool

# control output
verbosity: 0 # integer verbosity level 0 | 1 | 2, where 2 is most verbose.
output_tdb: out.tdb
tracefile: chain.npy # name of the file containing the mcmc chain array
probfile: lnprob.npy # name of the file containing the mcmc ln probability array

# the following only take effect for full or mcmc runs
mcmc:
  mcmc_steps: 2000
  mcmc_save_interval: 100

  # the following take effect for only mcmc runs
  input_tdb: null # TDB file used to start the mcmc run
  restart_chain: null # restart the mcmc fitting from a previous calculation

This issue will focus on the development of a first generation input file structure and spec, and also as a place to brainstorm options that should be user-facing.
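
To make the draft spec concrete, here is a minimal sketch of how the allowed values listed in the comments above could be checked once the file has been parsed into a dictionary. The function `validate_settings` and the table `ALLOWED` are hypothetical and only encode the options written in this draft; they do not reflect how ESPEI validates its input.

```python
# Allowed values copied from the comments in the draft input file
ALLOWED = {
    "run_type": {"full", "dft", "mcmc"},
    "scheduler": {"dask", "MPIPool"},
    "verbosity": {0, 1, 2},
}


def validate_settings(settings):
    """Return a list of human-readable problems found in a settings mapping."""
    problems = []
    for key, allowed in ALLOWED.items():
        if key in settings and settings[key] not in allowed:
            options = sorted(allowed, key=str)
            problems.append(f"{key}={settings[key]!r} is not one of {options}")
    mcmc = settings.get("mcmc") or {}
    if settings.get("run_type") != "mcmc":
        # The draft says these two keys take effect only for mcmc runs
        for key in ("input_tdb", "restart_chain"):
            if mcmc.get(key) is not None:
                problems.append(f"mcmc.{key} only takes effect for mcmc runs")
    return problems


settings = {
    "run_type": "full",
    "phase_models": "input.json",
    "datasets": "input-datasets",
    "scheduler": "dask",
    "verbosity": 0,
    "mcmc": {"mcmc_steps": 2000, "mcmc_save_interval": 100,
             "input_tdb": None, "restart_chain": None},
}
problems = validate_settings(settings)
bad_problems = validate_settings({"run_type": "fast", "mcmc": {"input_tdb": "in.tdb"}})
print(problems, len(bad_problems))
```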

opened by bocklund 5

Limit the degrees of freedom for non-active phases in MCMC to prevent them from diverging?

Phases that do not have phase equilibria data should have their parameters fixed before the MCMC run.

A particular phase in an ESPEI run can have single phase DFT data and no phase equilibria. This means that the parameters that were calculated in the single phase fitting have no effect on the error function that is used in the MCMC run.

When parameters have no effect on the error function, they diverge when used in emcee because the ensemble sampler scales them up to infinity in an attempt to force that parameter to affect the error function.
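
One possible way to find such parameters before sampling is to perturb each one and see whether the error function changes. The sketch below is a hypothetical helper with a toy error function, not part of ESPEI; a single-point test like this can also flag a parameter whose effect happens to vanish at the tested point.

```python
def insensitive_parameters(error_fn, params, rel_step=0.01, tol=0.0):
    """Indices of parameters whose perturbation leaves error_fn unchanged."""
    base = error_fn(params)
    unaffected = []
    for i, value in enumerate(params):
        perturbed = list(params)
        # Relative perturbation, falling back to an absolute one at zero
        perturbed[i] = value * (1 + rel_step) if value != 0 else rel_step
        if abs(error_fn(perturbed) - base) <= tol:
            unaffected.append(i)
    return unaffected


def toy_error(p):
    """Toy error function that depends only on the first two parameters."""
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


params = [0.5, -1.0, 3000.0]
fixed = insensitive_parameters(toy_error, params)  # candidates to hold fixed
print(fixed)
```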
bug enhancement

opened by bocklund 5

Error when running Cu-Mg example

Hello, I am trying to run ESPEI for the first time.

I created a conda env and installed ESPEI using conda. I downloaded json and yaml files as well as the contents of the Cu-Mg folder in ESPEI-datasets, renamed it to input-data. After running espei --input espei-in.yaml, I get the errors below. Could you please let me know if I am doing anything wrong?

Thanks!

Traceback (most recent call last):
  File "/Users/latmarat/miniforge3/envs/espenv/bin/espei", line 10, in <module>
    sys.exit(main())
  File "/Users/latmarat/miniforge3/envs/espenv/lib/python3.10/site-packages/espei/espei_script.py", line 307, in main
    run_espei(input_settings)
  File "/Users/latmarat/miniforge3/envs/espenv/lib/python3.10/site-packages/espei/espei_script.py", line 177, in run_espei
    dbf = generate_parameters(phase_models, datasets, refdata, excess_model,
  File "/Users/latmarat/miniforge3/envs/espenv/lib/python3.10/site-packages/espei/paramselect.py", line 517, in generate_parameters
    aliases = extract_aliases(phase_models)
  File "/Users/latmarat/miniforge3/envs/espenv/lib/python3.10/site-packages/espei/utils.py", line 370, in extract_aliases
    aliases = {phase_name: phase_name for phase_name in phase_models["phases"].keys()}
AttributeError: 'list' object has no attribute 'keys'

opened by latmarat 4

AttributeError: 'NoneType' object has no attribute 'values'

Dear Administrator, an 'AttributeError' occurred when I ran 'espei --input espei-in-2.yaml' using the latest development version of ESPEI. Would you mind helping me check my dataset? Thanks. errorprint-log.txt verbosity-log.txt CO-CU-20201104.zip

f:\users\zhang\pycalphad\pycalphad\codegen\callables.py:97: UserWarning: State variables in build_callables are not {N, P, T}, but {T, P}. This can lead to incorrectly calculated values if the state variables used to call the generated functions do not match the state variables used to create them. State variables can be added with the additional_statevars argument.
  "additional_statevars argument.".format(state_variables))
Traceback (most recent call last):
  File "F:\Users\zhang\Anaconda32020\envs\espei2020test\Scripts\espei-script.py", line 33, in <module>
    sys.exit(load_entry_point('espei', 'console_scripts', 'espei')())
  File "f:\users\zhang\espei\espei\espei_script.py", line 311, in main
    run_espei(input_settings)
  File "f:\users\zhang\espei\espei\espei_script.py", line 260, in run_espei
    approximate_equilibrium=approximate_equilibrium,
  File "f:\users\zhang\espei\espei\optimizers\opt_base.py", line 36, in fit
    node = self.fit(symbols, datasets, *args, **kwargs)
  File "f:\users\zhang\espei\espei\optimizers\opt_mcmc.py", line 238, in fit
    self.predict(initial_guess, **ctx)
  File "f:\users\zhang\espei\espei\optimizers\opt_mcmc.py", line 289, in predict
    multi_phase_error = calculate_zpf_error(parameters=np.array(params), **zpf_kwargs)
  File "f:\users\zhang\espei\espei\error_functions\zpf_error.py", line 315, in calculate_zpf_error
    target_hyperplane = estimate_hyperplane(phase_region, parameters, approximate_equilibrium=approximate_equilibrium)
  File "f:\users\zhang\espei\espei\error_functions\zpf_error.py", line 186, in estimate_hyperplane
    grid = calculate(dbf, species, phases, str_statevar_dict, models, phase_records, pdens=500, fake_points=True)
  File "f:\users\zhang\espei\espei\shadow_functions.py", line 55, in calculate
    largest_energy=float(1e10), fake_points=fp)
  File "f:\users\zhang\pycalphad\pycalphad\core\calculate.py", line 190, in _compute_phase_values
    param_symbols, parameter_array = extract_parameters(parameters)
  File "f:\users\zhang\pycalphad\pycalphad\core\utils.py", line 361, in extract_parameters
    parameter_array_lengths = set(np.atleast_1d(val).size for val in parameters.values())
AttributeError: 'NoneType' object has no attribute 'values'

opened by duxiaoxian 4
Migrate pycalphad refdata to ESPEI
Tracking from https://github.com/pycalphad/pycalphad/issues/120

Assume that SGTE91Stable is correct per https://github.com/pycalphad/pycalphad/issues/120. Then we must

[x] Remove the metastable phases not present in the SGTE91 original paper

[ ] Check that remaining phases have correct descriptions
opened by bocklund 4
MCMC Initialized chains should include initial point

During the initialization of the chains for the MCMC optimizer, a Gaussian distribution about an initial point is taken. https://github.com/PhasesResearchLab/ESPEI/blob/7c797191d4c3178fe4a22275bbaee9c2977786ad/espei/optimizers/opt_mcmc.py#L98

I would suggest including the initial point in that set of initial chains. If everything is set up correctly, this won't matter, but for cases where the standard deviation is too high while the initial guess is quite good, the current behavior will lead to a lot of bad starting points. Modifying the initial set to include the initial guess point should ensure that at least this state (or acceptable permutations of it) will survive the MCMC run. What do you think?
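
A minimal sketch of the suggested behavior is shown below. The function name, the choice of a standard deviation proportional to each parameter, and the example values are assumptions made for illustration; this is not the code at the linked line of opt_mcmc.py.

```python
import random


def initialize_walkers(initial_point, n_walkers, std_fraction, rng):
    """Gaussian ball of walkers around initial_point.

    Walker 0 is the unperturbed initial point itself, so the initial guess
    is always part of the starting ensemble.
    """
    walkers = [list(initial_point)]
    for _ in range(n_walkers - 1):
        # Assumed here: standard deviation proportional to each parameter
        walkers.append([rng.gauss(p, abs(p) * std_fraction) for p in initial_point])
    return walkers


rng = random.Random(42)
walkers = initialize_walkers([-8000.0, 2.5], n_walkers=4, std_fraction=0.1, rng=rng)
print(walkers[0])
```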

opened by toastedcrumpets 0

formatted_parameter broken by SymEngine

Switching the symbolic backend to SymEngine broke espei.utils.formatted_parameter. Here's a test to validate (run from the tests directory for the testing_data module to be importable).

# espei/tests/test_utils.py

from pycalphad import Database
from espei.utils import formatted_parameter, database_symbols_to_fit
from .testing_data import CU_MG_TDB


def test_cu_mg_parameters_can_be_formatted_to_strings():
    """Formatting parameters should work for common variables parameters"""
    dbf = Database(CU_MG_TDB)
    for sym in database_symbols_to_fit(dbf):
        assert isinstance(formatted_parameter(dbf, sym), str), f"Formatted parameter for symbol {sym} (value = {dbf.symbols[sym]}) in database not a string"

Running this gives an error:

Traceback (most recent call last):
  File "/Users/bocklund1/src/calphad/espei/tests/dummy.py", line 11, in <module>
    test_cu_mg_parameters_can_be_formatted_to_strings()
  File "/Users/bocklund1/src/calphad/espei/tests/dummy.py", line 9, in test_cu_mg_parameters_can_be_formatted_to_strings
    assert isinstance(formatted_parameter(dbf, sym), str), f"Formatted parameter for symbol {sym} (value = {dbf.symbols[sym]}) in database not a string"
  File "/Users/bocklund1/src/calphad/espei/espei/utils.py", line 295, in formatted_parameter
    term = parameter_term(result['parameter'], symbol)
  File "/Users/bocklund1/src/calphad/espei/espei/utils.py", line 218, in parameter_term
    coeff, root = term_coeff.as_coeff_mul(symbol)
AttributeError: 'symengine.lib.symengine_wrapper.Symbol' object has no attribute 'as_coeff_mul'

I think the breakage might be because espei.utils.parameter_term isn't correctly picking up the first condition, since for the case of symbol being a symengine.lib.symengine_wrapper.Symbol, I think expression == symbol should evaluate to true, but evidently (via the traceback) it is evaluating to false.

opened by bocklund 0

Memory leak when running MCMC in parallel
Due to a known memory leak when instantiating subclasses of SymEngine Symbol objects (see https://github.com/symengine/symengine.py/issues/379; SymEngine is one of our upstream dependencies), running ESPEI with parallelization will cause memory to grow in each worker.

Only running in parallel will trigger significant memory growth, because running in parallel uses the pickle library to serialize and deserialize symbol objects and create new objects that can't be freed. When running without parallelization (mcmc.scheduler: null), new symbols are not created.

Until https://github.com/symengine/symengine.py/issues/379 is fixed, some mitigation strategies to avoid running out of memory are:

Run ESPEI without parallelization by setting scheduler: null

(Under consideration to implement): when parallelization is active, use an option to restart the workers every N iterations.

(Under consideration to implement): remove Model objects from the keyword arguments of ESPEI's likelihood functions. Model objects contribute a lot of symbol instances in the form of v.SiteFraction objects. We should be able to get away with only using PhaseRecord objects, but there are a few places that use Model.constituents to infer the sublattice model and internal degrees of freedom, and these would need to be rewritten.
opened by bocklund 1
Unable to use activity data in binary Fe-C with Graphite as reference state
Hi,

We are currently trying to use activity data for Fe-C. Lobo1976 measured the activity of C in alpha-iron relative to Graphite as the standard state, but we get erroneous results. (Lobo, Joseph A., and Gordon H. Geiger. "Thermodynamics and solubility of carbon in ferrite and ferritic Fe-Mo alloys." Metallurgical Transactions A 7.8 (1976): 1347-1357.)

I have added the input file below. With this input file, we get chemical potential difference: [nan] (verbosity 3 output). Is the input file correct or are we missing something? I have had a look at the value of ref_result within the activity_error.py and this does give only nan results for the specified reference state. Graphite only has C as a component. An equilibrium calculation of Graphite specifying x.V('C') gives an error as Number of dependent components different from one. Can this cause an error here as well? Used versions: espei: 0.8.6 and pycalphad 0.9.2. I have added a zip-file with the TDB file and espei input files which reproduces this behaviour.

Thank you for your help, Tobias

{
  "components": ["FE", "C", "VA"],
  "phases": ["BCC_A2", "GRAPHITE"],
  "weight": 1000,
  "reference_state": {
    "phases": ["GRAPHITE"],
    "conditions": {
      "P": 101325,
      "T": 1056.15,
      "X_C": 1
    }
  },
  "conditions": {
    "P": 101325,
    "T": 1056.15,
    "X_C": [0.00013017]
  },
  "output": "ACR_C",
  "values": [[[0.087]]],
  "reference": "Lobo1976_1056K",
  "meta_data": {
    "DOI": "10.1007/BF02658820",
    "literature reference": "Thermodynamics and Solubility of Carbon in Ferrite and Ferritic Fe-Mo Alloys",
    "table/figure": "table 1",
    "measured data": "C-activity in Alpha-Iron",
    "experimental details": "not available",
    "weight": "default"
  }
}
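
For context, the textbook relation between an activity and the chemical potential difference to a reference state is a = exp((mu - mu_ref) / (R T)). The sketch below only illustrates that relation and how a nan reference value propagates into the result; it is not ESPEI's activity_error code, and the function name is hypothetical.

```python
import math

R = 8.314462618  # gas constant, J/(mol K)


def activity(mu, mu_ref, T):
    """Activity from the chemical potential relative to a reference state."""
    return math.exp((mu - mu_ref) / (R * T))


T = 1056.15  # K, as in the dataset
# Chemical potential difference that corresponds to the measured activity 0.087
mu_c = R * T * math.log(0.087)
a_c = activity(mu_c, 0.0, T)

# If the reference-state calculation returns nan, so does the activity
a_nan = activity(mu_c, float("nan"), T)
print(a_c, a_nan)
```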

minimal_example.zip
opened by tobiasspt 1
ENH: Allow multiple datasets directories to be specified in YAML input
Sometimes it is useful to load datasets from different filesystem locations, for example if one folder contains hand-curated data and another contains automatically generated data.

In code, it would be pretty simple to handle this. Instead of

from espei.datasets import load_datasets, recursive_glob
directory = '/path/to/directory/'
load_datasets(recursive_glob(directory))

we could do

from itertools import chain
from espei.datasets import load_datasets, recursive_glob
directories = ['/path/to/directory_1/', '/path/to/directory_2/']
load_datasets(chain(*map(recursive_glob, directories)))
opened by bocklund 1

Releases (0.8.9)

0.8.9 (Aug 5, 2022)
0.8.8 (May 25, 2022)
0.8.7 (Feb 22, 2022)
0.8.6 (Jan 25, 2022)
0.8.5 (Aug 9, 2021)
0.8.3 (May 8, 2021): https://espei.org/en/latest/CHANGES.html

Owner

Phases Research Lab

Research group led by Dr. Zi-Kui Liu at The Pennsylvania State University.

GitHub Repository http://espei.org
