Fitting thermodynamic models with pycalphad

Overview

ESPEI

ESPEI, or Extensible Self-optimizing Phase Equilibria Infrastructure, is a tool for thermodynamic database development within the CALPHAD method. It uses pycalphad for calculating Gibbs free energies of thermodynamic models.

Read the documentation at espei.org.

Installation

Anaconda (recommended)

ESPEI does not require any special compiler, but several dependencies do. Therefore it is suggested to install ESPEI from conda-forge.

conda install -c conda-forge espei

What is ESPEI?

  1. ESPEI parameterizes CALPHAD models with enthalpy, entropy, and heat capacity data using the corrected Akaike Information Criterion (AICc). This parameter generation step augments the CALPHAD modeler by providing tools for data-driven model selection, rather than relying on a modeler's intuition alone.
  2. ESPEI optimizes CALPHAD model parameters to thermochemical and phase boundary data and quantifies the uncertainty of the model parameters using Markov Chain Monte Carlo (MCMC). This is similar to the PARROT module of Thermo-Calc, but goes beyond by adjusting all parameters simultaneously and evaluating parameter uncertainty.
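
As a rough illustration of the AICc-based model selection described in item 1, the sketch below scores a few hypothetical candidate fits with the standard least-squares form of the AICc. The function name `aicc`, the candidate table, and all numbers are assumptions made for this example; ESPEI's own implementation may differ in detail.

```python
import math

def aicc(rss, n_points, n_params):
    """Corrected Akaike Information Criterion for a least-squares fit.

    Assumes independent Gaussian errors, so AIC = n*ln(RSS/n) + 2k.
    """
    if n_points - n_params - 1 <= 0:
        # The small-sample correction is undefined; treat the model as unusable.
        return math.inf
    aic = n_points * math.log(rss / n_points) + 2 * n_params
    correction = (2 * n_params * (n_params + 1)) / (n_points - n_params - 1)
    return aic + correction

# Hypothetical candidates: number of fitted parameters -> residual sum of squares
n_points = 10
candidates = {1: 50.0, 2: 5.0, 3: 4.9}

scores = {k: aicc(rss, n_points, k) for k, rss in candidates.items()}
best = min(scores, key=scores.get)
print("selected number of parameters:", best)
```

With these made-up numbers the two-parameter candidate has the lowest score: the third parameter barely reduces the residual, so the penalty term outweighs the improvement in fit.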

Details on the implementation of ESPEI can be found in the publication: B. Bocklund et al., MRS Communications 9(2) (2019) 1–10. doi:10.1557/mrc.2019.59.

What can ESPEI do?

ESPEI can be used to generate model parameters for CALPHAD models of the Gibbs energy that follow the temperature-dependent polynomial by Dinsdale (CALPHAD 15(4) 1991 317-425) within the compound energy formalism (CEF) for endmembers and Redlich-Kister-Muggianu excess mixing parameters in unary, binary and ternary systems.

All thermodynamic quantities are computed by pycalphad. The MCMC-based Bayesian parameter estimation can optimize data for any model that is supported by pycalphad, including models beyond the endmember Gibbs energies and Redlich-Kister-Muggianu excess terms, such as parameters in the ionic liquid, magnetic, or two-state models. Performing Bayesian parameter estimation for arbitrary multicomponent thermodynamic data is supported.

Goals

  1. Offer a free and open-source tool for users to develop multicomponent databases with quantified uncertainty
  2. Enable development of CALPHAD-type models for Gibbs energy, thermodynamic or kinetic properties
  3. Provide a platform to build and apply novel model selection, optimization, and uncertainty quantification methods

The implementation for ESPEI involves first performing parameter generation by calculating parameters in thermodynamic models that are linearly described by non-equilibrium thermochemical data. Then Markov Chain Monte Carlo (MCMC) is used to optimize the candidate models from the parameter generation to phase boundary data.
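
To make the MCMC step concrete, here is a minimal sketch of sampling a single model parameter from its posterior. It deliberately swaps in a plain random-walk Metropolis sampler and a one-parameter toy model (y = a*x) instead of the emcee ensemble sampler and the thermodynamic error functions that ESPEI uses; the data, `sigma`, and all function names are assumptions for illustration only.

```python
import math
import random

# Toy "data": y is roughly 2*x. Values are made up for illustration.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
sigma = 0.2  # assumed measurement standard deviation

def log_prob(a):
    """Gaussian log-likelihood of the model y = a*x with a flat prior on a."""
    return -0.5 * sum(((y - a * x) / sigma) ** 2 for x, y in zip(xs, ys))

def metropolis(log_prob_fn, start, n_steps, step_size, seed=0):
    """Random-walk Metropolis sampler for a single scalar parameter."""
    rng = random.Random(seed)
    current, current_lp = start, log_prob_fn(start)
    chain = []
    for _ in range(n_steps):
        proposal = current + rng.gauss(0.0, step_size)
        proposal_lp = log_prob_fn(proposal)
        diff = proposal_lp - current_lp
        # Accept with probability min(1, exp(diff))
        if diff >= 0 or rng.random() < math.exp(diff):
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain

chain = metropolis(log_prob, start=1.0, n_steps=5000, step_size=0.05)
samples = chain[1000:]  # discard burn-in
mean = sum(samples) / len(samples)
std = math.sqrt(sum((s - mean) ** 2 for s in samples) / len(samples))
print(f"a = {mean:.3f} +/- {std:.3f}")
```

The spread of the retained samples is what provides the parameter uncertainty: the chain yields a distribution for each parameter rather than a single best-fit value.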

Cu-Mg phase diagram from a database created with and optimized by ESPEI. See the Cu-Mg Example.

History

The ESPEI package is based on a fork of pycalphad-fitting. The name and idea of ESPEI are originally based on Shang, Wang, and Liu, "ESPEI: Extensible, Self-optimizing Phase Equilibrium Infrastructure for Magnesium Alloys", Magnes. Technol. 2010, 617-622 (2010).

Implementation details for ESPEI have been described in the following publications:

Getting Help

For help on installing and using ESPEI, please join the PhasesResearchLab/ESPEI Gitter room.

Bugs and software issues should be reported on GitHub.

License

ESPEI is MIT licensed.

The MIT License (MIT)

Copyright (c) 2015-2018 Richard Otis
Copyright (c) 2017-2018 Brandon Bocklund
Copyright (c) 2018-2019 Materials Genome Foundation

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

Citing ESPEI

If you use ESPEI for work presented in a publication, we ask that you cite the following publication:

  B. Bocklund, R. Otis, A. Egorov, A. Obaied, I. Roslyakova, Z.-K. Liu, ESPEI for efficient thermodynamic database development, modification, and uncertainty quantification: application to Cu–Mg, MRS Commun. (2019) 1–10. doi:10.1557/mrc.2019.59.
@article{Bocklund2019ESPEI,
         archivePrefix = {arXiv},
         arxivId = {1902.01269},
         author = {Bocklund, Brandon and Otis, Richard and Egorov, Aleksei and Obaied, Abdulmonem and Roslyakova, Irina and Liu, Zi-Kui},
         doi = {10.1557/mrc.2019.59},
         eprint = {1902.01269},
         issn = {2159-6859},
         journal = {MRS Communications},
         month = {jun},
         pages = {1--10},
         title = {{ESPEI for efficient thermodynamic database development, modification, and uncertainty quantification: application to Cu–Mg}},
         year = {2019}
}
Comments
  • Compute metastable/unstable single phase driving forces in ZPF error

    Thanks to Tobias Spitaler for suggesting this and to @richardotis for brainstorming this solution concept.

    This PR introduces two new functions in ZPF error, _solve_sitefracs_composition and _sample_solution_constitution. Their purpose is to facilitate computing metastable or unstable single phase driving forces when a phase has a miscibility gap. This should improve the convergence for any phase that has a stable or metastable miscibility gap.

    Rationale

    ESPEI currently computes the "single-phase hyperplane" at a vertex by performing an equilibrium calculation at a black point and then subtracting that from the target hyperplane energy at that composition. As illustrated in the figure Tobias constructed (below), this is problematic for phases with a miscibility gap because a "single-phase" equilibrium calculation in pycalphad will always compute the global minimum energy and give two composition sets.

    [Figure: driving-force-Spitaler]

    What ESPEI should do is what Tobias illustrates by the orange x and the green driving force line. This solution ensures that minimizing the driving force will force the Gibbs energy curve to match the energy of the black points on the multi-phase target hyperplane.

    Historically, we didn't implement this because one would like to use equilibrium to minimize the internal degrees of freedom, but pycalphad always computes the global minimum energy, so it was not possible to do via equilibrium. More recently, ESPEI introduced the idea of approximate_equilibrium, which uses starting_point to more quickly determine a minimum energy solution from a discrete point sampling grid. The approximate_equilibrium method we use still has the same problem as pycalphad's equilibrium because starting_point will still give the global minimum solution for the discrete sampling.

    Solution

    In an ideal world, pycalphad should be able to turn off global minimization (automatically introducing new composition sets) and enable a condition to be set for the composition of a phase, i.e. X(BCC,B). In practice, being able to turn off global minimization and provide a valid starting point for only one composition set that has a global composition condition would simulate a phase composition condition. Unfortunately, neither turning off global minimization nor phase composition conditions is currently implemented, so we need a workaround.

    The two functions introduced here consider each single phase composition at a tie-vertex and construct a point grid that only contains points which satisfy the prescribed overall composition (and the internal phase constraints). This grid can be used in either approximate or exact equilibrium modes to find the lowest-energy starting point, which is then passed to equilibrium together with the constrained point grid so that the global minimization step has no new composition sets to introduce (i.e. it cannot detect a miscibility gap).
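
The constrained point grid described above can be sketched for a hypothetical two-sublattice phase (A,B)1(A,B)1 with equal site ratios. The function `constrained_grid` is invented for this illustration and is not the implementation of `_sample_solution_constitution`; it only shows how every grid point can be forced onto one prescribed overall composition.

```python
def constrained_grid(x_b, n_samples=11):
    """Site-fraction points for (A,B)1(A,B)1 that all give overall X(B) = x_b.

    With equal site ratios, X(B) = (y1_B + y2_B) / 2, so choosing y1_B fixes y2_B.
    Each point is (y1_A, y1_B, y2_A, y2_B).
    """
    points = []
    for i in range(n_samples):
        y1_b = i / (n_samples - 1)
        y2_b = 2.0 * x_b - y1_b
        # Internal constraint: site fractions must stay within [0, 1]
        if -1e-12 <= y2_b <= 1.0 + 1e-12:
            y2_b = min(max(y2_b, 0.0), 1.0)
            points.append((1.0 - y1_b, y1_b, 1.0 - y2_b, y2_b))
    return points

grid = constrained_grid(x_b=0.3)
for point in grid:
    overall_x_b = 0.5 * (point[1] + point[3])
    assert abs(overall_x_b - 0.3) < 1e-9
print(len(grid), "points, all at X(B) = 0.3")
```

Because every candidate point lies at the same overall composition, a minimizer restricted to this grid cannot lower the energy by splitting the phase into two compositions.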

    For performance, we pre-compute the grid of points for every phase composition in the ZPF datasets and re-use them to compute the grid, starting point and equilibrium at every parameter iteration (note that this would be invalid if a parameter changes the number of moles, like varying coordination number in the MQMQA).

    To summarize the impact:

    1. This method will be entirely backwards compatible for phases without a miscibility gap.
    2. For cases where a miscibility gap is present in the parameters, but a single phase is prescribed, there will be a driving force to eliminate the miscibility gap, so the single phase compositions are more meaningful too. This is significant because you can prescribe single phase regions in ZPF datasets and it will enforce that no miscibility gap occurs, which is not true today.
    3. For phase compositions inside a miscibility gap, the Gibbs energy curve will match the multi-phase global minimum hyperplane at the phase compositions (at convergence).
    opened by bocklund 20
  • ERROR occurred using the new development version

    Dear Administrator, some tests failed when I ran pytest after installing the new development version (2021/4/21, Beijing time). In addition, errors occur when I run example cases that ran successfully with earlier versions. Attachments: errorlog.txt, pytestfail.txt, condalist.txt

    opened by duxiaoxian 12
  • Error releasing un-acquired lock in dask

    I was using distributed (1.18.0) when this error occurred. I changed to distributed (1.16.3).

      File "/Applications/anaconda/envs/my_pycalphad/bin/espei", line 11, in <module>
        sys.exit(main())
      File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/espei/run_espei.py", line 135, in main
        mcmc_steps=args.mcmc_steps, save_interval=args.save_interval)
      File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/espei/paramselect.py", line 754, in fit
        for i, result in enumerate(sampler.sample(walkers, iterations=mcmc_steps)):
      File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/emcee/ensemble.py", line 259, in sample
        lnprob[S0])
      File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/emcee/ensemble.py", line 332, in _propose_stretch
        newlnprob, blob = self._get_lnprob(q)
      File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/emcee/ensemble.py", line 382, in _get_lnprob
        results = list(M(self.lnprobfn, [p[i] for i in range(len(p))]))
      File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/espei/utils.py", line 39, in map
        result = [x.result() for x in result]
      File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/espei/utils.py", line 39, in <listcomp>
        result = [x.result() for x in result]
      File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/distributed/client.py", line 155, in result
        six.reraise(*result)
      File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/six.py", line 685, in reraise
        raise value.with_traceback(tb)
      File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/distributed/protocol/pickle.py", line 59, in loads
        return pickle.loads(x)
    RuntimeError: cannot release un-acquired lock
    bug 
    opened by ghost 10
  • dask workers can sometimes die without warning

    I haven't been able to reproduce it consistently, but dask workers sometimes die with the dask scheduler.

    To debug this, I turned on debugging output by scheduler = LocalCluster(n_workers=cores, threads_per_worker=1, processes=True, silence_logs=verbosity[output_settings['verbosity']]).

    I am still waiting for that job to have workers die to see the output, but for now as iterations in emcee complete the results are processed in Python (it is known that this is happening because of the progress bar output). During this time, the LocalCluster debugging gives output

    distributed.core - WARNING - Event loop was unresponsive for 1.69s.  This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
    

    Usually I get two similar messages in a row.

    As another possibility, the most recent time I was able to reproduce this was when I had two instances of ESPEI running at the same time. I wouldn't think that the different client instances would interact, but maybe it should be investigated.

    opened by bocklund 6
  • Issues reproducing Cu-Mg example

    I had several issues running the Cu-Mg example from the ESPEI website. I installed ESPEI using the conda command, and took the Cu-Mg data directory from the ESPEI-datasets repository.

    I first tried reproducing the diagram from the section titled "First-principles phase diagram". The code ran successfully, but the resulting phase diagram didn't match the example well (attached image: diagram_dft).

    I then tried reproducing the results in the MCMC optimization section. I wasn't able to successfully perform the MCMC optimization. The code returned numerous errors over the course of several minutes and eventually hung with no further output.

    This file contains the full python output when I ran the optimization: espei_mcmc_error.txt

    Here is my python version and installed packages/versions: python_info.txt

    opened by npaulson 6
  • The latest version of espei = 0.7.2 gives an error when plotting

    I have recently been using the latest version, espei = 0.7.2, and I always get an error, whereas espei = 0.6 worked fine. [image]

    My current computer can no longer use espei = 0.6: I always get MPI errors with it. I don't know which version to use or what went wrong. [image]

    AG_CU_1214.zip

    opened by duxiaoxian 5
  • Run ESPEI via input files, rather than command line arguments

    A first draft and feedback were written in this gist.

    The current iteration is:

    Header area.
    Include any metadata above the `---`.
    ---
    # core run settings
    run_type: full # choose full | dft | mcmc
    phase_models: input.json
    datasets: input-datasets # path to datasets. Defaults to current directory.
    scheduler: dask # can be dask | MPIPool
    
    # control output
    verbosity: 0 # integer verbosity level 0 | 1 | 2, where 2 is most verbose.
    output_tdb: out.tdb
    tracefile: chain.npy # name of the file containing the mcmc chain array
    probfile: lnprob.npy # name of the file containing the mcmc ln probability array
    
    # the following only take effect for full or mcmc runs
    mcmc:
      mcmc_steps: 2000
      mcmc_save_interval: 100
    
      # the following take effect for only mcmc runs
      input_tdb: null # TDB file used to start the mcmc run
      restart_chain: null # restart the mcmc fitting from a previous calculation
    

    This issue will focus on the development of a first generation input file structure and spec, and also as a place to brainstorm options that should be user-facing.
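
One way to think about the spec is as a set of checks applied to the parsed input. The sketch below assumes the YAML body has already been parsed into a Python dict and validates only the options listed in the draft above. The function `validate_settings`, the default values, and the error messages are assumptions for illustration and not part of ESPEI.

```python
ALLOWED_RUN_TYPES = {"full", "dft", "mcmc"}
ALLOWED_SCHEDULERS = {"dask", "MPIPool"}

def validate_settings(settings):
    """Return a list of human-readable problems found in a settings dict."""
    problems = []
    run_type = settings.get("run_type")
    if run_type not in ALLOWED_RUN_TYPES:
        problems.append("run_type must be one of full | dft | mcmc")
    # Assumed defaults for this sketch: dask scheduler, verbosity 0
    if settings.get("scheduler", "dask") not in ALLOWED_SCHEDULERS:
        problems.append("scheduler must be dask or MPIPool")
    if settings.get("verbosity", 0) not in (0, 1, 2):
        problems.append("verbosity must be 0, 1, or 2")
    mcmc = settings.get("mcmc") or {}
    if run_type in ("full", "mcmc"):
        steps = mcmc.get("mcmc_steps")
        if not isinstance(steps, int) or steps <= 0:
            problems.append("mcmc.mcmc_steps must be a positive integer")
    if run_type != "mcmc" and mcmc.get("input_tdb") is not None:
        problems.append("mcmc.input_tdb only takes effect for mcmc runs")
    return problems

# Settings mirroring the draft input file above
settings = {
    "run_type": "full",
    "phase_models": "input.json",
    "datasets": "input-datasets",
    "scheduler": "dask",
    "verbosity": 0,
    "mcmc": {"mcmc_steps": 2000, "mcmc_save_interval": 100,
             "input_tdb": None, "restart_chain": None},
}
print(validate_settings(settings))
```

Returning a list of problems rather than raising on the first one lets a user fix every mistake in the input file in a single pass.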

    opened by bocklund 5
  • Limit the degrees of freedom for non-active phases in MCMC to prevent them from diverging?

    Phases that do not have phase equilibria data should have their parameters fixed before the MCMC run.

    A particular phase in an ESPEI run can have single phase DFT data and no phase equilibria. This means that the parameters that were calculated in the single phase fitting have no effect on the error function that is used in the MCMC run.

    When parameters have no effect on the error function, they diverge when used in emcee because the ensemble sampler scales them up to infinity in an attempt to force that parameter to affect the error function.
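
A minimal sketch of how such parameters could be detected before the MCMC run: perturb each parameter in turn and check whether the error function changes. The helper `insensitive_parameters` and the toy error function are hypothetical and not part of ESPEI. A single perturbation can also miss a parameter that happens to sit at a stationary point, so this is only a first screening.

```python
def insensitive_parameters(error_fn, params, rel_step=1e-2, tol=0.0):
    """Indices of parameters whose perturbation does not change error_fn."""
    base = error_fn(params)
    frozen = []
    for i, value in enumerate(params):
        step = rel_step * abs(value) if value != 0 else rel_step
        perturbed = list(params)
        perturbed[i] = value + step
        if abs(error_fn(perturbed) - base) <= tol:
            frozen.append(i)
    return frozen

# Hypothetical error function: only the first two parameters matter,
# as if the third belonged to a phase without phase equilibria data.
def toy_error(p):
    return -((p[0] - 1.0) ** 2) - (p[1] + 2.0) ** 2

to_fix = insensitive_parameters(toy_error, [0.5, -1.0, 300.0])
print("parameters to hold fixed:", to_fix)
```

The returned indices are the candidates to hold fixed, so that only parameters the error function actually responds to are sampled.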

    bug enhancement 
    opened by bocklund 5
  • Error when running Cu-Mg example

    Hello, I am trying to run ESPEI for the first time.

    I created a conda env and installed ESPEI using conda. I downloaded json and yaml files as well as the contents of the Cu-Mg folder in ESPEI-datasets, renamed it to input-data. After running espei --input espei-in.yaml, I get the errors below. Could you please let me know if I am doing anything wrong?

    Thanks!

    Traceback (most recent call last):
      File "/Users/latmarat/miniforge3/envs/espenv/bin/espei", line 10, in <module>
        sys.exit(main())
      File "/Users/latmarat/miniforge3/envs/espenv/lib/python3.10/site-packages/espei/espei_script.py", line 307, in main
        run_espei(input_settings)
      File "/Users/latmarat/miniforge3/envs/espenv/lib/python3.10/site-packages/espei/espei_script.py", line 177, in run_espei
        dbf = generate_parameters(phase_models, datasets, refdata, excess_model,
      File "/Users/latmarat/miniforge3/envs/espenv/lib/python3.10/site-packages/espei/paramselect.py", line 517, in generate_parameters
        aliases = extract_aliases(phase_models)
      File "/Users/latmarat/miniforge3/envs/espenv/lib/python3.10/site-packages/espei/utils.py", line 370, in extract_aliases
        aliases = {phase_name: phase_name for phase_name in phase_models["phases"].keys()}
    AttributeError: 'list' object has no attribute 'keys'
    
    opened by latmarat 4
  • AttributeError: 'NoneType' object has no attribute 'values'

    Dear Administrator, an 'AttributeError' occurred when I ran 'espei --input espei-in-2.yaml' using the latest development version of ESPEI. Would you mind helping me check my dataset? Thanks. Attachments: errorprint-log.txt, verbosity-log.txt, CO-CU-20201104.zip

    f:\users\zhang\pycalphad\pycalphad\codegen\callables.py:97: UserWarning: State variables in build_callables are not {N, P, T}, but {T, P}. This can lead to incorrectly calculated values if the state variables used to call the generated functions do not match the state variables used to create them. State variables can be added with the additional_statevars argument.
      "additional_statevars argument.".format(state_variables))
    Traceback (most recent call last):
      File "F:\Users\zhang\Anaconda32020\envs\espei2020test\Scripts\espei-script.py", line 33, in <module>
        sys.exit(load_entry_point('espei', 'console_scripts', 'espei')())
      File "f:\users\zhang\espei\espei\espei_script.py", line 311, in main
        run_espei(input_settings)
      File "f:\users\zhang\espei\espei\espei_script.py", line 260, in run_espei
        approximate_equilibrium=approximate_equilibrium,
      File "f:\users\zhang\espei\espei\optimizers\opt_base.py", line 36, in fit
        node = self.fit(symbols, datasets, *args, **kwargs)
      File "f:\users\zhang\espei\espei\optimizers\opt_mcmc.py", line 238, in fit
        self.predict(initial_guess, **ctx)
      File "f:\users\zhang\espei\espei\optimizers\opt_mcmc.py", line 289, in predict
        multi_phase_error = calculate_zpf_error(parameters=np.array(params), **zpf_kwargs)
      File "f:\users\zhang\espei\espei\error_functions\zpf_error.py", line 315, in calculate_zpf_error
        target_hyperplane = estimate_hyperplane(phase_region, parameters, approximate_equilibrium=approximate_equilibrium)
      File "f:\users\zhang\espei\espei\error_functions\zpf_error.py", line 186, in estimate_hyperplane
        grid = calculate(dbf, species, phases, str_statevar_dict, models, phase_records, pdens=500, fake_points=True)
      File "f:\users\zhang\espei\espei\shadow_functions.py", line 55, in calculate
        largest_energy=float(1e10), fake_points=fp)
      File "f:\users\zhang\pycalphad\pycalphad\core\calculate.py", line 190, in _compute_phase_values
        param_symbols, parameter_array = extract_parameters(parameters)
      File "f:\users\zhang\pycalphad\pycalphad\core\utils.py", line 361, in extract_parameters
        parameter_array_lengths = set(np.atleast_1d(val).size for val in parameters.values())
    AttributeError: 'NoneType' object has no attribute 'values'

    opened by duxiaoxian 4
  • Migrate pycalphad refdata to ESPEI

    Tracking from https://github.com/pycalphad/pycalphad/issues/120

    Assume that SGTE91Stable is correct per https://github.com/pycalphad/pycalphad/issues/120. Then we must

    • [x] Remove the metastable phases not present in the SGTE91 original paper
    • [ ] Check that remaining phases have correct descriptions
    opened by bocklund 4
  • MCMC Initialized chains should include initial point

    During the initialization of the chains for the MCMC optimizer, a Gaussian distribution about an initial point is taken. https://github.com/PhasesResearchLab/ESPEI/blob/7c797191d4c3178fe4a22275bbaee9c2977786ad/espei/optimizers/opt_mcmc.py#L98

    I would suggest including the initial point in that set of initial chains. If everything is set up correctly, this won't matter, but for cases where the standard deviation is too high while the initial guess is quite good, the current behavior will lead to a lot of bad starting points. Modifying the initial set to include the initial guess point should ensure that at least this state (or acceptable permutations of it) will survive the MCMC run. What do you think?
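
The suggestion can be sketched as follows. This is a hypothetical stand-alone helper, not the code at the linked line of opt_mcmc.py; the name `initialize_walkers` and the choice to scale the standard deviation by a fraction of each parameter's magnitude are assumptions for illustration.

```python
import random

def initialize_walkers(initial_point, n_walkers, std_fraction=0.1, seed=None):
    """Gaussian ball of walkers around initial_point; walker 0 is the point itself."""
    rng = random.Random(seed)
    walkers = [list(initial_point)]  # keep the unperturbed initial guess
    for _ in range(n_walkers - 1):
        walkers.append([rng.gauss(p, std_fraction * abs(p)) for p in initial_point])
    return walkers

walkers = initialize_walkers([-10000.0, 5.0], n_walkers=8, seed=1)
print(walkers[0])
```

Even if `std_fraction` is chosen far too large, the first walker still starts exactly at the initial guess.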

    opened by toastedcrumpets 0
  • formatted_parameter broken by SymEngine

    Switching the symbolic backend to SymEngine broke espei.utils.formatted_parameter. Here's a test to validate (run from the tests directory for the testing_data module to be importable).

    # espei/tests/test_utils.py
    
    from pycalphad import Database
    from espei.utils import formatted_parameter, database_symbols_to_fit
    from .testing_data import CU_MG_TDB
    def test_cu_mg_parameters_can_be_formatted_to_strings():
        """Formating parameters should work for common variables parameters"""
        dbf = Database(CU_MG_TDB)
        for sym in database_symbols_to_fit(dbf):
            assert isinstance(formatted_parameter(dbf, sym), str), f"Formatted parameter for symbol {sym} (value = {dbf.symbols[sym]}) in database not a string"
    

    Running this gives an error:

    Traceback (most recent call last):
      File "/Users/bocklund1/src/calphad/espei/tests/dummy.py", line 11, in <module>
        test_cu_mg_parameters_can_be_formatted_to_strings()
      File "/Users/bocklund1/src/calphad/espei/tests/dummy.py", line 9, in test_cu_mg_parameters_can_be_formatted_to_strings
        assert isinstance(formatted_parameter(dbf, sym), str), f"Formatted parameter for symbol {sym} (value = {dbf.symbols[sym]}) in database not a string"
      File "/Users/bocklund1/src/calphad/espei/espei/utils.py", line 295, in formatted_parameter
        term = parameter_term(result['parameter'], symbol)
      File "/Users/bocklund1/src/calphad/espei/espei/utils.py", line 218, in parameter_term
        coeff, root = term_coeff.as_coeff_mul(symbol)
    AttributeError: 'symengine.lib.symengine_wrapper.Symbol' object has no attribute 'as_coeff_mul'
    

    I think the breakage might be because espei.utils.parameter_term isn't correctly picking up the first condition, since for the case of symbol being a symengine.lib.symengine_wrapper.Symbol, I think expression == symbol should evaluate to true, but evidently (via the traceback) it is evaluating to false.

    opened by bocklund 0
  • Memory leak when running MCMC in parallel

    Due to a known memory leak when instantiating subclasses of SymEngine's Symbol objects (SymEngine is one of our upstream dependencies; see https://github.com/symengine/symengine.py/issues/379), running ESPEI with parallelization will cause memory to grow in each worker.

    Only running in parallel will trigger significant memory growth, because running in parallel uses the pickle library to serialize and deserialize symbol objects and create new objects that can't be freed. When running without parallelization (mcmc.scheduler: null), new symbols are not created.

    Until https://github.com/symengine/symengine.py/issues/379 is fixed, some mitigation strategies to avoid running out of memory are:

    • Run ESPEI without parallelization by setting scheduler: null
    • (Under consideration to implement): when parallelization is active, use an option to restart the workers every N iterations.
    • (Under consideration to implement): remove Model objects from the keyword arguments of ESPEI's likelihood functions. Model objects contribute a lot of symbol instances in the form of v.SiteFraction objects. We should be able to get away with only using PhaseRecord objects, but there are a few places that use Model.constituents to infer the sublattice model and internal degrees of freedom, and those would need to be rewritten.
    opened by bocklund 1
  • Unable to use activity data in binary Fe-C with Graphite as reference state

    Hi,

    We are currently trying to use activity data for Fe-C. Lobo1976 measured the activity of C in alpha-iron relative to graphite as the standard state, but we get erroneous results. (Lobo, Joseph A., and Gordon H. Geiger. "Thermodynamics and solubility of carbon in ferrite and ferritic Fe-Mo alloys." Metallurgical Transactions A 7.8 (1976): 1347-1357.)

    I have added the input file below. With this input file, we get chemical potential difference: [nan] (verbosity 3 output). Is the input file correct or are we missing something? I have had a look at the value of ref_result within the activity_error.py and this does give only nan results for the specified reference state. Graphite only has C as a component. An equilibrium calculation of Graphite specifying x.V('C') gives an error as Number of dependent components different from one. Can this cause an error here as well? Used versions: espei: 0.8.6 and pycalphad 0.9.2. I have added a zip-file with the TDB file and espei input files which reproduces this behaviour.

    Thank you for your help, Tobias

    {
            "components": ["FE", "C", "VA"],
            "phases": ["BCC_A2", "GRAPHITE"],
            "weight": 1000,
            "reference_state": {
                    "phases": ["GRAPHITE"],
                    "conditions": {
                            "P": 101325,
                            "T": 1056.15,
                            "X_C": 1
    
                    }
            },
            "conditions": {
                    "P": 101325,
                    "T": 1056.15,
                    "X_C": [0.00013017]
            },
            "output": "ACR_C",
            "values": [[[0.087]]
                    ],
            "reference": "Lobo1976_1056K",
            "meta_data": {
                    "DOI": "10.1007/BF02658820",
                    "literature reference": "Thermodynamics and Solubility of Carbon in Ferrite and Ferritic Fe-Mo Alloys",
                    "table/figure": "table 1",
                    "measured data": "C-activity in Alpha-Iron",
                    "experimental details": "not available",
                    "weight": "default"
            }
    }
    

    minimal_example.zip
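
For context, the textbook relation between an activity and the chemical potential difference to the reference state is a = exp((mu - mu_ref) / (R*T)). The sketch below evaluates it with plain Python; the function names are invented for illustration and this is not the code in activity_error.py.

```python
import math

R = 8.314462618  # gas constant, J/(mol K)

def activity(mu, mu_ref, temperature):
    """Activity from chemical potentials (J/mol) relative to a reference state."""
    return math.exp((mu - mu_ref) / (R * temperature))

def chemical_potential_difference(acr, temperature):
    """Inverse relation: mu - mu_ref = R*T*ln(a)."""
    return R * temperature * math.log(acr)

T = 1056.15
delta_mu = chemical_potential_difference(0.087, T)
print(f"mu_C - mu_C(ref) = {delta_mu:.0f} J/mol")

# A NaN reference chemical potential propagates straight into the result
print(activity(0.0, float("nan"), T))
```

The last line shows why a reference-state calculation that returns NaN makes the whole chemical potential difference NaN, regardless of the value computed for the BCC_A2 phase.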

    opened by tobiasspt 1
  • ENH: Allow multiple datasets directories to be specified in YAML input

    Sometimes it is useful to load datasets from different filesystem locations, for example if one folder contains hand-curated data and another contains automatically generated data.

    In code, it would be pretty simple to handle this. Instead of

    from espei.datasets import load_datasets, recursive_glob
    directory = '/path/to/directory/'
    load_datasets(recursive_glob(directory))
    

    we could do

    from itertools import chain
    from espei.datasets import load_datasets, recursive_glob
    directories = ['/path/to/directory_1/', '/path/to/directory_2/']
    load_datasets(chain(*map(recursive_glob, directories)))
    
    opened by bocklund 1
Releases(0.8.9)
Owner
Phases Research Lab
Research group led by Dr. Zi-Kui Liu at The Pennsylvania State University.