Statsmodels: statistical modeling and econometrics in Python

Overview

About statsmodels

statsmodels is a Python package that provides a complement to scipy for statistical computations including descriptive statistics and estimation and inference for statistical models.

Documentation

The documentation for the latest release is at

https://www.statsmodels.org/stable/

The documentation for the development version is at

https://www.statsmodels.org/dev/

Recent improvements are highlighted in the release notes

https://www.statsmodels.org/stable/release/version0.9.html

Backups of documentation are available at https://statsmodels.github.io/stable/ and https://statsmodels.github.io/dev/.

Main Features

  • Linear regression models:
    • Ordinary least squares
    • Generalized least squares
    • Weighted least squares
    • Least squares with autoregressive errors
    • Quantile regression
    • Recursive least squares
  • Mixed Linear Model with mixed effects and variance components
  • GLM: Generalized linear models with support for all of the one-parameter exponential family distributions
  • Bayesian Mixed GLM for Binomial and Poisson
  • GEE: Generalized Estimating Equations for one-way clustered or longitudinal data
  • Discrete models:
    • Logit and Probit
    • Multinomial logit (MNLogit)
    • Poisson and Generalized Poisson regression
    • Negative Binomial regression
    • Zero-Inflated Count models
  • RLM: Robust linear models with support for several M-estimators.
  • Time Series Analysis: models for time series analysis
    • Complete StateSpace modeling framework
      • Seasonal ARIMA and ARIMAX models
      • VARMA and VARMAX models
      • Dynamic Factor models
      • Unobserved Component models
    • Markov switching models (MSAR), also known as Hidden Markov Models (HMM)
    • Univariate time series analysis: AR, ARIMA
    • Vector autoregressive models, VAR and structural VAR
    • Vector error correction model, VECM
    • Exponential smoothing, Holt-Winters
    • Hypothesis tests for time series: unit root, cointegration and others
    • Descriptive statistics and process models for time series analysis
  • Survival analysis:
    • Proportional hazards regression (Cox models)
    • Survivor function estimation (Kaplan-Meier)
    • Cumulative incidence function estimation
  • Multivariate:
    • Principal Component Analysis with missing data
    • Factor Analysis with rotation
    • MANOVA
    • Canonical Correlation
  • Nonparametric statistics: Univariate and multivariate kernel density estimators
  • Datasets: Datasets used for examples and in testing
  • Statistics: a wide range of statistical tests
    • diagnostics and specification tests
    • goodness-of-fit and normality tests
    • functions for multiple testing
    • various additional statistical tests
  • Imputation with MICE, regression on order statistic and Gaussian imputation
  • Mediation analysis
  • Graphics includes plot functions for visual analysis of data and model results
  • I/O
    • Tools for reading Stata .dta files, but pandas has a more recent version
    • Table output to ascii, latex, and html
  • Miscellaneous models
  • Sandbox: statsmodels contains a sandbox folder with code in various stages of development and testing which is not considered "production ready". This covers among others
    • Generalized method of moments (GMM) estimators
    • Kernel regression
    • Various extensions to scipy.stats.distributions
    • Panel data models
    • Information theoretic measures

How to get it

The master branch on GitHub has the most up-to-date code

https://www.github.com/statsmodels/statsmodels

Source downloads of release tags are available on GitHub

https://github.com/statsmodels/statsmodels/tags

Binaries and source distributions are available from PyPI

https://pypi.org/project/statsmodels/

Binaries can be installed in Anaconda

conda install statsmodels

Installing from sources

See INSTALL.txt for requirements or see the documentation

https://statsmodels.github.io/dev/install.html

Contributing

Contributions in any form are welcome, including:

  • Documentation improvements
  • Additional tests
  • New features to existing models
  • New models

See

https://www.statsmodels.org/stable/dev/test_notes

for instructions on installing statsmodels in editable mode.

License

Modified BSD (3-clause)

Discussion and Development

Discussions take place on the mailing list

https://groups.google.com/group/pystatsmodels

and in the issue tracker. We are very interested in feedback about usability and suggestions for improvements.

Bug Reports

Bug reports can be submitted to the issue tracker at

https://github.com/statsmodels/statsmodels/issues

Issues
  • Gam gsoc2015

    @josef-pkt I am starting a PR. At the moment there is the gam file, which contains the gam penalty class. Smooth_basis contains some functions to get B-spline and polynomial bases. This file will be removed once we are able to use patsy directly.

    There are also 2 files that contain examples or small scripts. We will remove them later.

    Let me know what you think about that.

    Todo

    • [ ] predict errors (stateful transform, patsy ?), Note fittedvalues are available only a problem in pirls example, predict after fit works, requires spline basis values
    • [ ] get_prediction also errors, maybe consequence of predict error currently errors because weights is None
    • [ ] check test coverage for offset and exposure

    Interface

    • [ ] partial_values plot_partial has inconvenient arguments (smoother and mask) instead of column index or term name or similar
    • [ ] formula-like interface for predict (create spline basis values internally)
    • [ ] adjust inherited methods like plot_partial_residuals (after GSOC?)
    • [ ] default param names for splines (should be more informative than "xi" (i is range(k)), but less (?) verbose than patsy's), need them for test_terms
    type-enh comp-base comp-genmod 
    opened by DonBeo 232
  • Multivariate Kalman Filter

    Here's a simple branch with the code added into statsmodels.tsa.statespace. A couple of thoughts

    • At least in the dev process, I thought it might be nicer to keep it in its own module, rather than putting it with the kalmanf. I don't know what makes the most sense in the long run.
    • I have unit tests that rely on the statespace model, but I'm rewriting them so the KF pull request can be done on its own without other dependencies, especially since the statespace model is likely to change.

    Edit: Original line comments.

    L72

    Question: do we want to keep the single-precision version (and the complex single precision, below)? I don't really see a use case, and it appears from preliminary tests that the results tend to overflow. Maybe I'll post a unit test to demonstrate and we can go from there.

    L332

    Question: there are a bunch of ways to initialize the KF, depending on the theory underlying the data. This one is only valid for stationary processes. Probably best to move out to the Python wrapper?

    L444 Question: I think we'll want to add the ability to specify whether or not to check for convergence and alter the tolerance.

    L414

    This inversion is using an LU decomposition, but I think in this case I can rely on f to be positive definite since it's the covariance matrix of the forecast error, in which case I could use the much faster Cholesky decomposition approach. This is something I'm looking into, but if you happen to know one way or the other, that would be great too.

    Related to this is the idea that you should "never take inverses", and I guess I need to look into replacing this with a linear solver routine if possible, in the updating stage below.
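
The trade-off described above (explicit inverse versus Cholesky factorization versus a linear solver) can be sketched with NumPy. This is only an illustration, not the code from the pull request: the matrix `F` and vector `v` are made-up stand-ins for a forecast-error covariance and a forecast error, and `F` is constructed to be positive definite.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical forecast-error covariance: symmetric positive definite by construction.
A = rng.normal(size=(4, 4))
F = A @ A.T + 4.0 * np.eye(4)
v = rng.normal(size=4)  # hypothetical forecast error

# Explicit inverse: the approach the comment wants to move away from.
x_inv = np.linalg.inv(F) @ v

# Cholesky route: factor F = L L^T, then solve L z = v and L^T x = z.
# np.linalg.solve is a general solver and does not exploit the triangular
# structure of L; it is used here only to keep the sketch NumPy-only.
L = np.linalg.cholesky(F)
z = np.linalg.solve(L, v)
x_chol = np.linalg.solve(L.T, z)

# General linear solve of F x = v, with no inverse formed.
x_solve = np.linalg.solve(F, v)

print(np.allclose(x_inv, x_chol), np.allclose(x_inv, x_solve))
```

All three routes give the same vector up to rounding; they differ in cost and numerical behavior, and `np.linalg.cholesky` raises `LinAlgError` when the matrix is not positive definite.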

    comp-tsa type-enh 
    opened by ChadFulton 213
  • New kernel_methods module for new KDE implementation

    This is a new version of the kernel density estimation. The purpose is to provide an implementation that is faster in the case of grid evaluation, and also works on bounded domains.

    There is still some work to do, in particular, I need to add tests for the multi-dimensional and discrete densities.

    type-enh comp-nonparametric type-refactor 
    opened by PierreBdR 183
  • GSOC2017 Zero-Inflated models

    This PR includes the following models:

    1. Generic Zero Inflated model
    2. Zero-Inflated Poisson model
    3. Zero-Inflated Generalized Poisson model (ZIGP-P)
    4. Zero-Inflated Generalized Negative Binomial model (ZINB-P)

    Each model includes these parts:

    • [x] LLF
    • [x] Score
    • [x] Hessian
    • [x] Predict
    • [x] Fit
    • [x] Docs
    • [x] Tests

    Status: reviewing; need to implement a better way to generate start_params.

    Last commit for end of GSoC17: Changed way to find start params

    type-enh comp-discrete 
    opened by evgenyzhurko 170
  • GSOC2017 Generalized Poisson (GP-P) model

    This PR includes an implementation of the Generalized Poisson model. The model includes these parts:

    • [x] Log-likelihood function
    • [x] Score
    • [x] Hessian
    • [x] Fit
    • [x] Result
    • [x] Tests
    • [x] Docs

    Status - merged #3795

    rejected 
    opened by evgenyzhurko 149
  • GSoC 2016: State-Space Models with Markov Switching

    Hi, I have started implementing the Kim filter and outlined some basic functionality, as described in the Kim-Nelson book (see the diagram on p. 105). I haven't even run the code yet to check for errors. Coding style and class interface bother me more for the moment, as well as the possible ways to test it without implementing models.

    comp-tsa type-enh 
    opened by ValeryTyumen 129
  • ENH: Revise loglike/deviance to correctly handle scale

    xref #3773

    I apologize for this taking longer than I had hoped, but my busy period was exacerbated by some unforeseen events. But I'm somewhat back now.

    This is the first step to solving the above issue.

    Still a lot is missing... I've only really gotten loglike to work for all the families except Gamma, Binomial, Gaussian, and Tweedie (which has never had loglike). I also need to re-work the docstrings. I'm thinking it's more logical to have a loglike_obs function called in each family and then have a loglike function in the Family class so that loglike will simply be inherited. I think the docstrings could be elegantly written to handle this too.

    I'm thinking I will work on deviance next and then circle back to loglike... I feel like R takes some computational shortcuts...

    type-enh comp-genmod type-refactor 
    opened by thequackdaddy 125
  • WIP/ENH Gam 2744 rebased2

    rebased version of #4575 which was rebased version of #2744 The original GSOC PR with most of the development discussion is #2435

    rebase conflict in compat.python: This has now unneeded itertools.combinations import but I dropped the py 2.6 compat code.

    type-enh comp-genmod 
    opened by josef-pkt 104
  • ENH: Add var_weights to GLM

    Hello,

    xref #3371

    This (should) get var_weights to work for GLM. I added a Tweedie usecase against R.

    I can't seem to get the Poisson with var_weights to HC0 test that was added (but disabled because of the lack of functionality) to work. The test is here:

    https://github.com/thequackdaddy/statsmodels/blob/var_weight/statsmodels/genmod/tests/test_glm_weights.py#L142

    It fails on the following assert

    assert_allclose(res1.bse, corr_fact * res2.bse, atol= 1e-6, rtol=2e-6)

    It's relatively close... consistently off by a factor of 0.98574... corr_fact brings it much closer... I'm wondering if another adjustment is necessary?
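
To see why a discrepancy of that size trips the assert, recall that `numpy.testing.assert_allclose` passes only when `|actual - desired| <= atol + rtol * |desired|` holds elementwise. The sketch below uses made-up standard errors and applies the quoted factor directly; it is an illustration of the tolerance rule, not a reproduction of the failing test.

```python
import numpy as np
from numpy.testing import assert_allclose

desired = np.array([0.10, 0.25, 0.50])  # made-up standard errors
actual = 0.98574 * desired              # scaled by the factor quoted above

# Tolerances copied from the assert in the test.
atol, rtol = 1e-6, 2e-6
allowed = atol + rtol * np.abs(desired)  # largest gap that would still pass
gap = np.abs(actual - desired)           # actual elementwise gap

try:
    assert_allclose(actual, desired, atol=atol, rtol=rtol)
    passed = True
except AssertionError:
    passed = False

print(gap / allowed)  # how many times larger the gap is than the tolerance
print(passed)
```

A relative error of roughly 1.4 percent is several orders of magnitude above these tolerances, so loosening `rtol` would only hide the difference rather than explain it.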

    Honestly, I don't really understand much about sandwiches (except PBJ, meatball, and bánh mì).

    The rest of the tests seem to work pretty well.

    I'd be happy to have this merged relatively soon, so thanks for the feedback and review!

    type-enh comp-genmod topic-weights 
    opened by thequackdaddy 101
  • [MRG] Add MANOVA class

    PR as discussed in #3274. Tested with a SAS example and two R examples, produced the same results.

    To-do:

    • [X] Core stats computation
    • [x] api
    • [x] automatic create dummy variable and hypothesis testing for categorical type independent variables.
    • [x] Input validation
    • [x] More examples to be tested

    References:

    [1] https://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_introreg_sect012.htm
    [2] GLM Algorithms ftp://public.dhe.ibm.com/software/analytics/spss/documentation/statistics/20.0/en/client/Manuals/IBM_SPSS_Statistics_Algorithms.pdf
    [3] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.278.6976&rep=rep1&type=pdf

    Unit test in test_MultivariateOLS.py

    compare_r_output_dogs_data()

    It reproduces results from the following R code:

    library(car)
    Drug = c('Morphine', 'Morphine', 'Morphine', 'Morphine', 'Morphine', 'placebo', 'placebo', 'placebo', 'placebo', 'placebo', 'Trimethaphan', 'Trimethaphan', 'Trimethaphan', 'Trimethaphan', 'Trimethaphan')
    Depleted = c('N', 'N', 'N', 'N', 'Y', 'Y', 'Y', 'N', 'N', 'N', 'N', 'Y', 'Y', 'Y', 'Y')
    subject  = c(1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16)
    Histamine0 = c(-3.218876, -3.912023, -2.65926, -1.771957, -2.302585, -2.65926, -2.995732, -3.506558, -3.506558, -2.65926, -2.407946, -2.302585, -2.525729, -2.040221, -2.813411)
    Histamine1 = c(-1.609438, -2.813411, 0.336472, -0.562119, -2.407946, -2.65926, -2.65926, -0.478036, 0.04879, -0.18633, 1.141033, -2.407946, -2.407946, -2.302585, -2.995732)
    Histamine3 = c(-2.302585, -3.912023, -0.733969, -1.049822, -2.040221, -2.813411, -2.813411, -1.171183, -0.314711, 0.067659, 0.722706, -2.407946, -2.407946, -2.120264, -2.995732)
    Histamine5 = c(-2.525729, -3.912023, -1.427116, -1.427116, -1.966113, -2.65926, -2.65926, -1.514128, -0.510826, -0.223144, 0.207014, -2.525729, -2.302585, -2.120264, -2.995732)
    data = data.frame(Histamine0, Histamine1, Histamine3, Histamine5)
    hismat = as.matrix(data[,1:4])
    result = lm(hismat ~ Drug * Depleted)
    linearHypothesis(result, c(1, 0, 0, 0, 0, 0)) 
    linearHypothesis(result, t(cbind(c(0, 1, 0, 0, 0, 0), c(0, 0, 1, 0, 0, 0))))
    linearHypothesis(result, c(0, 0, 0, 1, 0, 0)) 
    linearHypothesis(result, t(cbind(c(0, 0, 0, 0, 1, 0), c(0, 0, 0, 0, 0, 1))))
    # Or ManRes <- Manova(result, type="III")
    

    test_affine_hypothesis()

    It reproduces results from the following R code:

    result = lm(hismat ~ Drug*Depleted)
    fml = t(cbind(c(0, 1.1, 1.2, 1.3, 1.4, 1.5), c(0, 2.1, 3.2, 3.3, 4.4, 5.5)))
    linearHypothesis(result, fml, t(cbind(c(1, 2, 3, 4), c(5, 6, 7, 8))), verbose=TRUE)
    
    type-enh comp-multivariate 
    opened by yl565 101
  • Distributed estimation

    Ok, I wanted to make a PR for this, there are still a couple of things that I need to fix but things are pretty close to done and I'm happy with the current approach.

    The key part of the current approach is a function distributed_estimation. This function works by taking a generator for endog and exog, endog_generator and exog_generator, as well as a series of functions and key word arguments to be run on each machine and then used to recombine the results. The generator approach allows for a variety of use cases and can handle a lot of the ideas discussed in the initial proposal. For each data set yielded by the generators a model is initialized using model_class and init_kwds and then the function estimation_method is applied to the model along with the key words fit_kwds and estimation_kwds. Finally, the results are recombined from each data set using join_method.

    Currently, this defaults to the distributed regularized approach discussed here:

    http://arxiv.org/pdf/1503.04337v3.pdf

    but the way I've set things up means that the user should be able to apply any number of procedures here.
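
The generator-based flow described above can be sketched in plain Python. This is a simplified illustration, not the function from the pull request: `distributed_estimation_sketch`, `chunk_generator`, `fit_slope`, and `average_join` are hypothetical names, the model class and keyword-argument plumbing (`model_class`, `init_kwds`, `fit_kwds`, `estimation_kwds`) are left out, and the join step is a plain average rather than the debiased regularized procedure from the linked paper.

```python
from statistics import fmean


def chunk_generator(values, n_chunks):
    """Yield consecutive, roughly equal slices of a list (hypothetical helper)."""
    size = -(-len(values) // n_chunks)  # ceiling division
    for start in range(0, len(values), size):
        yield values[start:start + size]


def fit_slope(endog, exog):
    """Per-chunk estimator: least-squares slope through the origin."""
    return sum(x * y for x, y in zip(exog, endog)) / sum(x * x for x in exog)


def average_join(estimates):
    """Join step: plain average of the per-chunk estimates."""
    return fmean(estimates)


def distributed_estimation_sketch(endog_generator, exog_generator,
                                  estimation_method, join_method):
    """Run estimation_method on each yielded data set, then recombine."""
    estimates = [estimation_method(endog, exog)
                 for endog, exog in zip(endog_generator, exog_generator)]
    return join_method(estimates)


# Noiseless toy data with slope 3, split into three chunks.
exog = [float(i) for i in range(1, 13)]
endog = [3.0 * x for x in exog]

slope = distributed_estimation_sketch(
    chunk_generator(endog, 3), chunk_generator(exog, 3),
    fit_slope, average_join,
)
print(slope)
```

Because the estimator and the join step are passed in as functions, swapping in a different per-chunk procedure or recombination rule does not require changing the driver.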

    The current todo list:

    • [x] Fix hess_obs
    • [x] Fix joblib fit
    • [x] Add dask fit
    • [x] Add WLS/GLS for debiased regularized fit
    • [x] Add likelihood result
    • [x] Add data tests

    Let me know if there are any comments/questions/criticisms, as I've said, it certainly isn't complete but I wanted to get this out there so I can start integrating any changes as I finish it up.

    type-enh comp-base comp-genmod comp-regression 
    opened by lbybee 85
  • ENH: Functions for summarizing multiple model fitting results.

    • [ ] closes (None)
    • [ ] tests (not tested)
    • [x] code/documentation is well formatted.
    • [x] properly formatted commit message. See NumPy's guide.

    There is an existing question about how to handle multiple summaries with summary_col (Python: Do not show dummies in statsmodels summary).

    However, I thought more full-featured functions for handling multiple fitting results were needed, so I implemented a simple one that just concatenates the summaries horizontally, and another version that lets a mosaic variable control the layout of the output.

    See the link below for practical usage. https://github.com/toshiakiasakura/statsmodels/blob/summary_multi/examples/notebooks/summary_multi_usage.ipynb

    If there are further suggestions or improvements, or if I need to open an issue, please tell me.

    opened by toshiakiasakura 2
  • DOC/REF/ENH: follow-up to merging treatment effect

    I merged treatment effect #8034 essentially in the version where I stopped 4 months ago

    Docs still need a lot of work, and we need a notebook (the unit test has a full example). A simple example in the class docstring would also be helpful. The current docs are not very informative: there are no explanations or details (users would have to read the Stata docs). The TreatmentEffectResults class has no docstring.

    The docs don't indicate that only OLS is supported as the outcome model. We need to warn or raise for current limitations. Current unit tests are OLS + Probit. I guess GLM-Binomial is not supported for the treatment model, only Logit and Probit. ???

    from_data method is not implemented

    type-enh comp-docs comp-treatment 
    opened by josef-pkt 3
  • statsmodels.distributions.edgeworth.ExpandedNormal - 4th cumulant higher than 4

    Line 166 of edgeworth.py checks whether the imaginary part is zero and whether abs(r) is smaller than 4. If I choose a fourth cumulant higher than 4, I get this warning, and indeed in this case the imaginary parts are zero and abs(r) is smaller than 4. But where does this limit of 4 come from? Is it based on a paper? By the way, I assume that the first cumulant is zero and the second is 1 (scaling and centering in lines 163 and 164). I use statsmodels 0.13.2.

    comp-distributions 
    opened by cube2022 7
  • MAINT: Deprecate some links

    Deprecate lower case links in favor of upper case links

    • [ ] closes #xxxx
    • [ ] tests added / passed.
    • [ ] code/documentation is well formatted.
    • [ ] properly formatted commit message. See NumPy's guide.

    Notes:

    • It is essential that you add a test when making code changes. Tests are not needed for doc changes.
    • When adding a new function, test values should usually be verified in another package (e.g., R/SAS/Stata).
    • When fixing a bug, you must add a test that would produce the bug in main and then show that it is fixed with the new code.
    • New code additions must be well formatted. Changes should pass flake8. If on Linux or OSX, you can verify your changes are well formatted by running
      git diff upstream/main -u -- "*.py" | flake8 --diff --isolated
      

      assuming flake8 is installed. This command is also available on Windows using the Windows Subsystem for Linux once flake8 is installed in the local Linux environment. While passing this test is not required, it is good practice and it helps improve code quality in statsmodels.

    • Docstring additions must render correctly, including escapes and LaTeX.
    opened by bashtage 2
  • REF: deprecate lower case link classes, duplicates that differ only in capitalization

    We have many links with lower case names that are duplicate, aliases of link classes with capitalized class names.

    We should deprecate them with longer deprecation cycle given that both types of names are commonly used. Removing names that only differ in capitalization, will remove many doc build warnings.

    The problem is that the user provides a link instance, so we cannot just switch link names internally in GLM, GEE. We need to figure out where to put the deprecation warning. (I haven't checked yet.)
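
One possible placement for such a warning is the alias class's constructor, since that is the point where the user creates the link instance. The sketch below uses hypothetical stand-in classes `Log` and `log`; it is not the statsmodels implementation, and the warning category and message are assumptions chosen for illustration.

```python
import math
import warnings


class Log:
    """Stand-in for a capitalized link class (hypothetical, not statsmodels code)."""

    def __call__(self, p):
        return math.log(p)


class log(Log):
    """Lower-case alias kept for backwards compatibility."""

    def __init__(self):
        # Warn at instantiation, because users pass link instances to the model.
        warnings.warn(
            "The log link alias is deprecated; use Log instead.",
            FutureWarning,
            stacklevel=2,
        )
        super().__init__()


# Demonstrate that the alias still works but emits a warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    link = log()

print(isinstance(link, Log))
print([str(w.message) for w in caught])
```

Because the alias subclasses the capitalized class, existing `isinstance` checks keep working throughout the deprecation cycle.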

    comp-genmod type-refactor backwards-incompat 
    opened by josef-pkt 0
Releases(v0.13.2)
  • v0.13.2(Feb 8, 2022)

    The statsmodels developers are happy to announce the bugfix release for the 0.13 branch. This release fixes 10 bugs and provides protection against changes in recent versions of upstream packages.

    Source code(tar.gz)
    Source code(zip)
  • v0.13.1(Nov 12, 2021)

    The statsmodels developers are happy to announce the bug fix release for the 0.13 branch. This release fixes 8 bugs and brings initial support for Python 3.10.

    Source code(tar.gz)
    Source code(zip)
  • v0.13.0(Oct 1, 2021)

    The statsmodels developers are happy to announce release 0.13.0. 227 issues were closed in this release and 143 pull requests were merged. Major new features include:

    • Autoregressive Distributed Lag models
    • Copulas
    • Ordered Models (Ordinal Regression)
    • Beta Regression
    • Improvements to ARIMA estimation options
    Source code(tar.gz)
    Source code(zip)
  • v0.13.0rc0(Sep 17, 2021)

    The statsmodels developers are happy to announce the first release candidate for 0.13.0. 227 issues were closed in this release and 143 pull requests were merged. Major new features include:

    • Autoregressive Distributed Lag models
    • Copulas
    • Ordered Models (Ordinal Regression)
    • Beta Regression
    • Improvements to ARIMA estimation options
    Source code(tar.gz)
    Source code(zip)
  • v0.12.2(Feb 2, 2021)

    This is a bug-fix release from the 0.12.x branch. Users are encouraged to upgrade.

    Notable changes include fixes for a bug that could lead to incorrect results in forecasts with the new ARIMA model (when d > 0 and trend='t') and a bug in the LM test for autocorrelation.

    Source code(tar.gz)
    Source code(zip)
  • v0.12.1(Oct 29, 2020)

  • v0.12.0(Aug 27, 2020)

    The statsmodels developers are happy to announce release 0.12.0. 239 issues were closed in this release and 221 pull requests were merged.

    Major new features include:

    • New exponential smoothing model: ETS (Error, Trend, Seasonal)
    • New dynamic factor model for large datasets and monthly/quarterly mixed frequency models
    • Decomposition of forecast updates based on the "news"
    • Sparse Cholesky Simulation Smoother
    • Option to use Chandrasekhar recursions
    • Two popular methods for forecasting time series, forecasting after STL decomposition and the Theta model
    • Functions for constructing complex Deterministic Terms in time series models
    • New statistics function: one-way ANOVA-type tests, hypothesis tests for 2-samples and meta-analysis.
    Source code(tar.gz)
    Source code(zip)
  • v0.12.0rc0(Aug 11, 2020)

    The statsmodels developers are happy to announce the first release candidate for 0.12.0. 223 issues were closed in this release and 208 pull requests were merged. Major new features include:

    • New exponential smoothing model: ETS (Error, Trend, Seasonal)
    • New dynamic factor model for large datasets and monthly/quarterly mixed frequency models
    • Decomposition of forecast updates based on the "news"
    • Sparse Cholesky Simulation Smoother
    • Option to use Chandrasekhar recursions
    • Two popular methods for forecasting time series, forecasting after STL decomposition and the Theta model
    • Functions for constructing complex Deterministic Terms in time series models
    Source code(tar.gz)
    Source code(zip)
  • v0.11.1(Feb 21, 2020)

  • v0.11.0(Jan 22, 2020)

    statsmodels developers are happy to announce a new release.

    Major new features include:

    • Regression
      • Rolling OLS and WLS
    • Statistics
      • Oaxaca-Blinder decomposition
      • Distance covariance measures (new in RC2)
      • New regression diagnostic tools (new in RC2)
    • Statespace Models
      • Statespace-based Linear exponential smoothing models
      • Methods to apply parameters fitted on one dataset to another dataset
      • Method to hold some parameters fixed at known values
      • Option for low memory operations
      • Improved access to state estimates
      • Improved simulation and impulse responses for time-varying models
    • Time-Series Analysis
      • STL Decomposition
      • New AR model
      • New ARIMA model
      • Zivot-Andrews Test
      • More robust regime-switching models

    See release notes for full details.

    Source code(tar.gz)
    Source code(zip)
  • v0.11.0rc2(Jan 15, 2020)

    The second and final release candidate for statsmodels 0.11.

    Major new features include:

    • Regression
      • Rolling OLS and WLS
    • Statistics
      • Oaxaca-Blinder decomposition
      • Distance covariance measures (new in RC2)
      • New regression diagnostic tools (new in RC2)
    • Statespace Models
      • Statespace-based Linear exponential smoothing models
      • Methods to apply parameters fitted on one dataset to another dataset
      • Method to hold some parameters fixed at known values
      • Option for low memory operations
      • Improved access to state estimates
      • Improved simulation and impulse responses for time-varying models
    • Time-Series Analysis
      • STL Decomposition
      • New AR model
      • New ARIMA model
      • Zivot-Andrews Test
      • More robust regime-switching models

    See release notes for full details.

    Source code(tar.gz)
    Source code(zip)
  • v0.11.0rc1(Dec 18, 2019)

    Release candidate for statsmodels 0.11.

    Major new features include:

    • Regression
      • Rolling OLS and WLS
    • Statistics
      • Oaxaca-Blinder decomposition
    • Statespace Models
      • Statespace-based Linear exponential smoothing models
      • Methods to apply parameters fitted on one dataset to another dataset
      • Method to hold some parameters fixed at known values
      • Option for low memory operations
      • Improved access to state estimates
      • Improved simulation and impulse responses for time-varying models
    • Time-Series Analysis
      • STL Decomposition
      • New AR model
      • New ARIMA model
      • Zivot-Andrews Test
      • More robust regime switching models

    See release notes for full details.

    Source code(tar.gz)
    Source code(zip)
  • v0.10.2(Nov 23, 2019)

    This is a minor release from the 0.10.x branch with bug fixes and essential maintenance only. The key new feature is:

    • Compatibility with Python 3.8
    Source code(tar.gz)
    Source code(zip)
  • v0.10.1(Jul 19, 2019)

    This is a minor release from the 0.10.x branch with bug fixes and essential maintenance only. The key features are:

    • Compatibility with pandas 0.25
    • Compatibility with Numpy 1.17
    Source code(tar.gz)
    Source code(zip)
  • v0.10.0(Jun 24, 2019)

    This is a major release from 0.9.0 and includes a number of new statistical models and many bug fixes.

    Highlights include:

    • Generalized Additive Models. This major feature is experimental and may change.
    • Conditional Models such as ConditionalLogit, which are known as fixed effect models in Econometrics.
    • Dimension Reduction Methods include Sliced Inverse Regression, Principal Hessian Directions and Sliced Avg. Variance Estimation
    • Regression using Quadratic Inference Functions (QIF)
    • Gaussian Process Regression

    See the release notes for a full list of all the changes from 0.9.0.

    python -m pip install --upgrade statsmodels

    Note that 0.10.x will likely be the last series of releases to support Python 2, so please consider upgrading to Python 3 if feasible.

    Please report any issues with the release candidate on the statsmodels issue tracker.

    Source code(tar.gz)
    Source code(zip)
  • v0.10.0rc2(Jun 7, 2019)

  • 0.9.0rc1(Apr 30, 2018)

  • v0.8.0rc1(Jun 21, 2016)
