High performance Python GLMs with all the features!

Overview

Generalized linear models (GLMs) are a core statistical tool that include many common methods like least-squares regression, Poisson regression and logistic regression as special cases. At QuantCo, we have used GLMs in e-commerce pricing, insurance claims prediction and more. We have developed glum, a fast Python-first GLM library. The development was based on a fork of scikit-learn, so it has a scikit-learn-like API. We are thankful for the starting point provided by Christian Lorentzen in that PR!

glum is at least as feature-complete as existing GLM libraries like glmnet or h2o. It supports:

  • Built-in cross validation for optimal regularization, efficiently exploiting a “regularization path” (a brief usage sketch follows this list)
  • L1 regularization, which produces sparse and easily interpretable solutions
  • L2 regularization, including variable matrix-valued (Tikhonov) penalties, which are useful in modeling correlated effects
  • Elastic net regularization
  • Normal, Poisson, logistic, gamma, and Tweedie distributions, plus varied and customizable link functions
  • Box constraints, linear inequality constraints, sample weights, offsets
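
As a quick illustration of the cross-validation support mentioned above, here is a minimal, hedged sketch using GeneralizedLinearRegressorCV on synthetic data. The class name matches the API referenced throughout this document, but the parameter values and data are purely illustrative, and the scikit-learn-style arguments (l1_ratio, cv) may need adjusting against the API reference.

import numpy as np
from glum import GeneralizedLinearRegressorCV

# Synthetic Poisson-distributed data (illustrative only)
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = rng.poisson(np.exp(X @ np.array([0.1, -0.2, 0.3, 0.0, 0.05])))

# Search a regularization path with cross validation
model = GeneralizedLinearRegressorCV(family="poisson", l1_ratio=0.5, cv=3)
model.fit(X, y)
print(model.alpha_, model.coef_)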

This repo also includes tools for benchmarking GLM implementations in the glum_benchmarks module. For details on the benchmarking, see here. Although the performance of glum relative to glmnet and h2o depends on the specific problem, we find that it is consistently much faster for a wide range of problems.

For more information on glum, including tutorials and API reference, please see the documentation.

Why did we choose the name glum? We wanted a name that had the letters GLM and wasn't easily confused with any existing implementation. And we thought glum sounded like a funny name (and not glum at all!). If you need a more professional sounding name, feel free to pronounce it as G-L-um. Or maybe it stands for "Generalized linear... ummm... modeling?"

A classic example predicting housing prices

>>> from sklearn.datasets import fetch_openml
>>> from glum import GeneralizedLinearRegressor
>>>
>>> # This dataset contains house sale prices for King County, which includes
>>> # Seattle. It includes homes sold between May 2014 and May 2015.
>>> house_data = fetch_openml(name="house_sales", version=3, as_frame=True)
>>>
>>> # Use only select features
>>> X = house_data.data[
...     [
...         "bedrooms",
...         "bathrooms",
...         "sqft_living",
...         "floors",
...         "waterfront",
...         "view",
...         "condition",
...         "grade",
...         "yr_built",
...         "yr_renovated",
...     ]
... ].copy()
>>>
>>>
>>> # Model whether a house had an above or below median price via a Binomial
>>> # distribution. We'll be doing L1-regularized logistic regression.
>>> price = house_data.target
>>> y = (price < price.median()).values.astype(int)
>>> model = GeneralizedLinearRegressor(
...     family='binomial',
...     l1_ratio=1.0,
...     alpha=0.001
... )
>>>
>>> _ = model.fit(X=X, y=y)
>>>
>>> # get_formatted_diagnostics (and report_diagnostics) show details about the steps taken by the iterative solver
>>> diags = model.get_formatted_diagnostics(full_report=True)
>>> diags[['objective_fct']]
        objective_fct
n_iter               
0            0.693091
1            0.489500
2            0.449585
3            0.443681
4            0.443498
5            0.443497

Installation

Please install the package through conda-forge:

conda install glum -c conda-forge
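
Wheels for each release are also uploaded to PyPI (see the release notes below), so installation via pip should work as well:

pip install glum
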
Comments
  • [Critical] Benchmarks on various data sets

    Based on 091fe022af7f8bd2a05210de6cc42bc1030fbb93

    Machine: r5.4xlarge (16 vCPUs | 128 GB RAM)

    I ran the sklearn_fork using

    • all available data sets
    • for 1M observations and 18M observations
    • dense vs sparse
    • on different numbers of threads
    • lasso and elnet

    Results

                                                                        n_iter    runtime  intercept obj_val rel_obj_val
    problem                      num_rows storage threads library                                                       
    narrow_insurance_l2_poisson  1000000  dense   1       sklearn_fork       5     2.8577    -3.8547  0.3192           0
                                                  2       sklearn_fork       5     2.1641    -3.8547  0.3192           0
                                                  4       sklearn_fork       5     1.8084    -3.8547  0.3192           0
                                                  8       sklearn_fork       5     1.6458    -3.8547  0.3192           0
                                                  16      sklearn_fork       5     1.6295    -3.8547  0.3192           0
                                          sparse  1       sklearn_fork       5     2.4790    -3.8547  0.3192           0
                                                  2       sklearn_fork       5     1.9025    -3.8547  0.3192           0
                                                  4       sklearn_fork       5     1.5203    -3.8547  0.3192           0
                                                  8       sklearn_fork       5     2.4176    -3.8547  0.3192           0
                                                  16      sklearn_fork       5     2.4066    -3.8547  0.3192           0
                                 18000000 dense   1       sklearn_fork       5    52.9698    -3.8305  0.3189           0
                                                  2       sklearn_fork       5    40.7352    -3.8305  0.3189           0
                                                  4       sklearn_fork       5    33.9417    -3.8305  0.3189           0
                                                  8       sklearn_fork       5    30.6861    -3.8305  0.3189           0
                                                  16      sklearn_fork       5    30.5071    -3.8305  0.3189           0
                                          sparse  1       sklearn_fork       5    53.0467    -3.8305  0.3189           0
                                                  2       sklearn_fork       5    42.5917    -3.8305  0.3189           0
                                                  4       sklearn_fork       5    36.1416    -3.8305  0.3189           0
                                                  8       sklearn_fork       5    39.9059    -3.8305  0.3189           0
                                                  16      sklearn_fork       5    46.8101    -3.8305  0.3189           0
    narrow_insurance_net_poisson 1000000  dense   1       sklearn_fork       9     5.2903    -3.7820  0.3199           0
                                                  2       sklearn_fork       9     4.0050    -3.7820  0.3199           0
                                                  4       sklearn_fork       9     3.3740    -3.7820  0.3199           0
                                                  8       sklearn_fork       9     3.0434    -3.7820  0.3199           0
                                                  16      sklearn_fork       9     3.0169    -3.7820  0.3199           0
                                          sparse  1       sklearn_fork       9     4.4610    -3.7820  0.3199           0
                                                  2       sklearn_fork       9     3.4356    -3.7820  0.3199           0
                                                  4       sklearn_fork       9     2.6984    -3.7820  0.3199           0
                                                  8       sklearn_fork       9     3.6768    -3.7820  0.3199           0
                                                  16      sklearn_fork       9     4.3575    -3.7820  0.3199           0
                                 18000000 dense   1       sklearn_fork       9    98.1811    -3.7678  0.3196           0
                                                  2       sklearn_fork       9    74.8176    -3.7678  0.3196           0
                                                  4       sklearn_fork       9    63.2101    -3.7678  0.3196           0
                                                  8       sklearn_fork       9    57.3406    -3.7678  0.3196           0
                                                  16      sklearn_fork       9    56.7188    -3.7678  0.3196           0
                                          sparse  1       sklearn_fork       9    98.1400    -3.7678  0.3196           0
                                                  2       sklearn_fork       9    78.0241    -3.7678  0.3196           0
                                                  4       sklearn_fork       9    65.7398    -3.7678  0.3196           0
                                                  8       sklearn_fork       9    83.0182    -3.7678  0.3196           0
                                                  16      sklearn_fork       9    85.0394    -3.7678  0.3196           0
    real_insurance_l2_poisson    1000000  dense   1       sklearn_fork       5     5.1556    -3.3377  0.1601           0
                                                  2       sklearn_fork       5     3.7674    -3.3377  0.1616           0
                                                  4       sklearn_fork       5     3.0280    -3.3377  0.1612           0
                                                  8       sklearn_fork       5     2.6941    -3.3377  0.1623           0
                                                  16      sklearn_fork       5     2.8711    -3.3377  0.1612           0
                                          sparse  1       sklearn_fork       5    13.8171    -3.3377  0.1618           0
                                                  2       sklearn_fork       5     8.8360    -3.3377  0.1619           0
                                                  4       sklearn_fork       5     6.4723    -3.3377  0.1608           0
                                                  8       sklearn_fork       5     5.5610    -3.3377    0.16           0
                                                  16      sklearn_fork       5    10.6910    -3.3377  0.1617           0
                                 18000000 dense   1       sklearn_fork       5    88.1266    -3.3665  0.1609           0
                                                  2       sklearn_fork       5    61.3444    -3.3665  0.1608           0
                                                  4       sklearn_fork       5    48.0059    -3.3665  0.1608           0
                                                  8       sklearn_fork       5    41.3488    -3.3665  0.1608           0
                                                  16      sklearn_fork       5    41.2265    -3.3665  0.1611           0
                                          sparse  1       sklearn_fork       5   265.7090    -3.3665  0.1608           0
                                                  2       sklearn_fork       5   173.5621    -3.3665  0.1607           0
                                                  4       sklearn_fork       5   124.4030    -3.3665  0.1607           0
                                                  8       sklearn_fork       5   106.1407    -3.3665  0.1606           0
                                                  16      sklearn_fork       5   192.5957    -3.3665   0.161           0
    real_insurance_net_poisson   1000000  dense   1       sklearn_fork      10     9.7671    -3.3532  0.1612           0
                                                  2       sklearn_fork      10     6.9492    -3.3532  0.1609           0
                                                  4       sklearn_fork      10     5.5663    -3.3532  0.1614           0
                                                  8       sklearn_fork      10     4.7798    -3.3532  0.1606           0
                                                  16      sklearn_fork      10     4.8909    -3.3532  0.1603           0
                                          sparse  1       sklearn_fork      10    26.5831    -3.3532  0.1635           0
                                                  2       sklearn_fork      10    16.5875    -3.3532  0.1629           0
                                                  4       sklearn_fork      10    11.3910    -3.3532   0.162           0
                                                  8       sklearn_fork      10     9.4460    -3.3532  0.1612           0
                                                  16      sklearn_fork      10    19.7150    -3.3532  0.1608           0
                                 18000000 dense   1       sklearn_fork      10   175.4268    -3.3576  0.1618           0
                                                  2       sklearn_fork      10   121.9683    -3.3576  0.1618           0
                                                  4       sklearn_fork      10    95.6292    -3.3576  0.1617           0
                                                  8       sklearn_fork      10    82.3669    -3.3576  0.1617           0
                                                  16      sklearn_fork      10    82.6210    -3.3576  0.1618           0
                                          sparse  1       sklearn_fork      10   511.6008    -3.3576  0.1618           0
                                                  2       sklearn_fork      10   324.6799    -3.3576  0.1616           0
                                                  4       sklearn_fork      10   227.5025    -3.3576  0.1615           0
                                                  8       sklearn_fork      10   190.4679    -3.3576  0.1616           0
                                                  16      sklearn_fork      10   359.1789    -3.3576  0.1617           0
    wide_insurance_l2_poisson    1000000  dense   1       sklearn_fork      10    49.0238    -2.0280  0.1422           0
                                                  2       sklearn_fork      10    30.6427    -2.0280  0.1422           0
                                                  4       sklearn_fork      10    21.8896    -2.0280  0.1422           0
                                                  8       sklearn_fork      10    17.2667    -2.0280  0.1422           0
                                                  16      sklearn_fork      10    17.4586    -2.0280  0.1422           0
                                          sparse  1       sklearn_fork      10    11.1084    -2.0280  0.1422           0
                                                  2       sklearn_fork      10     8.6160    -2.0280  0.1422           0
                                                  4       sklearn_fork      10     7.4407    -2.0280  0.1422           0
                                                  8       sklearn_fork      10     7.2905    -2.0280  0.1422           0
                                                  16      sklearn_fork      10     7.5111    -2.0280  0.1422           0
                                 18000000 dense   1       sklearn_fork      13  1171.3334    -2.1096  0.1403           0
                                                  2       sklearn_fork      29  1546.3203    -2.1096  0.1403           0
                                                  4       sklearn_fork      10   405.3979    -2.1096  0.1403           0
                                                  8       sklearn_fork      10   322.3633    -2.1096  0.1403           0
                                                  16      sklearn_fork      10   320.1045    -2.1096  0.1403           0
                                          sparse  1       sklearn_fork      10   241.5324    -2.1096  0.1403           0
                                                  2       sklearn_fork      20   352.8254    -2.1096  0.1403           0
                                                  4       sklearn_fork      20   307.4050    -2.1096  0.1403           0
                                                  8       sklearn_fork      16   248.4476    -2.1096  0.1403           0
                                                  16      sklearn_fork      16   218.3485    -2.1096  0.1403           0
    wide_insurance_net_poisson   1000000  dense   1       sklearn_fork      13    66.1757    -2.2057  0.1426           0
                                                  2       sklearn_fork      13    42.3533    -2.2057  0.1426           0
                                                  4       sklearn_fork      13    30.3914    -2.2057  0.1426           0
                                                  8       sklearn_fork      13    24.3199    -2.2057  0.1426           0
                                                  16      sklearn_fork      13    24.6798    -2.2057  0.1426           0
                                          sparse  1       sklearn_fork      13    16.0367    -2.2057  0.1426           0
                                                  2       sklearn_fork      13    12.3721    -2.2057  0.1426           0
                                                  4       sklearn_fork      13    10.9711    -2.2057  0.1426           0
                                                  8       sklearn_fork      13    10.9589    -2.2057  0.1426           0
                                                  16      sklearn_fork      13    11.0781    -2.2057  0.1426           0
                                 18000000 dense   1       sklearn_fork      15  1416.9511    -2.3355  0.1402           0
                                                  2       sklearn_fork      15   895.1590    -2.3355  0.1402           0
                                                  4       sklearn_fork      15   622.3026    -2.3355  0.1402           0
                                                  8       sklearn_fork      15   501.4581    -2.3355  0.1402           0
                                                  16      sklearn_fork      15   491.9906    -2.3355  0.1402           0
                                          sparse  1       sklearn_fork      15   378.4437    -2.3355  0.1402           0
                                                  2       sklearn_fork      15   297.5474    -2.3355  0.1402           0
                                                  4       sklearn_fork      15   261.0684    -2.3355  0.1402           0
                                                  8       sklearn_fork      15   256.5559    -2.3355  0.1402           0
                                                  16      sklearn_fork      15   225.0999    -2.3355  0.1402           0
    
    

    Results as CSV (zipped)

    this week's work 
    opened by jtilly 20
  • [Critical] Improve performance for sparse matrices

    Reported by @jtilly: "I'm having some difficulties getting good results using real data (on 1 million rows for now). The script and corresponding log file that I'm using in our infrastructure are here: https://gist.github.com/jtilly/d2ff9b7bd6c690a35db052d1730e0a06 I'm comparing the sklearn_fork vs glmnet_python. Performance doesn't look that great for the sklearn_fork implementations once we add l1 penalties or make things sparse. I'm also having difficulties aligning coefficients (and predictions) between the sklearn fork and glmnet. I'm not sure what the best way to debug the problem is. I also integrated the real data set into the glm_benchmarks package (see https://github.com/Quantco/glm_benchmarks/pull/47). If you get the chance, could you take a quick look at what I implemented to make sure I didn't screw things up anywhere? Also, any insights into why glmnet and sklearn results don't align? I'm also not an expert user of the glm_benchmarks package (yet), so if you have any ideas how to pull debug information out of it, please let me know."

    this week's work 
    opened by ElizabethSantorellaQC 15
  • Fuse alpha and alphas parameters

    I was wondering if it'd make sense to skip this check if alpha_search is active, since we don't use alpha in that case (ref).

    Also, we could simplify isinstance(self.alpha, float) or isinstance(self.alpha, int) to isinstance(self.alpha, (float, int)). :)

    code quality 
    opened by lbittarello 12
  • Fuse alpha and alphas parameters

    Checklist

    • [x] Added a CHANGELOG.rst entry
    • alphas is now deprecated in GeneralizedLinearRegressor (not in the CV version).
    • Instead, alpha and alpha_search are used to automatically detect the intent.

    This fixes issue #335.

    opened by MarcAntoineSchmidtQC 11
  • Adding a script to produce a nice benchmark figure against h2o/glmnet

    This PR adds a tool that produces benchmark figures for all four datasets, both regularization types and five distributions.

    For example, one of the figures shows performance for a lasso penalty on the intermediate-insurance dataset (image omitted).

    Checklist

    • [x] I don't think a changelog entry is needed because this is not user visible.
    • [x] Fixed up the r-glmnet benchmark: enabled the binomial distribution (I think this was failing on a previous version of glmnet, but it works now!) and enabled sparse matrices, which improves performance for most problems.
    • [x] wrote docs/benchmarks/benchmark_figure.py which produces the figures and also stores the docs/benchmarks/benchmark_data.csv file with benchmark results.
    • [x] output the figures to docs/_static which is included in the repo. It might seem bad to include pdfs/pngs in the repo, but these are hard to regenerate and are going to be included in the documentation pages.
    opened by tbenthompson 10
  • [Major] Add a class for efficient operations on categorical features (sparse/dense split done)

    One-hot encoding categorical variables generates matrices where all nonzero elements are 1, and there is only one nonzero element per row. It is possible to store these matrices with much less memory than a general sparse matrix and to operate on them more efficiently. We could improve performance a lot by adding a class that represents our data as a partitioned matrix composed of several one-hot encoded matrices (and perhaps also a dense block).
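
    As a rough illustration of the idea (a hedged sketch, not the eventual tabmat implementation), a one-hot-encoded column can be stored as a single integer code per row, which turns matrix-vector products into indexing and grouped sums:

    import numpy as np

    # One-hot matrix of shape (5, 3), stored implicitly as one category code per row
    codes = np.array([0, 2, 1, 2, 0])
    n_categories = 3

    v = np.array([10.0, 20.0, 30.0])          # right operand for (one-hot) @ v
    w = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # left operand for (one-hot).T @ w

    matvec = v[codes]                                                 # (one-hot) @ v
    rmatvec = np.bincount(codes, weights=w, minlength=n_categories)   # (one-hot).T @ w
    print(matvec, rmatvec)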

    this week's work performance 
    opened by ElizabethSantorellaQC 10
  • Add support for linear inequality constraints

    Closes #342 (or maybe not, we will see). Closes #344 (temporary adjustments to benchmark code).

    Tasks for an MVP

    • [x] Add a new solver based on scipy.optimize.minimize(method='trust-constr') (see here under "Notes")
    • [x] Make this solver available in the GLM and ensure it produces correct results when optimising without constraints
      • [x] Via pytest cases (I test it parallel to lbfgs)
      • [x] Via benchmark problems (manually for some selected cases at this point)
    • [x] Add parameters to pass linear inequality constraints to the GLM API (A_ineq θ <= b_ineq)
    • [x] Support fitting with an intercept (i.e., extend A_ineq and b_ineq)
    • [x] Formulate dedicated test cases for the new solver & new type of constraints
      • [x] Test equality of bounds and analogous inequality constraints
      • [x] ~General tests for inequality constraints (due to a lack of benchmark software, construct simple test case with a clear optimal solution)~ I suggest skipping this part, as the underlying algorithm does not differentiate between "real" inequality constraints and "quasi" bound constraints, as long as we pass them in the form A theta <= b. We already test the latter case, and do so against a trustworthy benchmark.
    • [ ] Analyze convergence behavior
      • [x] Poisson family, narrow_insurance_dataset
      • [ ] (possibly other combinations)

    Tasks for a productive version

    • [x] Support inequality constraints in the CV GLM
    • [x] Extend Docstrings for new functionality, incl. warning about runtime
    • [x] Various safety checks when inequality constraints are present
      • [x] Only allow either bounds or inequality constraints
      • [x] ~Handle case of initially infeasible starting point under inequality constraints~ (I verified this is not an issue for trust-constr)
      • [x] Analogy to check_bounds
    • [x] refactor and share _get_obj_and_derivative between the gradient descent solvers
    • [x] Add a CHANGELOG.rst entry

    Example

    import numpy as np
    import pandas as pd
    import plotnine as pn
    
    from quantcore.glm_benchmarks.problems import (
        load_data,
        generate_narrow_insurance_dataset,
    )
    from quantcore.glm import GeneralizedLinearRegressor
    
    # Load parts of the French Motor Insurance dataset
    dat = load_data(generate_narrow_insurance_dataset)
    dat["X"] = dat["X"].loc[:, lambda x: x.columns.str.startswith("DrivAge")]
    X, y = dat["X"], dat["y"]
    
    kwargs_shared = {
        "family": "poisson",
        "l1_ratio": 0,
        "alpha": 0,
        "fit_intercept": False,
    }
    
    # Define constraints (manual for now, convenience function tbd)
    A_ineq = np.zeros(shape=(2, X.shape[1]))
    b_ineq = np.zeros(shape=(2))
    
    # Bound constraint on DrivAge_0 <= -0.80
    A_ineq[0, X.columns == "DrivAge_0"] = 1
    b_ineq[0] = -0.80
    
    # Inequality constraint to ensure DrivAge_5 <= DrivAge_6
    A_ineq[1, X.columns == "DrivAge_5"] = 1
    A_ineq[1, X.columns == "DrivAge_6"] = -1
    
    # Fit models and plot coefficients
    
    mdls = {
        "auto": GeneralizedLinearRegressor(**kwargs_shared),
        "lbfgs": GeneralizedLinearRegressor(solver="lbfgs", **kwargs_shared),
        "trust-(un)constr": GeneralizedLinearRegressor(
            solver="trust-constr",
            **kwargs_shared,
        ),
        "trust-constr": GeneralizedLinearRegressor(
            solver="trust-constr",
            A_ineq=A_ineq,
            b_ineq=b_ineq,
            **kwargs_shared,
        ),
    }
    
    coefs = []
    for name, mdl in mdls.items():
        mdl.fit(X=X, y=y)
        coefs.append(pd.DataFrame(dict(name=X.columns, coef=mdl.coef_)).assign(model=name))
        print(
            f"model {name}: mean(y): {np.mean(y):.6f}, "
            f"mean(pred): {np.mean(mdl.predict(X=X)):.6f}"
        )
    
    df_coefs = pd.concat(coefs)
    
    (
        pn.ggplot(df_coefs, pn.aes(x="name", y="coef", color="model"))
        + pn.geom_point(position=pn.positions.position_jitter(width=0.15))
        + pn.geom_line(pn.aes(group="model"))
        + pn.theme_minimal()
        + pn.labs(
            x="factor",
            y="estimate",
            color="solver",
            title="Note: jitter only added horizontally",
        )
    )
    
    (Screenshot of the resulting coefficient plot omitted.)
    opened by PhilippRuchser 9
  • [Minor] In the objective function, drop terms that are not dependent on the parameters.

    [EDIT] See the conversation below.

    Currently, in _eta_mu_deviance, we compute the deviance and then later multiply by 0.5 and add L1 and L2 penalty terms to compute an objective function value. This isn't actually strictly speaking the objective function value, but it should differ only by a constant dependent on y. Computing the deviance is more complicated for most distribution/link function pairs than computing the log-likelihood. For example, for Poisson, the LL is:

    y[i] * eta[i] - mu[i]
    

    whereas the deviance as currently implemented is:

            if y[i] == 0:
                unit_deviance = 2 * (-y[i] + mu_out[i])
            else:
                unit_deviance = 2 * ((y[i] * (log(y[i]) - eta_out[i] - 1)) + mu_out[i])
    

    Since we don't actually need a deviance, we should compute the log-likelihood.
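
    As a small numeric check of the "differ only by a constant" claim (a hedged sketch, not glum code), the Poisson half-deviance objective and the negative log-likelihood with the log(y!) term dropped differ by a quantity that depends only on y:

    import numpy as np

    rng = np.random.default_rng(0)
    y = rng.poisson(2.0, size=8).astype(float)
    eta = rng.normal(size=8)   # linear predictor under the log link
    mu = np.exp(eta)

    # Negative log-likelihood, dropping the log(y!) constant
    neg_ll = np.sum(mu - y * eta)

    # Half of the unit deviances as implemented above (guarding log(0))
    half_dev = np.where(y == 0, mu, y * np.log(np.where(y == 0, 1.0, y)) - y * eta - y + mu)

    # Equals sum(y * log(y) - y): independent of eta and mu
    print(np.sum(half_dev) - neg_ll)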

    opened by tbenthompson 9
  • Different libraries may deal with constants in the log-likelihood differently

    Log-likelihoods sometimes have some ugly constants, like pi for the normal distribution or a factorial for Poisson. Libraries may reasonably choose to omit these constants. This presents a problem if they make different decisions about how to treat such constants, since omitting a constant will change the strength of the regularization, leading to different optimal solutions.

    Two possible approaches:

    • Suggested by @tbenthompson: Fit cross-validated models along a regularization path, and report the one with the lowest cross-validated error. This will be slow, but sidesteps the issue.
    • Figure out how the libraries are dealing with constants and correct for it.
    question 
    opened by ElizabethSantorellaQC 9
  • New feature: add information criteria for model diagnostics

    Checklist:

    • [x] Added a CHANGELOG.rst entry
    • [x] Decision on what to do with ridge/elastic net regularisation. See here for more details.
    • [x] Decision on what to do with CV implementation. Decision: information criteria are primarily a metric for estimating out of sample generalisation from in sample measures. If cross validation is applicable to the setting (i.e., there is enough data) it does not seem to make sense to use these metrics so we have not supplied them on the GeneralizedLinearRegressorCV class.

    Closes #516.

    Summary: This change implements the calculations of the aic, aicc and bic information criteria for the trained model. I placed these criteria as properties/attributes on the glum.GeneralizedLinearRegressor class.
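
    For reference, the standard formulas behind these criteria, given a log-likelihood ll, an effective number of parameters k, and a sample size n, are as follows (a generic sketch, not the code added in this PR):

    import numpy as np

    def aic(ll: float, k: int) -> float:
        return 2 * k - 2 * ll

    def aicc(ll: float, k: int, n: int) -> float:
        # small-sample correction; requires n > k + 1
        return aic(ll, k) + 2 * k * (k + 1) / (n - k - 1)

    def bic(ll: float, k: int, n: int) -> float:
        return np.log(n) * k - 2 * ll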

    Notes:

    • These information criteria require an "effective number of parameters" of the model. In the case of an unregularized model, this is simply the count of features/parameters. In the case of a lasso model, this is the count of the non-zero parameters on the ML fit. In the case of a ridge/elastic net model, I could not find a consensus on how this should be implemented so I opted to add a warning here. Does anyone have a better suggestion or is there something that I have missed? Decision: I have answered this in more detail here but I will argue that it does not make sense to compute these values for models that use L2 regularisation. I think keeping the warning there when L2 regularisation is used is sufficient to highlight this issue.
    • The method to compute these scores is always called from the fit method of glum.GeneralizedLinearRegressor. This is because we require the X and y data sources to compute these values. This allows for a simple interface where the score is defined as an attribute of the trained regressor, e.g: regressor.aic, regressor.aicc or regressor.bic. This could rather be done as a separate method call that accepts X and y as arguments: regressor.aic(X, y), regressor.aicc(X, y) or regressor.bic(X, y). I opted for the first choice as (1) I think this is neater and (2) it makes sense to compute these values at train time; but I am happy to change to the alternative. Decision: We have decided not to compute these statistics at train time due to the unnecessary computational overhead on the fit method. Rather, passing the dataset at call-time seems preferable.
    • These information criteria are only available when the noise model is one of BinomialDistribution, GammaDistribution, NormalDistribution, PoissonDistribution. See line 1640. This is because we require the definition of the likelihood function. Maybe I missed them but are there other families that we should include here? Decision: include the TweedieDistribution that generalises the above.
    • Calling aic, aicc or bic on the model before it has been trained returns None. We could rather log a warning or throw an exception. I am not sure if there is a standard/preferred approach here?
      Decision: raise error.
    opened by NicholasHoernleQC 8
  • Illegal instruction / segfault in `glm.fit()` from the Getting Started example

    Hi!

    The example from the Getting Started page generates an "Illegal instruction" segfault with Python 3.9 on Linux:

    $ python3.9
    Python 3.9.0 (default, Oct  6 2020, 11:01:41)
    [GCC 4.8.5 20150623 (Red Hat 4.8.5-36)] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import pandas as pd
    >>> import sklearn
    >>> from sklearn.datasets import fetch_openml
    >>> from glum import GeneralizedLinearRegressor, GeneralizedLinearRegressorCV
    >>> house_data = fetch_openml(name="house_sales", version=3, as_frame=True)
    >>> X = house_data.data[
    ...     [
    ...         "bedrooms",
    ...         "bathrooms",
    ...         "sqft_living",
    ...         "floors",
    ...         "waterfront",
    ...         "view",
    ...         "condition",
    ...         "grade",
    ...         "yr_built",
    ...     ]
    ... ].copy()
    >>> y = house_data.target
    >>> X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(
    ...     X, y, test_size = 0.3, random_state=5
    ... )
    >>> glm = GeneralizedLinearRegressor(family="normal", alpha=0.1, l1_ratio=1)
    >>> glm.fit(X_train, y_train)
    Illegal instruction (core dumped)
    

    The error is here:

    Program terminated with signal 4, Illegal instruction.
    #0  0x00007f8d4dd2e973 in dense_baseTrue<double> ([email protected]=0x7f8d4ce00ac0, L=0x7f8d4d809340, [email protected]=0x3f031d0, [email protected]=9, imin2=0, imax2=9, jmin2=0,
        jmax2=9, kmin=0, kmax=512, innerblock=128, kstep=512, d=<optimized out>) at src/tabmat/ext/dense_helpers.cpp:73
    

    This is from version 2.1.0 installed with pip, using the PyPI wheels:

    Collecting glum
      Downloading glum-2.1.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.0 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.0/2.0 MB 44.8 MB/s eta 0:00:00
    Collecting pandas
      Downloading pandas-1.4.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.7 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.7/11.7 MB 92.4 MB/s eta 0:00:00
    Collecting joblib
      Downloading joblib-1.1.0-py2.py3-none-any.whl (306 kB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 307.0/307.0 kB 106.2 MB/s eta 0:00:00
    Collecting scipy
      Downloading scipy-1.8.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (42.2 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 42.2/42.2 MB 141.2 MB/s eta 0:00:00
    Collecting tabmat>=3.0.1
      Downloading tabmat-3.1.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.8 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.8/5.8 MB 75.3 MB/s eta 0:00:00
    Collecting numpy
      Downloading numpy-1.23.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.1 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 17.1/17.1 MB 94.6 MB/s eta 0:00:00
    Collecting numexpr
      Downloading numexpr-2.8.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (380 kB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 380.5/380.5 kB 86.3 MB/s eta 0:00:00
    Collecting scikit-learn>=0.23
      Downloading scikit_learn-1.1.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (30.8 MB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 30.8/30.8 MB 145.3 MB/s eta 0:00:00
    Collecting threadpoolctl>=2.0.0
      Downloading threadpoolctl-3.1.0-py3-none-any.whl (14 kB)
    Collecting packaging
      Downloading packaging-21.3-py3-none-any.whl (40 kB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 40.8/40.8 kB 86.0 MB/s eta 0:00:00
    Collecting pytz>=2020.1
      Downloading pytz-2022.1-py2.py3-none-any.whl (503 kB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 503.5/503.5 kB 87.0 MB/s eta 0:00:00
    Collecting python-dateutil>=2.8.1
      Downloading python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 247.7/247.7 kB 77.3 MB/s eta 0:00:00
    Requirement already satisfied: six>=1.5 in /share/software/user/open/python/3.9.0/lib/python3.9/site-packages (from python-dateutil>=2.8.1->pandas->glum) (1.16.0)
    Collecting pyparsing!=3.0.5,>=2.0.2
      Downloading pyparsing-3.0.9-py3-none-any.whl (98 kB)
         ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 98.3/98.3 kB 72.3 MB/s eta 0:00:00
    Installing collected packages: pytz, threadpoolctl, python-dateutil, pyparsing, numpy, joblib, scipy, pandas, packaging, tabmat, scikit-learn, numexpr, glum
    Successfully installed glum-2.1.0 joblib-1.1.0 numexpr-2.8.3 numpy-1.23.0 packaging-21.3 pandas-1.4.3 pyparsing-3.0.9 python-dateutil-2.8.2 pytz-2022.1 scikit-learn-1.1.1 scipy-1.8.1 tabmat-3.1.0 threadpoolctl-3.1.0
    
    opened by kcgthb 7
  • Docs for `P1` are a bit unclear

    In the API reference for glum.GeneralizedLinearRegressor: https://glum.readthedocs.io/en/latest/glm.html#glum.GeneralizedLinearRegressor

    It says about P2:

    With this option, you can set the P2 matrix in the L2 penalty w * P2 * w. This gives a fine control over this penalty (Tikhonov regularization). A 2d array is directly used as the square matrix P2.

    But for P1:

    With this array, you can exclude coefficients from the L1 penalty. Set the corresponding value to 1 (include) or 0 (exclude).

    The latter one gives the impression that P1 is only for inclusion/exclusion instead of also being usable as a per-feature multiplier.
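
    To make the multiplier reading concrete, here is a hedged sketch (synthetic data) that treats P1 as per-feature L1 penalty weights, by analogy with the P2 description quoted above:

    import numpy as np
    from glum import GeneralizedLinearRegressor

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 3))
    y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=500)

    # First feature unpenalized, second with unit weight, third penalized twice as hard
    model = GeneralizedLinearRegressor(
        family="normal", alpha=0.1, l1_ratio=1.0, P1=[0.0, 1.0, 2.0]
    )
    model.fit(X, y)
    print(model.coef_)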

    opened by david-cortes 1
  • Interactions

    Fantastic project.

    I would love to see the possibility of adding interactions on the fly, just like in H2O. There, you can provide a list of interaction pairs or, alternatively, a list of columns with pairwise interactions.

    This would be especially useful because scikit-learn preprocessing does not allow creating dummy encodings for a categorical X and then calculating their product with another feature (at least not with neat code).
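
    Until such a feature exists, a hedged workaround sketch with pandas (building categorical x numeric interaction columns by hand before fitting):

    import pandas as pd

    df = pd.DataFrame({"region": ["a", "b", "a", "c"], "age": [20.0, 30.0, 40.0, 50.0]})

    dummies = pd.get_dummies(df["region"], prefix="region", dtype=float)
    interactions = dummies.mul(df["age"], axis=0).add_suffix(":age")

    X = pd.concat([df[["age"]], dummies, interactions], axis=1)
    print(X.columns.tolist())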

    opened by mayer79 0
  • Implement distributional anchor regression in glum

    I am interested in domain generalization (DG, also "external validity") of statistical / machine learning models. Anchor regression [1] is a recent idea interpolating between OLS and IV. [3] give ideas to generalize anchor regression to more general distributions (including classification). [2] is a "nice-to-read" summary, including ideas on how to extend to non-linear settings.

    To my knowledge, no efficient implementations for anchor regression or classification exist. I'd be interested to contribute this to my favorite GLM library but would need some guidance.

    What is Anchor Regression?

    Anchor regression improves the DG / external validity of OLS by adding a regularization term penalizing the correlation between a so-called anchor variable and the regression's residuals. The anchor variable is assumed to be exogenous to the system, i.e., not directly causally affected by covariates, the outcome, or relevant hidden variables. See the following causal graph:

    graph LR
    A --> U & X & Y
    U --> X & Y
    X --> Y
    

    What is an anchor?: Say we are interested to predict health outcomes in the ICU. Possibly valid anchor variables would be hospital id (one-hot encoded) or some transformation of time of year. The choice of anchor depends on the application. If we would like to predict out of time but on the same hospitals as seen in training, using time of year as anchor suffices. The hospital id should be included in the covariates (X). If we however would like to generalize across hospitals (i.e., predict on unseen hospitals), we need to include hospital id as an anchor (and exclude it from covariates). A similar example would be insurance with geographical location and time of year.

    Write $P_A$ for the $\ell_2$-projection onto the column-space of $A$ (i.e., $P_A(\cdot) = \mathbb{E}[\cdot \mid A]$) and let $\gamma>0$. In a regression setting, the anchor regression solution is given by:

    $$ b^\gamma = \underset{b}{\arg\min}\ \mathbb{E}_\textrm{train}[((\mathrm{Id} - P_A)(Y - X^T b))^2] + \gamma\, \mathbb{E}_\textrm{train}[(P_A(Y - X^T b))^2]. $$

    Given samples from $P_\mathrm{train}$ and writing $\Pi_A$ for the projection onto the column space of $A$, this can be estimated as

    $$ \hat b^\gamma = \underset{b}{\arg\min}\ \|(\mathrm{Id} - \Pi_A)(Y - X^T b)\|_2^2 + \gamma \|\Pi_A (Y - X^T b)\|_2^2. $$

    [1] show that the anchor regression solution protects against the worst-case risk with respect to distribution shifts induced through the anchor variable. Here $\gamma$ controls the size of the set of distributions the method protects against, which is generated by $\sqrt{\gamma}$-times the shifts as seen in the training data [1, Theorem 1].

    In an instrumental variable (IV) setting (no direct causal effect $A \to U$, $A \to Y$, "sufficient" effect $A \to X$), anchor regression interpolates between OLS and IV regression, with $\hat b^\gamma$ converging to the IV solution for $\gamma \to \infty$. This is because the IV solution can be written as

    $$ \hat b^\textrm{IV} = \underset{b \colon \mathrm{Cor}(A, X^T b - Y)=0}{\arg\min}\ \|Y - X^T b\|_2^2. $$

    In low-dimensional settings, (1) can be optimized using the transformation

    $$ \tilde X := (\mathrm{Id} - \Pi_A) X + \sqrt{\gamma} \Pi_A X \ \ \textrm{ and }\ \ \tilde Y := (\mathrm{Id} - \Pi_A)Y + \sqrt{\gamma} \Pi_A Y, $$

    where $\Pi_A = A (A^T A)^{-1} A^T$ (this need not be computed explicitly, though).
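
    A hedged numpy sketch of this transformation for the plain OLS case (synthetic data; the projection is computed via least squares instead of forming $\Pi_A$ explicitly):

    import numpy as np

    rng = np.random.default_rng(0)
    n, gamma = 200, 5.0
    A = rng.normal(size=(n, 2))                     # anchor variables
    X = rng.normal(size=(n, 3))
    y = X @ np.array([1.0, -1.0, 0.5]) + rng.normal(size=n)

    def proj(A, M):
        # Pi_A @ M without forming A (A^T A)^{-1} A^T
        return A @ np.linalg.lstsq(A, M, rcond=None)[0]

    X_tilde = X - proj(A, X) + np.sqrt(gamma) * proj(A, X)
    y_tilde = y - proj(A, y) + np.sqrt(gamma) * proj(A, y)

    b_gamma = np.linalg.lstsq(X_tilde, y_tilde, rcond=None)[0]  # anchor regression estimate
    print(b_gamma)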

    What is Distributional Anchor Regression?

    [2] present ideas on how to generalize anchor regression from OLS to GLMs. In particular, if $f$ are raw scores, they propose to use residuals

    $$ r = \frac{d}{d f} \ell(f, y). $$

    For $f = X^T \beta$ and $\ell(f, y) = \frac{1}{2}(y - f)^2$ this reduces to anchor regression. For logistic regression, with $Y \in \{-1, 1\}$ and

    $$ \ell(f, y) = - \sum_i \log(1 + \exp(-y_i f_i)), $$

    this yields residuals

    $$ r_i = \frac{d}{d f_i} \ell(f, y) = y_i (1 + \exp(y_i f_i))^{-1} = \tilde y_i - p_i, $$

    where $\tilde y_i = \frac{y_i}{2} + 0.5 \in \{0, 1\}$ and $p_i = (1 + \exp(-f_i))^{-1}$.

    Define $\ell^\gamma(f, y) := \ell(f, y) + (\gamma - 1) \| \Pi_A r \|_2^2$. The gradient of the anchor loss is given as

    $$ \frac{d}{d f_i} \ell^\gamma(f, y) = y_i (1 + \exp(y_i f_i))^{-1} - 2 (\gamma - 1) (\Pi_A r)_i p_i (1 - p_i). $$

    The Hessian is (not pretty)

    $$ \frac{d^2}{d f_i\, d f_j} \ell^\gamma(f, y) = -\mathbb{1}_{\{i = j\}}\, p_i (1 - p_i) \left(1 + 2(\gamma - 1) (1 - 2p_i) (\Pi_A r)_i \right) + 2 (\gamma - 1)\, p_i (1 - p_i)\, p_j (1 - p_j)\, (\Pi_A)_{i, j} $$

    If $f = X^T \beta$, then (here, $\cdot$ is matrix multiplication)

    $$ \frac{d}{d \beta} \ell^\gamma(X^T\beta, y) = \left( y (1 + \exp(y f))^{-1} - 2(\gamma - 1)\, p (1 - p)\, \Pi_A r \right) \cdot X $$

    and

    $$ \frac{d^2}{d \beta^2} \ell^\gamma(X^T\beta, y) = -X^T \cdot \mathrm{diag}\big(p (1 - p) (1 + 2(\gamma - 1)(1 - 2p)\, \Pi_A r)\big) \cdot X + 2(\gamma - 1)\, X^T \cdot \mathrm{diag}(p (1-p)) \cdot \Pi_A \cdot \mathrm{diag}(p (1-p)) \cdot X $$

    Computational considerations

    Here is some numpy code calculating and testing the above derivatives:

    import numpy as np
    import pytest
    from scipy.optimize import approx_fprime
    
    
    def predictions(f):
        return 1 / (1 + np.exp(-f))
    
    
    def proj(A, f):
        return np.dot(A, np.linalg.lstsq(A, f, rcond=None)[0])
    
    
    def proj_matrix(A):
        return np.dot(np.dot(A, np.linalg.inv(A.T @ A)), A.T)
    
    
    def loss(X, beta, y, A, gamma):
        f = X @ beta
        r = (y / 2 + 0.5) - predictions(f)
        return -np.sum(np.log1p(np.exp(-y * f))) + (gamma - 1) * np.sum(proj(A, r) ** 2)
    
    
    def grad(X, beta, y, A, gamma):
        f = X @ beta
        p = predictions(f)
        r = (y / 2 + 0.5) - p
    
        return (r - 2 * (gamma - 1) * proj(A, r) * p * (1 - p)) @ X
    
    
    def hess(X, beta, y, A, gamma):
        f = X @ beta
        p = predictions(f)
        r = (y / 2 + 0.5) - p
        diag = -np.diag(p * (1 - p) * (1 + 2 * (gamma - 1) * (1 - 2 * p) * proj(A, r)))
        dense = proj_matrix(A) * p * (1 - p)[np.newaxis, :] * (p * (1 - p))[:, np.newaxis]
    
        return X.T @ (diag + 2 * (gamma - 1) * dense) @ X
    
    
    @pytest.mark.parametrize("gamma", [0, 0.1, 0.8, 1, 5])
    def test_grad_hess(gamma):
        rng = np.random.default_rng(0)
        n = 100
        p = 10
        q = 3
    
        X = rng.normal(size=(n, p))
        beta = rng.normal(size=p)
    
        y = 2 * rng.binomial(1, 0.5, n) - 1
    
        A = rng.normal(size=(n, q))
    
        approx_grad = approx_fprime(beta, lambda b: loss(X, b, y, A, gamma))
        np.testing.assert_allclose(approx_grad, grad(X, beta, y, A, gamma), 1e-5)
    
        approx_hess = approx_fprime(beta, lambda b: grad(X, b, y, A, gamma), 1e-7)
        np.testing.assert_allclose(approx_hess, hess(X, beta, y, A, gamma), 1e-5)
    

    I understand that glum implements different solvers. As $\ell_1$-regularization is popular in the robustness community, the irls solver is most interesting.

    To my understanding, the computation of the full projection matrix above can be skipped using a QR decomposition of $A$. However, in your implementation, you never actually compute the Hessian, but rather an approximation. And your implementation appears to depend heavily on the Hessian being of the form $X^T D X$ for some diagonal $D$, which is no longer the case here.

    Summary

    Anchor regression interpolates between OLS and IV regression to improve the models' robustness to distribution shifts. Distributional anchor regression is a generalization to GLMs. To my knowledge, no efficient solver for distributional anchor regression exists.

    Is this something you would be interested to integrate into glum? How complex would this be? Are there any hurdles (e.g., dense Hessian) that prohibit the use of existing methods?

    References

    [1] Rothenhäusler, D., N. Meinshausen, P. Bühlmann, and J. Peters (2021). Anchor regression: Heterogeneous data meet causality. Journal of the Royal Statistical Society Series B (Statistical Methodology) 83(2), 215–246.

    [2] Bühlmann, P. (2020). Invariance, causality and robustness. Statistical Science 35(3), 404– 426.

    [3] Kook, L., B. Sick, and P. Bühlmann (2022). Distributional anchor regression. Statistics and Computing 32(3), 1–19.

    opened by mlondschien 0
  • Support for quasibinomial, quasipoisson, negative binomial, multinomial & Dirichlet multinomial families?

    Are there any plans to also support additional GLM families like quasibinomial, quasipoisson, negative binomial, multinomial, Dirichlet-multinomial (overdispersed multinomial) & ordinal GLMs, some of which are now supported by e.g. h2o? (Multinomial & Dirichlet multinomial can, I believe, be recast into a Poisson or quasipoisson GLM via the Poisson trick, but that's computationally not efficient.)

    opened by tomwenseleers 1
  • Request for a force_finite flag for score function

    The r2_score method in sklearn has a force_finite flag which defaults to True in order to avoid infinite and NaN values when the TSS happens to be 0. The analogous quantity when computing D^2 is the null deviance, which can also sometimes be 0. It would be great if, in glum, there was also a force_finite flag that can gracefully handle the case where the null deviance happens to be 0. Right now, I get a ZeroDivisionError in glum 2.1.2 running in Python 3.6.
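
    A hedged sketch of what such a guard could look like (generic formulas, not the glum score implementation; it mirrors the force_finite semantics of sklearn's r2_score, where a zero denominator yields 1.0 for a perfect fit and 0.0 otherwise):

    import math

    def d2_score(deviance: float, null_deviance: float, force_finite: bool = True) -> float:
        if null_deviance == 0.0:
            if force_finite:
                return 1.0 if deviance == 0.0 else 0.0
            return float("nan") if deviance == 0.0 else -math.inf
        return 1.0 - deviance / null_deviance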

    new feature 
    opened by thobanster 0
Releases (latest: 2.2.1)
  • 2.2.1(Nov 25, 2022)

  • 2.2.0(Nov 25, 2022)

    2.2.0 - 2022-11-25

    New features:

    • Add an argument to GeneralizedLinearRegressorBase to drop the first category in a Categorical column using implementation in tabmat
    • One may now request the Tweedie loss by setting the 'family' parameter of GeneralizedLinearRegressor and GeneralizedLinearRegressorCV to 'tweedie'.

    Bug fixes:

    • Setting bounds for constant columns was not working (bounds were internally modified to 0). A similar issue was preventing inequalities from working with constant columns. This is now fixed.

    Other changes:

    • No more builds for 32-bit systems with Python >= 3.8. This is due to scipy not supporting it anymore.
  • 2.1.2(Jul 1, 2022)

  • 2.1.1(Jul 1, 2022)

  • 2.1.0(Jun 27, 2022)

    2.1.0 - 2022-06-27

    New features:

    • Added aic, aicc and bic attributes to GeneralizedLinearRegressor. These attributes provide the information criteria based on the training data and the effective degrees of freedom of the maximum likelihood estimate for the model's parameters.
    • GeneralizedLinearRegressor.std_errors and GeneralizedLinearRegressor.covariance_matrix now accept data frames with categorical data.

    Bug fixes:

    • The score method of GeneralizedLinearRegressor and GeneralizedLinearRegressorCV now accepts offsets.
    • Fixed the calculation of the information matrix for the Binomial distribution with logit link, which affected non-robust standard errors.

    Other:

    • The CI now runs daily unit tests against the nightly builds of numpy, pandas and scikit-learn.
    • The minimally required version of tabmat is now 3.1.0.
  • 2.0.3(Nov 5, 2021)

    2.0.3 - 2021-11-05

    Other:

    • We are now specifying the run time dependencies in setup.py, so that missing dependencies are automatically installed from PyPI when installing glum via pip.
  • 2.0.2(Nov 3, 2021)

    Bug fix:

    • Fixed the sign of the log likelihood of the Gaussian distribution (not used for fitting coefficients).
    • Fixed the wide benchmarks which had duplicated columns (categorical and numerical).

    Other:

    • The CI now builds the wheels and uploads them to PyPI with every new release.
    • Renamed functions checking for qc.matrix compliance to refer to tabmat.
  • 2.0.1(Oct 11, 2021)

  • 2.0.0(Oct 8, 2021)

    Breaking changes:

    • Renamed the package to glum!!! Hurray! Celebration.
    • GeneralizedLinearRegressor and GeneralizedLinearRegressorCV lose the fit_dispersion parameter. Please use the dispersion method of the appropriate family instance instead.
    • All functions now use sample_weight as a keyword instead of weights, in line with scikit-learn.
    • All functions now use dispersion as a keyword instead of phi.
    • Several methods of GeneralizedLinearRegressor and GeneralizedLinearRegressorCV that should have been private now have an underscore prefixed to their names: _tear_down_from_fit, _set_up_for_fit, _set_up_and_check_fit_args, _get_start_coef, _solve and _solve_regularization_path.
    • glum.GeneralizedLinearRegressor.report_diagnostics and glum.GeneralizedLinearRegressor.get_formatted_diagnostics are now public.

    New features:

    • P1 and P2 now accept 1d arrays with the same number of elements as the unexpanded design matrix has columns. In this case, the penalty associated with a categorical feature will be expanded to as many elements as there are levels, all with the same value.
    • ExponentialDispersionModel gains a dispersion method.
    • BinomialDistribution and TweedieDistribution gain a log_likelihood method.
    • The fit method of GeneralizedLinearRegressor and GeneralizedLinearRegressorCV now saves the column types of pandas data frames.
    • GeneralizedLinearRegressor and GeneralizedLinearRegressorCV gain two properties: family_instance and link_instance.
    • GeneralizedLinearRegressor.std_errors and GeneralizedLinearRegressor.covariance_matrix have been added and support non-robust, robust (HC-1), and clustered covariance matrices.
    • GeneralizedLinearRegressor and GeneralizedLinearRegressorCV now accept family='gaussian' as an alternative to family='normal'.

    Bug fix:

    • The score method of GeneralizedLinearRegressor and GeneralizedLinearRegressorCV now accepts data frames.
    • Upgraded the code to use tabmat 3.0.0.

    Other:

    • A major overhaul of the documentation. Everything is better!
    • The methods of the link classes will now return scalars when given scalar inputs. Previously, under certain circumstances, they would return zero-dimensional arrays.
    • There is a new benchmark available glm_benchmarks_run based on the Boston housing dataset. See here.
    • glm_benchmarks_analyze now includes offset in the index. See here.
    • glmnet_python was removed from the benchmarks suite.
    • The innermost coordinate descent was optimized. This speeds up coordinate descent dominated problems like LASSO by about 1.5-2x. See here.
  • 1.5.1(Jul 22, 2021)

    1.5.1 - 2021-07-22

    Bug fix:

    • Have the linear_predictor and predict methods of GeneralizedLinearRegressor and GeneralizedLinearRegressorCV honor the offset when alpha is None.
  • 1.5.0(Jul 15, 2021)

    1.5.0 - 2021-07-15

    New features:

    • The linear_predictor and predict methods of quantcore.glm.GeneralizedLinearRegressor and quantcore.glm.GeneralizedLinearRegressorCV gain an alpha parameter (in complement to alpha_index). Moreover, they are now able to predict for multiple penalties.

    Other:

    • Methods of Link now consistently return NumPy arrays, whereas they used to preserve pandas series in special cases.
    • Don't list sparse_dot_mkl as a runtime requirement from the conda recipe.
    • The minimal NumPy pin should be dependent on the NumPy version in host and not fixed to 1.16.
  • 1.4.3(Jun 25, 2021)

    1.4.3 - 2021-06-25

    Bug fix:

    • copy_X = False will now raise a value error when X has dtype int32 or int64. Previously, it would only raise for dtype int64.
  • 1.4.2(Jun 15, 2021)

    1.4.2 - 2021-06-15

    Tutorials and documentation improvements:

    • Adding tutorials to the documentation
    • Additional documentation improvements

    Bug fix:

    • Verbose progress bar now working again.

    Other:

    • Small improvement in documentation for the alpha_index argument to :func:quantcore.glm.GeneralizedLinearRegressor.predict.
    • Pinned pre-commit hooks versions.
  • 1.4.1(May 1, 2021)

  • 1.4.0(Apr 13, 2021)

    1.4.0 - 2021-04-13

    Deprecations:

    • Fusing the alpha and alphas arguments for quantcore.glm.GeneralizedLinearRegressor. alpha now also accepts array-like inputs. alphas is now deprecated but can still be used for backward compatibility. The alphas argument will be removed with the next major version.

    Other:

    • We removed entry points to functions in quantcore.glm_benchmarks from the conda package.
  • 1.3.1(Apr 13, 2021)

    1.3.1 - 2021-04-12

    Bug fix:

    • quantcore.glm._distribution.unit_variance_derivative is evaluating a proper numexpr expression again (regression in 1.3.0).
  • 1.3.0(Apr 13, 2021)

    1.3.0 - 2021-04-12

    New features:

    • We added a new solver based on scipy.optimize.minimize(method='trust-constr').
    • We added support for linear inequality constraints of type A_ineq.dot(coef_) <= b_ineq.
  • 1.2.0(Feb 4, 2021)

  • 1.1.1(Jan 11, 2021)

  • 1.1.0(Nov 23, 2020)

    1.1.0 - 2020-11-23

    New features:

    Direct support for pandas categorical types in fit and predict. These will be converted into a CategoricalMatrix.

  • 1.0.1(Nov 12, 2020)

  • 1.0.0(Nov 11, 2020)

    Breaking change:

    • Renamed alpha_level attribute of quantcore.glm.GeneralizedLinearRegressor and quantcore.glm.GeneralizedLinearRegressorCV to alpha_index.

    Other:

    • Clarified behavior of scale_predictors.
  • 0.0.15(Nov 11, 2020)
