MBTR is a python package for multivariate boosted tree regressors trained in parameter space.

Last update: Dec 19, 2022

Overview

Multivariate Boosted TRee

What is MBTR

MBTR is a Python package for multivariate boosted tree regressors trained in parameter space. The package can handle arbitrary multivariate losses, as long as their gradient and Hessian are known. Gradient boosted trees are competition-winning, general-purpose, non-parametric regressors, which exploit sequential model fitting and gradient descent to minimize a specific loss function. The most popular implementations are tailored to univariate regression and classification tasks, which prevents them from capturing cross-correlations among multivariate targets and from applying conditional penalties to the predictions. This package allows the predictions to be regularized arbitrarily, so that properties like smoothness, consistency and functional relations can be enforced.
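To make the role of the gradient and Hessian concrete, here is a minimal sketch of a second-order (Newton) leaf update for a multivariate squared-error loss. It is not MBTR's internal code: the function names and the ridge parameter `lam` are assumptions introduced for illustration, and the Hessian is taken to be diagonal for simplicity.

```python
# Sketch only: not MBTR's internal code. All names here are hypothetical.

def mse_gradient(y_true, y_pred):
    """Summed gradient of 0.5 * ||y_pred - y_true||^2 over the samples in a leaf."""
    n_targets = len(y_true[0])
    return [sum(p[j] - t[j] for t, p in zip(y_true, y_pred)) for j in range(n_targets)]


def mse_hessian_diag(y_true):
    """Diagonal of the summed Hessian: identity per sample, so n per target."""
    n_targets = len(y_true[0])
    return [float(len(y_true))] * n_targets


def newton_leaf_response(y_true, y_pred, lam=0.0):
    """Newton step -g / (h + lam) per target, with an assumed ridge penalty lam."""
    g = mse_gradient(y_true, y_pred)
    h = mse_hessian_diag(y_true)
    return [-gj / (hj + lam) for gj, hj in zip(g, h)]


# Two samples, two targets, current prediction equal to zero everywhere.
y_true = [[1.0, 2.0], [3.0, 6.0]]
y_pred = [[0.0, 0.0], [0.0, 0.0]]
step = newton_leaf_response(y_true, y_pred)
shrunk_step = newton_leaf_response(y_true, y_pred, lam=2.0)
```

With `lam = 0` the step equals the mean residual of the leaf; a positive `lam` shrinks it toward zero. Any other loss could be swapped in by replacing the two gradient and Hessian functions.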

Installation

pip install --upgrade git+https://github.com/supsi-dacd-isaac/mbtr.git

Usage

The MBT regressor follows the scikit-learn syntax for regressors. Creating a default instance and training it is as simple as:

from mbtr.mbtr import MBT
m = MBT().fit(x, y)

while predictions for the test set are obtained through

y_hat = m.predict(x_te)

The most important parameters are n_boosts, the number of boosts (that is, the number of fitted trees), learning_rate, and loss_type. An extensive explanation of the different parameters can be found in the documentation.
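The effect of the number of boosts and of the learning rate can be seen in a generic boosting loop. The sketch below uses decision stumps on a single feature and a squared-error loss; it is not MBTR's implementation, and all function names (`fit_stump`, `boost`, `predict`, `mse`) are hypothetical. It assumes the feature takes at least two distinct values.

```python
# Generic gradient boosting with decision stumps on a single feature.
# Illustrative only: this is not MBTR's implementation, and every name
# below is hypothetical.

def column_means(rows):
    n_targets = len(rows[0])
    return [sum(r[j] for r in rows) / len(rows) for j in range(n_targets)]


def fit_stump(x, residuals):
    """Pick the threshold whose two leaf means best fit all targets jointly."""
    best = None
    for thr in sorted(set(x))[:-1]:  # the largest value would leave the right leaf empty
        left = [r for xi, r in zip(x, residuals) if xi <= thr]
        right = [r for xi, r in zip(x, residuals) if xi > thr]
        ml, mr = column_means(left), column_means(right)
        err = sum((v - m) ** 2 for r in left for v, m in zip(r, ml))
        err += sum((v - m) ** 2 for r in right for v, m in zip(r, mr))
        if best is None or err < best[0]:
            best = (err, thr, ml, mr)
    return best[1:]


def boost(x, y, n_boosts=20, learning_rate=0.5):
    """Fit n_boosts stumps sequentially, each on the current residuals."""
    base = column_means(y)
    pred = [list(base) for _ in y]
    stumps = []
    for _ in range(n_boosts):
        residuals = [[t - p for t, p in zip(ti, pi)] for ti, pi in zip(y, pred)]
        thr, ml, mr = fit_stump(x, residuals)
        stumps.append((thr, ml, mr))
        for xi, pi in zip(x, pred):
            leaf = ml if xi <= thr else mr
            for j, v in enumerate(leaf):
                pi[j] += learning_rate * v  # damped update of the running prediction
    return {"base": base, "stumps": stumps, "learning_rate": learning_rate}


def predict(model, x):
    out = []
    for xi in x:
        p = list(model["base"])
        for thr, ml, mr in model["stumps"]:
            leaf = ml if xi <= thr else mr
            p = [pj + model["learning_rate"] * v for pj, v in zip(p, leaf)]
        out.append(p)
    return out


def mse(y, y_hat):
    n = len(y) * len(y[0])
    return sum((t - p) ** 2 for ti, pi in zip(y, y_hat) for t, p in zip(ti, pi)) / n


x = [0.0, 1.0, 2.0, 3.0]
y = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
err_1 = mse(y, predict(boost(x, y, n_boosts=1), x))
err_20 = mse(y, predict(boost(x, y, n_boosts=20), x))
```

On this toy data the training error after 20 boosts is lower than after a single boost. A smaller learning rate makes each tree contribute less, so more boosts are typically needed to reach the same fit.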

Documentation

Documentation and examples on the usage can be found at docs.

Reference

If you make use of this software for your work, we would appreciate it if you would cite us:

Lorenzo Nespoli and Vasco Medici (2020). Multivariate Boosted Trees and Applications to Forecasting and Control. arXiv preprint arXiv:2003.03835.

@article{nespoli2020multivariate,
  title={Multivariate Boosted Trees and Applications to Forecasting and Control},
  author={Nespoli, Lorenzo and Medici, Vasco},
  journal={arXiv preprint arXiv:2003.03835},
  year={2020}
}

Acknowledgments

The authors would like to thank the Swiss Federal Office of Energy (SFOE) and the Swiss Competence Center for Energy Research - Future Swiss Electrical Infrastructure (SCCER-FURIES), for their financial and technical support to this research work.

Comments

Is it possible to define a custom loss function?

Dear all, first, thank you for developing this tool, which I believe is of great interest. I am working with:

environmental variables (e.g. temperature, salinity)

multi-dimensional targets that are relative abundances, summing to 1 for each site

Therefore, I was wondering whether it is possible to implement a custom loss function in the mbtr framework that would be suited to proportions. Please note that I am quite new to Python.

To do some testing, I tried to duplicate the mse loss function under another name in the losses.py file and added the new loss to the LOSS_MAP in __init__.py. Then I compiled the files. However, I get this error when trying to run the model from the multi_reg.py example:

>>> m = MBT(loss_type = 'mse', n_boosts=30, min_leaf=100, lambda_weights=1e-3).fit(x_tr, y_tr, do_plot=True)
3%|▎ | 1/30 [00:03<01:45, 3.63s/it]
>>> m = MBT(loss_type = 'custom_mse', n_boosts=30, min_leaf=100, lambda_weights=1e-3).fit(x_tr, y_tr, do_plot=True)
0%| | 0/30 [00:00<?, ?it/s]
KeyError: 'custom_mse'

It seems that the new loss is not recognised in LOSS_MAP:

>>> LOSS_MAP = {'custom_mse': losses.custom_MSE,
... 'mse': losses.MSE,
... 'time_smoother': losses.TimeSmoother,
... 'latent_variable': losses.LatentVariable,
... 'linear_regression': losses.LinRegLoss,
... 'fourier': losses.FourierLoss,
... 'quantile': losses.QuantileLoss,
... 'quadratic_quantile': losses.QuadraticQuantileLoss}
AttributeError: module 'mbtr.losses' has no attribute 'custom_MSE'

I guess that I missed something when trying to duplicate and rename the mse loss. I would appreciate any help on whether defining a custom loss function is possible.

Best regards,

opened by alexschickele 2
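Both errors in this thread are what a name-to-class lookup table produces. A KeyError means the name is not a key of the dictionary that Python actually imported, and an AttributeError means the class is not defined in the module that Python actually imported. The sketch below mimics that pattern with hypothetical stand-ins: `MSE`, `CustomMSE`, `LOSS_MAP` and `make_loss` are defined here for illustration and are not MBTR's own objects. One possible cause, not confirmed in the thread, is that the edited files are not the copies that the installed package loads.

```python
# Hypothetical stand-ins: these classes and this LOSS_MAP only mimic the
# name-to-class lookup described above; they are not MBTR's own objects.

class MSE:
    def gradient(self, y, y_hat):
        """Gradient of 0.5 * (y_hat - y)^2 with respect to y_hat, per target."""
        return [p - t for t, p in zip(y, y_hat)]


class CustomMSE(MSE):
    """A renamed copy of MSE, as attempted in the question."""


LOSS_MAP = {"mse": MSE}


def make_loss(loss_type):
    # An unregistered name fails here with KeyError.
    return LOSS_MAP[loss_type]()


try:
    make_loss("custom_mse")
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Registering the class under the new name makes the lookup succeed.
LOSS_MAP["custom_mse"] = CustomMSE
loss = make_loss("custom_mse")
grad = loss.gradient([1.0, 2.0], [1.5, 1.0])
```

Whether MBTR supports registering a loss at runtime in this way is an assumption; the sketch only shows why the lookup fails until the class and its key are both visible to the running interpreter.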
Dataset cannot be reached

Hi, thank you for your effort in creating this. I want to try it, but I can neither download the dataset nor visit the website that you provided in the example multivariate_forecas.py.

Is there an alternative link for that dataset? Thank you, regards!

opened by kristfrizh 1

Error at import time with python 3.10.*

I want to use MBTR in a teaching module, and I need to use jupyter-lab inside a conda environment for teaching purposes. While MBTR works as expected in a vanilla Python 3.8, it errors out (on the same machine) in a conda environment using Python 3.10.

Steps to reproduce

conda create --name testenv
conda activate testenv

conda install -c conda-forge jupyterlab
pip install --upgrade git+https://github.com/supsi-dacd-isaac/mbtr.git
# to make sure to get the latest version; but the version on pypi gives the same error

Then

python

and in python

from mbtr.mbtr import MBT

which outputs the following error

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/myself/.conda/envs/testenv/lib/python3.10/site-packages/mbtr/mbtr.py", line 317, in <module>
    def leaf_stats(y, edges, x, order):
  File "/home/myself/.conda/envs/testenv/lib/python3.10/site-packages/numba/core/decorators.py", line 219, in wrapper
    disp.compile(sig)
  File "/home/myself/.conda/envs/testenv/lib/python3.10/site-packages/numba/core/dispatcher.py", line 965, in compile
    cres = self._compiler.compile(args, return_type)
  File "/home/myself/.conda/envs/testenv/lib/python3.10/site-packages/numba/core/dispatcher.py", line 129, in compile
    raise retval
  File "/home/myself/.conda/envs/testenv/lib/python3.10/site-packages/numba/core/dispatcher.py", line 139, in _compile_cached
    retval = self._compile_core(args, return_type)
  File "/home/myself/.conda/envs/testenv/lib/python3.10/site-packages/numba/core/dispatcher.py", line 152, in _compile_core
    cres = compiler.compile_extra(self.targetdescr.typing_context,
  File "/home/myself/.conda/envs/testenv/lib/python3.10/site-packages/numba/core/compiler.py", line 716, in compile_extra
    return pipeline.compile_extra(func)
  File "/home/myself/.conda/envs/testenv/lib/python3.10/site-packages/numba/core/compiler.py", line 452, in compile_extra
    return self._compile_bytecode()
  File "/home/myself/.conda/envs/testenv/lib/python3.10/site-packages/numba/core/compiler.py", line 520, in _compile_bytecode
    return self._compile_core()
  File "/home/myself/.conda/envs/testenv/lib/python3.10/site-packages/numba/core/compiler.py", line 499, in _compile_core
    raise e
  File "/home/myself/.conda/envs/testenv/lib/python3.10/site-packages/numba/core/compiler.py", line 486, in _compile_core
    pm.run(self.state)
  File "/home/myself/.conda/envs/testenv/lib/python3.10/site-packages/numba/core/compiler_machinery.py", line 368, in run
    raise patched_exception
  File "/home/myself/.conda/envs/testenv/lib/python3.10/site-packages/numba/core/compiler_machinery.py", line 356, in run
    self._runPass(idx, pass_inst, state)
  File "/home/myself/.conda/envs/testenv/lib/python3.10/site-packages/numba/core/compiler_lock.py", line 35, in _acquire_compile_lock
    return func(*args, **kwargs)
  File "/home/myself/.conda/envs/testenv/lib/python3.10/site-packages/numba/core/compiler_machinery.py", line 311, in _runPass
    mutated |= check(pss.run_pass, internal_state)
  File "/home/myself/.conda/envs/testenv/lib/python3.10/site-packages/numba/core/compiler_machinery.py", line 273, in check
    mangled = func(compiler_state)
  File "/home/myself/.conda/envs/testenv/lib/python3.10/site-packages/numba/core/typed_passes.py", line 105, in run_pass
    typemap, return_type, calltypes, errs = type_inference_stage(
  File "/home/myself/.conda/envs/testenv/lib/python3.10/site-packages/numba/core/typed_passes.py", line 83, in type_inference_stage
    errs = infer.propagate(raise_errors=raise_errors)
  File "/home/myself/.conda/envs/testenv/lib/python3.10/site-packages/numba/core/typeinfer.py", line 1086, in propagate
    raise errors[0]
numba.core.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
No conversion from UniTuple(none x 2) to UniTuple(array(float64, 2d, A) x 2) for '$116return_value.7', defined at None

File ".conda/envs/testenv/lib/python3.10/site-packages/mbtr/mbtr.py", line 327:
def leaf_stats(y, edges, x, order):
    <source elided>
        s_left, s_right = None, None
    return s_left, s_right
    ^

During: typing of assignment at /home/myself/.conda/envs/testenv/lib/python3.10/site-packages/mbtr/mbtr.py (327)

File ".conda/envs/test/lib/python3.10/site-packages/mbtr/mbtr.py", line 327:
def leaf_stats(y, edges, x, order):
    <source elided>
        s_left, s_right = None, None
    return s_left, s_right
    ^

Thanks in advance for any pointer/help. The course where I want to present this is a summer course and is closing in on me 😉

opened by jiho 0
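The TypingError above reports that a return statement yields a pair of None values where a pair of float64 2-D arrays is expected. One common way to keep return types uniform in type-inferred code is to return values of the same type on every path and to signal the degenerate case with a separate flag. The sketch below shows that idea in plain Python, without numba and with lists standing in for arrays. `split_sums` is a hypothetical function, not the `leaf_stats` from mbtr.py, whose body is elided in the traceback.

```python
# Plain-Python sketch, no numba required. `split_sums` is a hypothetical
# function, not the leaf_stats from mbtr.py, whose body is elided above.

def split_sums(y, go_left):
    """Sum the target rows on each side of a split.

    Both sums are always lists of floats, and a separate boolean reports
    whether the split is usable, so no code path returns None.
    """
    n_targets = len(y[0])
    s_left = [0.0] * n_targets
    s_right = [0.0] * n_targets
    n_left = 0
    for row, left in zip(y, go_left):
        target = s_left if left else s_right
        for j, v in enumerate(row):
            target[j] += v
        n_left += 1 if left else 0
    valid = 0 < n_left < len(y)  # False when one side of the split is empty
    return s_left, s_right, valid


y = [[1.0, 2.0], [3.0, 6.0], [5.0, 4.0]]
ok = split_sums(y, [True, False, False])
degenerate = split_sums(y, [False, False, False])
```

Whether this matches any fix adopted upstream is not known from the thread; the sketch only illustrates the type-stability idea behind the error message.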

Releases (v0.1.3)

v0.1.3 (Aug 10, 2022)

Corrected quantile error loss

v0.1.2 (Jul 20, 2022)

Corrected quadratic quantile loss function

v0.1.1 (May 18, 2020)

v0.1.0 (May 18, 2020)

This is the first release of mbtr

Owner

SUPSI-DACD-ISAAC

GitHub Repository

