Python-based GBDT implementation

Last update: Sep 21, 2022

Related tags

Machine Learning Py-Boost

Overview

Py-boost: a research tool for exploring GBDTs

Modern gradient boosting toolkits are very complex and are written in low-level programming languages. As a result,

It is hard to customize them to suit one’s needs
New ideas and methods are not easy to implement
It is difficult to understand how they work

Py-boost is a Python-based gradient boosting library which aims at overcoming the aforementioned problems.

Authors: Anton Vakhrushev, Leonid Iosipoi.

Py-boost Key Features

Simple. Py-boost is a simplified gradient boosting library, but it supports all the main features and hyperparameters available in other implementations.

Fast with GPU. Although Py-boost is written in Python, it runs only on GPU, using Python GPU libraries such as CuPy and Numba.

Easy to customize. Py-boost can be easily customized even if one is not familiar with GPU programming (just replace np with cp). What can be customized? Almost everything, via custom callbacks. Examples: row/column sampling strategy, training control, losses/metrics, multioutput handling strategy.
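The "replace np with cp" idea can be illustrated with a minimal sketch. It is not Py-boost's callback or loss API: the function name mse_grad_hess and the alias xp are hypothetical, chosen only to show that array code written against NumPy usually runs unchanged on CuPy arrays. The sketch falls back to NumPy when CuPy or a CUDA device is not available.

```python
# Pick the array module: CuPy on a working GPU setup, NumPy otherwise.
try:
    import cupy as xp
    xp.zeros(1)  # fails early if CuPy is installed but no usable GPU is present
except Exception:
    import numpy as xp


def mse_grad_hess(y_true, y_pred):
    """Gradient and hessian of 0.5 * (y_pred - y_true)**2 with respect to y_pred.

    Hypothetical helper for illustration; it only uses calls that exist
    in both NumPy and CuPy, so the same code serves CPU and GPU arrays.
    """
    grad = y_pred - y_true
    hess = xp.ones_like(y_pred)
    return grad, hess


y_true = xp.asarray([1.0, 2.0, 3.0])
y_pred = xp.asarray([1.5, 1.5, 3.0])
grad, hess = mse_grad_hess(y_true, y_pred)
```

How such a function would be plugged into Py-boost is covered by the project's own customization tutorial rather than by this sketch.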

Installation

Before installing py-boost via pip, you need CuPy installed. You can install both with:

pip install -U cupy-cuda110 py-boost

Note: replace cuda110 with the suffix matching your CUDA version! For details, see the CuPy installation guide.
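As a small illustration of the note above, the package suffix in the command follows the CUDA version (cuda110 stands for CUDA 11.0). The helper names below are hypothetical and only build the command string. The versioned naming scheme is an assumption based on the example command: newer CuPy releases ship wheels named cupy-cuda11x and cupy-cuda12x instead, so the result should be checked against the CuPy installation guide.

```python
def cupy_wheel_name(cuda_version: str) -> str:
    """Map a CUDA version such as '11.0' to a wheel name such as 'cupy-cuda110'.

    Assumes the older per-version naming scheme used in the command above.
    """
    parts = cuda_version.split(".")
    if len(parts) < 2:
        raise ValueError("expected a version like '11.0'")
    major, minor = parts[0], parts[1]
    return f"cupy-cuda{major}{minor}"


def pip_command(cuda_version: str) -> str:
    """Build the install command for CuPy plus py-boost."""
    return f"pip install -U {cupy_wheel_name(cuda_version)} py-boost"


command = pip_command("11.0")
```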

Quick tour

Py-boost is easy to use because its interface is similar to scikit-learn's. For usage examples, please see:

Tutorial_1_Basics for simple usage examples
Tutorial_2_Advanced_multioutput for advanced multioutput features
Tutorial_3_Custom_features for examples of customization

More examples are coming soon.
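A rough sketch of what the scikit-learn-like interface looks like in practice is given below. The class name GradientBoosting, the loss name 'mse', and the eval_sets argument are assumptions recalled from the project's tutorials, not confirmed by this page, so they should be checked against Tutorial_1_Basics. The data helper make_toy_regression is hypothetical. The Py-boost import is placed inside the function because it requires a CUDA GPU with CuPy and py-boost installed.

```python
import numpy as np


def make_toy_regression(n_rows=1000, n_features=10, n_targets=3, seed=42):
    """Random linear multi-target regression data (illustrative helper, not part of Py-boost)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_rows, n_features)).astype(np.float32)
    weights = rng.normal(size=(n_features, n_targets)).astype(np.float32)
    noise = rng.normal(size=(n_rows, n_targets)).astype(np.float32)
    y = X @ weights + 0.1 * noise
    return X, y


def fit_and_predict(X_train, y_train, X_valid, y_valid):
    """Assumed Py-boost usage; needs a CUDA GPU, CuPy and py-boost installed."""
    from py_boost import GradientBoosting  # lazy import: unavailable without a GPU setup

    model = GradientBoosting('mse')  # loss passed by name (assumption)
    model.fit(X_train, y_train, eval_sets=[{'X': X_valid, 'y': y_valid}])
    return model.predict(X_valid)


X, y = make_toy_regression()
X_train, X_valid = X[:800], X[800:]
y_train, y_valid = y[:800], y[800:]
# preds = fit_and_predict(X_train, y_train, X_valid, y_valid)  # run on a GPU machine
```

The fit/predict pair mirrors the scikit-learn estimator pattern, which is why existing scikit-learn code tends to need few changes.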

Other Sber AI Lab Projects

LightAutoML: https://github.com/sberbank-ai-lab/LightAutoML
AutoWoE: https://github.com/sberbank-ai-lab/AutoMLWhitebox
RePlay: https://github.com/sberbank-ai-lab/RePlay

Owner

Sberbank AI Lab
