Scikit-Learn useful pre-defined Pipelines Hub

Last update: Apr 26, 2022

Overview

Scikit-Pipes

Usage:

Install scikit-pipes

It's advised to install scikit-pipes in a virtual environment. Inside the environment, run:

pip install scikit-pipes

Example: Simple Preprocessing

import pandas as pd
import numpy as np
from skpipes.pipeline import SkPipeline

data = [{"x1": 1, "x2": 400, "x3": np.nan},
        {"x1": 4.8, "x2": 250, "x3": 50},
        {"x1": 3, "x2": 140, "x3": 43},
        {"x1": 1.4, "x2": 357, "x3": 75},
        {"x1": 2.4, "x2": np.nan, "x3": 42},
        {"x1": 4, "x2": 287, "x3": 21}]

df = pd.DataFrame(data)

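# Load a pre-defined pipeline by name. Judging by the name, this one
# applies median imputation followed by min-max scaling.
pipe = SkPipeline(name='imputer_median-minmax',
                  data_type="numerical")
pipe.steps   # inspect the steps of the pipeline
str(pipe)    # string representation of the pipeline

# Fit and transform in two calls...
pipe.fit(df)
pipe.transform(df)
# ...or in a single call
pipe.fit_transform(df)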
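
For comparison, here is a minimal sketch of roughly the same preprocessing written with plain scikit-learn. It assumes that the name 'imputer_median-minmax' means median imputation followed by min-max scaling. The steps actually built by scikit-pipes may differ, so check pipe.steps to confirm.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Same toy data as in the example above
data = [{"x1": 1, "x2": 400, "x3": np.nan},
        {"x1": 4.8, "x2": 250, "x3": 50},
        {"x1": 3, "x2": 140, "x3": 43},
        {"x1": 1.4, "x2": 357, "x3": 75},
        {"x1": 2.4, "x2": np.nan, "x3": 42},
        {"x1": 4, "x2": 287, "x3": 21}]
df = pd.DataFrame(data)

# Assumed equivalent of 'imputer_median-minmax' (not taken from scikit-pipes):
# fill missing values with the column median, then rescale each column to [0, 1]
manual_pipe = Pipeline(steps=[
    ("imputer", SimpleImputer(strategy="median")),
    ("scaler", MinMaxScaler()),
])

result = manual_pipe.fit_transform(df)  # NumPy array of shape (6, 3)
print(result)
```

With this data, the missing x2 value is filled with the column median 287 and the missing x3 value with 43 before scaling.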

Changelog

See the changelog for notes on the changes in scikit-pipes.

Important links

Official source code repo: https://github.com/rodrigo-arenas/scikit-pipes/
Download releases: https://pypi.org/project/scikit-pipes/
Issue tracker: https://github.com/rodrigo-arenas/scikit-pipes/issues
Stable documentation: https://scikit-pipes.readthedocs.io/en/stable/

Source code

You can check out the latest development version with the command:

git clone https://github.com/rodrigo-arenas/scikit-pipes.git

Install the development dependencies:

pip install -r dev-requirements.txt

Check the latest in-development documentation: https://scikit-pipes.readthedocs.io/en/latest/

Testing

After installation, you can launch the test suite from outside the source directory:

pytest skpipes

Owner

Rodrigo Arenas