Accelerating model creation and evaluation.

Last update: Dec 06, 2021

Overview

EmeraldML

A machine learning library that streamlines
(1) cleaning and splitting data,
(2) training, optimizing, and testing various models based on the task, and
(3) scoring and ranking them
during the exploratory phase, giving a first-pass analysis of which models perform best on a specific dataset.
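
To make the three steps above concrete, here is a minimal sketch of the same workflow written directly against scikit-learn. It is not EmeraldML's implementation: the synthetic dataset, the three candidate models, and the small hyperparameter grids are all assumptions chosen for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for an already cleaned dataset (assumption, not project data).
X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=3)

# (1) Split the data into training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=3
)

# (2) Candidate models with small, illustrative hyperparameter grids.
candidates = {
    "linear": (LinearRegression(), {}),
    "tree": (DecisionTreeRegressor(random_state=3), {"max_depth": [5, 10, 15]}),
    "knn": (KNeighborsRegressor(), {"n_neighbors": [3, 5, 7]}),
}

ladder = []
for name, (estimator, grid) in candidates.items():
    # Tune each model with 3-fold cross-validation on the training set.
    search = GridSearchCV(estimator, grid, cv=3)
    search.fit(X_train, y_train)
    # (3) Score the tuned model on the held-out test set (R^2 for regressors).
    ladder.append((name, search.score(X_test, y_test)))

# Rank the models from best to worst score.
ladder.sort(key=lambda pair: pair[1], reverse=True)
for name, score in ladder:
    print(f"{name}: {score:.3f}")
```

Scoring on a held-out test set rather than on the training data keeps the ranking from rewarding models that merely memorize the training rows.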

Installation

Dependencies

Python (>= 3.7)
NumPy (>= 1.21.2)
pandas (>= 1.3.3)
scikit-learn (>= 0.24.2)
statsmodels (>= 0.12.2)
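
If it is unclear whether an existing environment satisfies these minimums, a short check along the following lines may help. This is a sketch, not part of EmeraldML: the helper names `version_tuple`, `meets_minimum`, and `check_installed` are hypothetical, and the comparison only looks at the leading numeric part of each version component.

```python
import re

# Minimum versions copied from the dependency list above.
MINIMUMS = {
    "numpy": "1.21.2",
    "pandas": "1.3.3",
    "scikit-learn": "0.24.2",
    "statsmodels": "0.12.2",
}


def version_tuple(version):
    """Turn '1.21.2' (or '1.3.0rc1') into a tuple of leading integers."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, required):
    """True if the installed version is at least the required one."""
    return version_tuple(installed) >= version_tuple(required)


def check_installed():
    """Return {package: (installed_version_or_None, ok)} for each dependency."""
    try:
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:  # importlib.metadata was added in Python 3.8
        return {}
    report = {}
    for name, required in MINIMUMS.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = (None, False)
        else:
            report[name] = (installed, meets_minimum(installed, required))
    return report


for name, (installed, ok) in check_installed().items():
    status = "ok" if ok else "needs >= " + MINIMUMS[name]
    print(f"{name}: {installed} ({status})")
```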

User installation

pip install emeraldml

Development

Source code

You can check out the latest sources with the command:

git clone https://github.com/yu3ufff/emeraldml.git

Demo

Getting the data:

import pandas as pd
audi = pd.read_csv('audi.csv')
audi.head()

|    | model   |   year |   price | transmission   |   mileage | fuelType   |   tax |   mpg |   engineSize |
|---:|:--------|-------:|--------:|:---------------|----------:|:-----------|------:|------:|-------------:|
|  0 | A1      |   2017 |   12500 | Manual         |     15735 | Petrol     |   150 |  55.4 |          1.4 |
|  1 | A6      |   2016 |   16500 | Automatic      |     36203 | Diesel     |    20 |  64.2 |          2   |
|  2 | A1      |   2016 |   11000 | Manual         |     29946 | Petrol     |    30 |  55.4 |          1.4 |
|  3 | A4      |   2017 |   16800 | Automatic      |     25952 | Diesel     |   145 |  67.3 |          2   |
|  4 | A3      |   2019 |   17300 | Manual         |      1998 | Petrol     |   145 |  49.6 |          1   |

Using EmeraldML:

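import emerald
from emerald.boa import RegressionBoa

# Fix the seed so results are reproducible
rboa = RegressionBoa(random_state=3)
# Run the search over candidate models, using 'price' as the target column
rboa.hunt(data=audi, target='price')
# Inspect the ranked models and their scores, best first
rboa.ladder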

[(OptimalRFRegressor, 0.9624889664024406),
 (OptimalDTRegressor, 0.9514992411732952),
 (OptimalKNRegressor, 0.9511411883559433),
 (OptimalLinearRegression, 0.8876961846248467),
 (OptimalABRegressor, 0.8491539140007975)]
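
The ladder above reads as a list of (model, score) pairs sorted from best to worst. The sketch below works on a copy of that output to show how such a list can be inspected. The scores are taken from the output above, while the string names are stand-ins for the model objects that `rboa.ladder` appears to hold, which is an assumption.

```python
# Scores copied from the ladder printed above; the strings are stand-ins
# for the model objects (assumption about what rboa.ladder contains).
ladder = [
    ("OptimalRFRegressor", 0.9624889664024406),
    ("OptimalDTRegressor", 0.9514992411732952),
    ("OptimalKNRegressor", 0.9511411883559433),
    ("OptimalLinearRegression", 0.8876961846248467),
    ("OptimalABRegressor", 0.8491539140007975),
]

# The first entry is the best-scoring model.
best_name, best_score = ladder[0]

# Confirm the list is ordered from highest to lowest score.
is_ranked = all(a[1] >= b[1] for a, b in zip(ladder, ladder[1:]))

# Show how far each model trails the leader.
for name, score in ladder:
    print(f"{name:<24} {score:.4f}  gap to best: {best_score - score:.4f}")
```

Looking at the gaps rather than only the order helps during exploration: here the second and third entries are nearly tied, so their relative rank may not be stable across different seeds.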

# Print the tuned estimator behind each ladder entry, in rank order
for i in range(len(rboa)):
    print(rboa.model(i))

RandomForestRegressor(min_samples_split=5, n_estimators=500, random_state=3)
DecisionTreeRegressor(max_depth=15, min_samples_split=10, random_state=3)
KNeighborsRegressor(n_neighbors=3, p=1)
LinearRegression()
AdaBoostRegressor(learning_rate=0.1, n_estimators=100, random_state=3)
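
The printed lines appear to be ordinary scikit-learn estimator reprs, which by default list only the parameters that differ from scikit-learn's defaults. Assuming that is the case, a model from the ladder can be re-created outside EmeraldML by pasting its repr, as in this sketch for the top-ranked entry:

```python
from sklearn.ensemble import RandomForestRegressor

# Re-create the top-ranked model from its printed repr above.
model = RandomForestRegressor(min_samples_split=5, n_estimators=500, random_state=3)

# get_params() exposes the full configuration, including defaults
# that the short repr leaves out.
params = model.get_params()
print(params["n_estimators"], params["min_samples_split"], params["random_state"])
```

The re-created estimator is unfitted, so it would need to be trained again with `fit` before making predictions.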


Owner

Yusuf
