Accelerating model creation and evaluation.

Last update: Dec 06, 2021

Overview

EmeraldML

A machine learning library for streamlining the process of
(1) cleaning and splitting data,
(2) training, optimizing, and testing various models based on the task, and
(3) scoring and ranking them
during the exploratory phase for an elementary analysis of which models perform better for a specific dataset.

Installation

Dependencies

Python (>= 3.7)
NumPy (>= 1.21.2)
pandas (>= 1.3.3)
scikit-learn (>= 0.24.2)
statsmodels (>= 0.12.2)

User installation

pip install emeraldml

Development

Source code

You can check the latest sources with the command:

git clone https://github.com/yu3ufff/emeraldml.git

Demo

Getting the data:

import pandas as pd
audi = pd.read_csv('audi.csv')
audi.head()

|    | model   |   year |   price | transmission   |   mileage | fuelType   |   tax |   mpg |   engineSize |
|---:|:--------|-------:|--------:|:---------------|----------:|:-----------|------:|------:|-------------:|
|  0 | A1      |   2017 |   12500 | Manual         |     15735 | Petrol     |   150 |  55.4 |          1.4 |
|  1 | A6      |   2016 |   16500 | Automatic      |     36203 | Diesel     |    20 |  64.2 |          2   |
|  2 | A1      |   2016 |   11000 | Manual         |     29946 | Petrol     |    30 |  55.4 |          1.4 |
|  3 | A4      |   2017 |   16800 | Automatic      |     25952 | Diesel     |   145 |  67.3 |          2   |
|  4 | A3      |   2019 |   17300 | Manual         |      1998 | Petrol     |   145 |  49.6 |          1   |

Using EmeraldML:

import emerald
from emerald.boa import RegressionBoa

rboa = RegressionBoa(random_state=3)
rboa.hunt(data=audi, target='price')
rboa.ladder

[(OptimalRFRegressor, 0.9624889664024406),
 (OptimalDTRegressor, 0.9514992411732952),
 (OptimalKNRegressor, 0.9511411883559433),
 (OptimalLinearRegression, 0.8876961846248467),
 (OptimalABRegressor, 0.8491539140007975)]

for i in range(len(rboa)):
    print(rboa.model(i))

RandomForestRegressor(min_samples_split=5, n_estimators=500, random_state=3)
DecisionTreeRegressor(max_depth=15, min_samples_split=10, random_state=3)
KNeighborsRegressor(n_neighbors=3, p=1)
LinearRegression()
AdaBoostRegressor(learning_rate=0.1, n_estimators=100, random_state=3)

Accelerating model creation and evaluation.

Related tags

Overview

EmeraldML

Installation

Dependencies

User installation

Development

Source code

Demo

Owner

Yusuf

AutoTabular automates machine learning tasks enabling you to easily achieve strong predictive performance in your applications.

Automated machine learning: Review of the state-of-the-art and opportunities for healthcare

GRaNDPapA: Generator of Rad Names from Decent Paper Acronyms

Distributed Evolutionary Algorithms in Python

A visual dataflow programming language for sklearn

QuickAI is a Python library that makes it extremely easy to experiment with state-of-the-art Machine Learning models.

A Collection of Conference & School Notes in Machine Learning 🦄📝🎉

Model factory is a ML training platform to help engineers to build ML models at scale

A machine learning web application for binary classification using streamlit

Responsible AI Workshop: a series of tutorials & walkthroughs to illustrate how put responsible AI into practice

Open source time series library for Python

MasTrade is a trading bot in baselines3,pytorch,gym

Self Organising Map (SOM) for clustering of atomistic samples through unsupervised learning.

2D fluid simulation implementation of Jos Stam paper on real-time fuild dynamics, including some suggested extensions.

List of Data Science Cheatsheets to rule the world

Dieses Projekt ermöglicht es den Smartmeter der EVN (Netz Niederösterreich) über die Kundenschnittstelle auszulesen.

MegFlow - Efficient ML solutions for long-tailed demands.

MICOM is a Python package for metabolic modeling of microbial communities

Interactive Web App with Streamlit and Scikit-learn that applies different Classification algorithms to popular datasets

scikit-multimodallearn is a Python package implementing algorithms multimodal data.