A simple and lightweight genetic algorithm for optimization of any machine learning model

Overview

geneticml

Actions Status CodeQL PyPI License

This package contains a simple and lightweight genetic algorithm for optimization of any machine learning model.

Installation

Use pip to install the package from PyPI:

pip install geneticml

Usage

This package provides a easy way to create estimators and perform the optimization with genetic algorithms. The example below describe in details how to create a simulation with genetic algorithms using evolutionary approach to train a sklearn.neural_network.MLPClassifier. A full list of examples could be found here.

from geneticml.optimizers import GeneticOptimizer
from geneticml.strategy import EvolutionaryStrategy
from geneticml.algorithms import EstimatorBuilder
from metrics import metric_accuracy
from sklearn.neural_network import MLPClassifier
from sklearn.datasets import load_iris

# Creates a custom fit method
def fit(model, x, y):
    return model.fit(x, y)

# Creates a custom predict method
def predict(model, x):
    return model.predict(x)

if __name__ == "__main__":

    seed = 11412

    # Creates an estimator
    estimator = EstimatorBuilder()\
        .of(model_type=MLPClassifier)\
        .fit_with(func=fit)\
        .predict_with(func=predict)\
        .build()

    # Defines a strategy for the optimization
    strategy = EvolutionaryStrategy(
        estimator_type=estimator,
        parameters=parameters,
        retain=0.4,
        random_select=0.1,
        mutate_chance=0.2,
        max_children=2,
        random_state=seed
    )

    # Creates the optimizer
    optimizer = GeneticOptimizer(strategy=strategy)

    # Loads the data
    data = load_iris()

    # Defines the metric
    metric = metric_accuracy
    greater_is_better = True

    # Create the simulation using the optimizer and the strategy
    models = optimizer.simulate(
        data=data.data, 
        target=data.target,
        generations=generations,
        population=population,
        evaluation_function=metric,
        greater_is_better=greater_is_better,
        verbose=True
    )

The estimator is the way you define an algorithm or a class that will be used for model instantiation

estimator = EstimatorBuilder().of(model_type=MLPClassifier).fit_with(func=fit).predict_with(func=predict).build()

You need to speficy a custom fit and predict functions. These functions need to use the same signature than the below ones. This happens because the algorithm is generic and needs to know how to perform the fit and predict functions for the models.

# Creates a custom fit method
def fit(model, x, y):
    return model.fit(x, y)

# Creates a custom predict method
def predict(model, x):
    return model.predict(x)

Custom strategy

You can create custom strategies for the optimizers by extending the geneticml.strategy.BaseStrategy and implementing the execute(...) function.

class MyCustomStrategy(BaseStrategy):
    def __init__(self, estimator_type: Type[BaseEstimator]) -> None:
        super().__init__(estimator_type)

    def execute(self, population: List[Type[T]]) -> List[T]:
        return population

The custom strategies will allow you to create optimization strategies to archive your goals. We currently have the evolutionary strategy but you can define your own :)

Custom optimizer

You can create custom optimizers by extending the geneticml.optimizers.BaseOptimizer and implementing the simulate(...) function.

class MyCustomOptimizer(BaseOptimizer):
    def __init__(self, strategy: Type[BaseStrategy]) -> None:
        super().__init__(strategy)

    def simulate(self, data, target, verbose: bool = True) -> List[T]:
        """
        Generate a network with the genetic algorithm.

        Parameters:
            data (?): The data used to train the algorithm
            target (?): The targets used to train the algorithm
            verbose (bool): True if should verbose or False if not

        Returns:
            (List[BaseEstimator]): A list with the final population sorted by their loss
        """
        estimators = self._strategy.create_population()
        for x in estimators:
            x.fit(data, target)
            y_pred = x.predict(target)
        pass 

Custom optimizers will let you define how you want your algorithm to optimize the selected strategy. You can also combine custom strategies and optimizers to archive your desire objective.

Testing

The following are the steps to create a virtual environment into a folder named "venv" and install the requirements.

# Create virtualenv
python3 -m venv venv
# activate virtualenv
source venv/bin/activate
# update packages
pip install --upgrade pip setuptools wheel
# install requirements
python setup.py install

Tests can be run with python setup.py test when the virtualenv is active.

Contributing

All contributions, bug reports, bug fixes, documentation improvements, enhancements, and ideas are welcome.

A detailed overview on how to contribute can be found in the contributing guide. There is also an overview on GitHub.

If you are simply looking to start working with the geneticml codebase, navigate to the GitHub "issues" tab and start looking through interesting issues. Or maybe through using geneticml you have an idea of your own or are looking for something in the documentation and thinking ‘this can be improved’...you can do something about it!

Feel free to ask questions on the mailing the contributors.

Changelog

1.0.3 - Included pytorch example

1.0.2 - Minor fixes on naming

1.0.1 - README fixes

1.0.0 - First release

You might also like...
A very simple tool for situations where optimization with onnx-simplifier would exceed the Protocol Buffers upper file size limit of 2GB, or simply to separate onnx files to any size you want.
A very simple tool for situations where optimization with onnx-simplifier would exceed the Protocol Buffers upper file size limit of 2GB, or simply to separate onnx files to any size you want.

sne4onnx A very simple tool for situations where optimization with onnx-simplifier would exceed the Protocol Buffers upper file size limit of 2GB, or

library for nonlinear optimization, wrapping many algorithms for global and local, constrained or unconstrained, optimization

NLopt is a library for nonlinear local and global optimization, for functions with and without gradient information. It is designed as a simple, unifi

A Lightweight Hyperparameter Optimization Tool 🚀
A Lightweight Hyperparameter Optimization Tool 🚀

Lightweight Hyperparameter Optimization 🚀 The mle-hyperopt package provides a simple and intuitive API for hyperparameter optimization of your Machin

A Genetic Programming platform for Python with TensorFlow for wicked-fast CPU and GPU support.

Karoo GP Karoo GP is an evolutionary algorithm, a genetic programming application suite written in Python which supports both symbolic regression and

Simulate genealogical trees and genomic sequence data using population genetic models

msprime msprime is a population genetics simulator based on tskit. Msprime can simulate random ancestral histories for a sample of individuals (consis

MBPO (paper: When to trust your model: Model-based policy optimization) in offline RL settings

offline-MBPO This repository contains the code of a version of model-based RL algorithm MBPO, which is modified to perform in offline RL settings Pape

RoMA: Robust Model Adaptation for Offline Model-based Optimization

RoMA: Robust Model Adaptation for Offline Model-based Optimization Implementation of RoMA: Robust Model Adaptation for Offline Model-based Optimizatio

Ever felt tired after preprocessing the dataset, and not wanting to write any code further to train your model? Ever encountered a situation where you wanted to record the hyperparameters of the trained model and able to retrieve it afterward? Models Playground is here to help you do that. Models playground allows you to train your models right from the browser.
Comments
  • feature/data_sampling

    feature/data_sampling

    We added support to run your own data sampling (e.g., imblearn.SMOTE) and use the genetic algorithms to find the best set parameters for them. Also, you can find the best set of parameters for your machine learning model at same time that find the best minority class size that maximizes the model score

    opened by albarsil 0
Releases(1.0.8)
Owner
Allan Barcelos
Lead Data Scientist, Conference Speaker, Startup Mentor and AI Consultant
Allan Barcelos
DeepCAD: A Deep Generative Network for Computer-Aided Design Models

DeepCAD This repository provides source code for our paper: DeepCAD: A Deep Generative Network for Computer-Aided Design Models Rundi Wu, Chang Xiao,

Rundi Wu 85 Dec 31, 2022
ICON: Implicit Clothed humans Obtained from Normals

ICON: Implicit Clothed humans Obtained from Normals arXiv, December 2021. Yuliang Xiu · Jinlong Yang · Dimitrios Tzionas · Michael J. Black Table of C

Yuliang Xiu 1.1k Dec 30, 2022
Improving Transferability of Representations via Augmentation-Aware Self-Supervision

Improving Transferability of Representations via Augmentation-Aware Self-Supervision Accepted to NeurIPS 2021 TL;DR: Learning augmentation-aware infor

hankook 38 Sep 16, 2022
Img-process-manual - Utilize Python Numpy and Matplotlib to realize OpenCV baisc image processing function

Img-process-manual - Opencv Library basic graphic processing algorithm coding reproduction based on Numpy and Matplotlib library

Jack_Shaw 2 Dec 12, 2022
SeMask: Semantically Masked Transformers for Semantic Segmentation.

SeMask: Semantically Masked Transformers Jitesh Jain, Anukriti Singh, Nikita Orlov, Zilong Huang, Jiachen Li, Steven Walton, Humphrey Shi This repo co

Picsart AI Research (PAIR) 186 Dec 30, 2022
Pytorch implementation for RelTransformer

RelTransformer Our Architecture This is a Pytorch implementation for RelTransformer The implementation for Evaluating on VG200 can be found here Requi

Vision CAIR Research Group, KAUST 21 Nov 22, 2022
Code & Experiments for "LILA: Language-Informed Latent Actions" to be presented at the Conference on Robot Learning (CoRL) 2021.

LILA LILA: Language-Informed Latent Actions Code and Experiments for Language-Informed Latent Actions (LILA), for using natural language to guide assi

Sidd Karamcheti 11 Nov 25, 2022
This is the repository for Learning to Generate Piano Music With Sustain Pedals

SusPedal-Gen This is the official repository of Learning to Generate Piano Music With Sustain Pedals Demo Page Dataset The dataset used in this projec

Joann Ching 12 Sep 02, 2022
An ML & Correlation platform for transforming disparate data points of interest into usable intelligence.

SSIDprobeCollector An ML & Correlation platform for transforming disparate data points of interest into usable intelligence. At a High level the platf

Bill Reyor 1 Jan 30, 2022
A library for preparing, training, and evaluating scalable deep learning hybrid recommender systems using PyTorch.

collie Collie is a library for preparing, training, and evaluating implicit deep learning hybrid recommender systems, named after the Border Collie do

ShopRunner 96 Dec 29, 2022
QAHOI: Query-Based Anchors for Human-Object Interaction Detection (paper)

QAHOI QAHOI: Query-Based Anchors for Human-Object Interaction Detection (paper) Requirements PyTorch = 1.5.1 torchvision = 0.6.1 pip install -r requ

38 Dec 29, 2022
Ros2-voiceroid2 - ROS2 wrapper package of VOICEROID2

ros2_voiceroid2 ROS2 wrapper package of VOICEROID2 Windows Only Installation Ins

Nkyoku 1 Jan 23, 2022
Python based Advanced AI Assistant

Knick is a virtual artificial intelligence project, fully developed in python. The objective of this project is to develop a virtual assistant that can handle our minor, intermediate as well as heavy

19 Nov 15, 2022
Generalized Matrix Means for Semi-Supervised Learning with Multilayer Graphs

Generalized Matrix Means for Semi-Supervised Learning with Multilayer Graphs MATLAB implementation of the paper: P. Mercado, F. Tudisco, and M. Hein,

Pedro Mercado 6 May 26, 2022
A fuzzing framework for SMT solvers

yinyang A fuzzing framework for SMT solvers. Given a set of seed SMT formulas, yinyang generates mutant formulas to stress-test SMT solvers. yinyang c

Project Yin-Yang for SMT Solver Testing 145 Jan 04, 2023
A Tensorflow implementation of BicycleGAN.

BicycleGAN implementation in Tensorflow As part of the implementation series of Joseph Lim's group at USC, our motivation is to accelerate (or sometim

Cognitive Learning for Vision and Robotics (CLVR) lab @ USC 97 Dec 02, 2022
Pytorch Lightning Implementation of SC-Depth Methods.

SC_Depth_pl: This is a pytorch lightning implementation of SC-Depth (V1, V2) for self-supervised learning of monocular depth from video. In the V1 (IJ

JiaWang Bian 216 Dec 30, 2022
Diabet Feature Engineering - Predict whether people have diabetes when their characteristics are specified

Diabet Feature Engineering - Predict whether people have diabetes when their characteristics are specified

Şebnem 6 Jan 18, 2022
Benchmarks for the Optimal Power Flow Problem

Power Grid Lib - Optimal Power Flow This benchmark library is curated and maintained by the IEEE PES Task Force on Benchmarks for Validation of Emergi

A Library of IEEE PES Power Grid Benchmarks 207 Dec 08, 2022
Yet Another Reinforcement Learning Tutorial

This repo contains self-contained RL implementations

Sungjoon 65 Dec 10, 2022