Overview



Evol is a clear DSL for composable evolutionary algorithms, optimised for joy.

Installation

We currently support Python 3.6 and 3.7, and you can install the package via pip.

pip install evol

Documentation

For more details you can read the docs, but we advise everyone to get started by first checking out the examples in the /examples directory. These standalone examples should convey the spirit of usage better than the docs do.

The Gist

The main idea is that you should be able to define a complex algorithm in a composable way. To explain what we mean by this, let's consider two evolutionary algorithms for travelling salesman problems.

The first approach takes a collection of solutions and applies the following steps:

  1. a survival step in which only the top 50% of solutions survive
  2. the population reproduces using a crossover of genes
  3. certain members mutate
  4. repeat this, maybe 1000 times or more!


We can also think of another approach:

  1. pick the best solution of the population
  2. make random changes to this parent and generate new solutions
  3. repeat this, maybe 1000 times or more!


One could even combine the two algorithms into a new one:

  1. run algorithm 1 50 times
  2. run algorithm 2 10 times
  3. repeat this, maybe 1000 times or more!


You might notice that many parts of these algorithms are similar; it is the goal of this library to automate those parts. We hope to provide an API that is fun to use and makes it easy to tweak your heuristics.

A working example of something similar to what is depicted above is shown below. You can also find this code as an example in /examples/simple_nonlinear.py.

import random
from evol import Population, Evolution

random.seed(42)

def random_start():
    """
    This function generates a random (x,y) coordinate
    """
    return (random.random() - 0.5) * 20, (random.random() - 0.5) * 20

def func_to_optimise(xy):
    """
    This is the function we want to optimise (maximize)
    """
    x, y = xy
    return -(1-x)**2 - 2*(2-x**2)**2

def pick_random_parents(pop):
    """
    This is how we are going to select parents from the population
    """
    mom = random.choice(pop)
    dad = random.choice(pop)
    return mom, dad

def make_child(mom, dad):
    """
    This function describes how two candidates combine into a new candidate
    Note that the output is a tuple, just like the output of `random_start`
    We leave it to the developer to ensure that chromosomes are of the same type
    """
    child_x = (mom[0] + dad[0])/2
    child_y = (mom[1] + dad[1])/2
    return child_x, child_y

def add_noise(chromosome, sigma):
    """
    This is a function that will add some noise to the chromosome.
    """
    new_x = chromosome[0] + (random.random()-0.5) * sigma
    new_y = chromosome[1] + (random.random()-0.5) * sigma
    return new_x, new_y

# We start by defining a population with candidates.
pop = Population(chromosomes=[random_start() for _ in range(200)],
                 eval_function=func_to_optimise, maximize=True)

# We define a sequence of steps to change these candidates
evo1 = (Evolution()
       .survive(fraction=0.5)
       .breed(parent_picker=pick_random_parents, combiner=make_child)
       .mutate(func=add_noise, sigma=1))

# We define another sequence of steps to change these candidates
evo2 = (Evolution()
       .survive(n=1)
       .breed(parent_picker=pick_random_parents, combiner=make_child)
       .mutate(func=add_noise, sigma=0.2))

# We are combining two evolutions into a third one. You don't have to,
# but this approach demonstrates the flexibility of the library.
evo3 = (Evolution()
       .repeat(evo1, n=50)
       .repeat(evo2, n=10)
       .evaluate())

# In this step we are telling evol to apply the evolutions
# to the population of candidates.
pop = pop.evolve(evo3, n=5)
print(f"the best score found: {max([i.fitness for i in pop])}")

Getting Started

The best place to get started is the /examples folder on GitHub. This folder contains self-contained examples that work out of the box.

How does it compare to ...

  • ... deap? We think our library is more composable and Pythonic while not removing any functionality. Our library may be a bit slower, though.
  • ... hyperopt? Since we force the user to write the actual algorithm, we are less of a black box. Hyperopt is meant for hyperparameter tuning in machine learning and has better support for search in scikit-learn.
  • ... inspyred? The library offers a simple way to get started, but the project seems less actively maintained than ours.
Comments
  • How to set up evolution steps

    The evolution will contain a chain of steps, which can be applied to a population in sequence. In principle these steps need be no more than functions - we could simply do

    for step in self.chain:
        population = step(population)
    

    These step functions would simply look like:

    def step(population):
        return population.evaluate()
    

    but we shouldn't replace them with lambdas, for the sake of pickling.

    In order to provide arguments to the population methods inside such functions, we could re-define the functions with the proper arguments each time we use them.

    Now on the other hand it would be nice to be able to give a name to such a step, for example to identify the step in logging or to make it easier to debug. We could achieve that by creating a namedtuple with name and function fields. Once we have a namedtuple, we could decide to put the arguments not baked into the function, but in their own field. The application of the steps would then look like

    for step in self.chain:
        population = step.function(population, **step.kwargs)
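
    For concreteness, a minimal sketch of such a namedtuple-based step might look like this (the Step name and the example step are illustrative, not the final API):

    from collections import namedtuple

    # name identifies the step (e.g. in logging); kwargs holds the arguments for the call
    Step = namedtuple('Step', ['name', 'function', 'kwargs'])

    def apply_evaluate(population, **kwargs):
        # a plain function rather than a lambda, so the step remains picklable
        return population.evaluate(**kwargs)

    step = Step(name='evaluate', function=apply_evaluate, kwargs={'lazy': False})
    # applied as: population = step.function(population, **step.kwargs)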
    

    We can take it one step further by defining an EvolutionStep class and making a class for each of the step types that inherits from it. This would look like:

    class EvolutionStep:
    
        def __init__(self, name, **kwargs):
            self.name = name
            self.kwargs = kwargs
    
    
    class EvaluationStep(EvolutionStep):
    
        def __init__(self, name, lazy=False):
            EvolutionStep.__init__(self, name=name, lazy=lazy)
    
        def apply(self, population):
            return population.evaluate(**self.kwargs)
    

    which has the very nice property that the arguments are well taken care of, as the function is only defined once (as a method on the step class). In this case the Evolution.evaluate method can look as simple as

    def evaluate(self):
        result = copy(self)
        result.chain.append(EvaluationStep(name='evaluate'))
        return result
    

    On the other hand we will need to define a separate Step class for each operation.
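
    For instance, a mutate step could be sketched along the same lines (illustrative only):

    class MutateStep(EvolutionStep):

        def apply(self, population):
            # forward the stored arguments to the population's mutate method
            return population.mutate(**self.kwargs)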

    @koaning, any preference?

    opened by rogiervandergeer 16
  • Islands

    I've created a pull request for the islands (#37), but we still need to do some work before we can merge this. In this issue I'll explain some of the changes I've made, and I have a few design decisions to make.

    Let's start at the beginning; we have a Population, which is a collection of Individuals on which we can perform a multitude of operations (evaluate, mutate, etc). The Population can also take an Evolution object and perform a series of such operations.

    Now we want to introduce islands, where the population is split into several subgroups, on each of which we can perform the same operations as on the Population. So we add two (or three, see below) operations: split and combine in which we can split a population in multiple islands and then merge these islands back into a single population.

    I have assumed that we will always want the islands to go through the same evolution. I.e. if we split a population into two islands and then apply an evolution on those islands, the evolution will be the same for both islands. Note that we can pass an island-id to the combiner, picker and mutator functions so that these behave differently on each island - but we assume that the order of the steps is the same.

    Question 1: is that a reasonable assumption?

    If we always apply an operation to all islands at the same time, then the interface to the islands will be the same as that of the Population: calling mutate on a group of islands will simply call mutate on each island. The same goes for all other operations (except split and combine). Hence a group of islands behaves exactly the same as a Population, and we can apply the same Evolution to each (as long as we do not use split or combine). Since the interface to a group of islands is so close to that of a Population, it makes sense to name the group of islands an IslandPopulation.

    Now we have Population, which is a collection of Individuals, and IslandPopulation, which has the same interface as Population (except, again, for split and combine) and which is a collection of Populations. Since Population and IslandPopulation have mostly the same interface it makes sense for them to be of the same type, but as IslandPopulation contains Populations, they cannot inherit from each other. Therefore I've created an abstract base class called PopulationBase of which both the Population and the IslandPopulation inherit. (And since ContestPopulation inherits from Population, also that inherits from PopulationBase.) The great thing about this is that IslandPopulation no longer has to be a collection of Population objects per se, but can be a collection of PopulationBase objects - making it a collection of any type of population.
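
    A rough sketch of that delegation (not the actual implementation in the pull request):

    from abc import ABC, abstractmethod


    class PopulationBase(ABC):

        @abstractmethod
        def mutate(self, func, **kwargs):
            ...


    class IslandPopulation(PopulationBase):

        def __init__(self, islands):
            self.islands = islands  # a collection of PopulationBase objects

        def mutate(self, func, **kwargs):
            # delegate the operation to every island and return self for chaining
            for island in self.islands:
                island.mutate(func, **kwargs)
            return self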

    The above means that operations like mutate and evaluate can be performed on any object of type PopulationBase. But I've had to make exceptions for split and combine, since these transform a Population into an IslandPopulation and vice versa. For example, calling split on a Population should clearly result in an IslandPopulation, while calling combine on an IslandPopulation should result in a Population (or ContestPopulation if it was a collection of those). Naturally, Population.combine should not exist (or throw an error, or do nothing). And now comes the difficult part:

    Question 2: what should IslandPopulation.split do? Should it

    1. Refuse to work, just like you cannot combine a Population.
    2. Pass the split to each island, returning the same IslandPopulation, containing not Populations but IslandPopulations. E.g. A[1, 2, 3] -> A[1[a, b], 2[c, d], 3[e, f]]
    3. Split itself, such that it returns a new IslandPopulation, containing islands which contain the original Populations. E.g. A[1, 2, 3, 4] -> q[A[1, 2], B[3, 4]]
    4. Allow both options 2 and 3 using for example a flag pushdown or level.
    5. Split the internal populations such that the original IslandPopulation now contains more islands. E.g. A[1, 2] -> A[1a, 1b, 2a, 2b].

    If we choose one of options 2-4 we will have to deal with many layers of sub-islands. This makes functions like combine much more complicated, since we would have to specify which layer to combine (or always push down as far as possible) and verify that that layer exists. Also, if we allow sub-islands, we need to make sure that the depth of each branch stays the same. If someone manipulates an IslandPopulation to look like this: A[1, 2[c, d]], then what do we do if they request the second layer to be combined?

    Next, we need to decide the workings of split. Intuitively I find that split should split the population into several groups, such that each individual ends up in only one of them. I noticed that @koaning expects each individual to end up in each group, effectively duplicating the original Population. Since these are two fundamentally different operations, I propose we make both split() and duplicate() available.

    Question 3: do you agree?

    This does mean that we have to keep track of the original intended_size somehow. For example, suppose I have a Population of size 100. If I duplicate this into four groups, I end up with an IslandPopulation consisting of four islands with 400 individuals in total. However, if I split it into four groups, I get an IslandPopulation consisting of four islands with 25 individuals each, and hence 100 in total. If we recombine these, we want to obtain a Population with intended_size == 100. Of course, when we combine the duplicated islands we will have an initial population of size 400, but this should be reduced back to 100 (either immediately or after the first survive and breed steps).

    For the duplicated islands, it is clear that the intended_size of each island is 100. For the split islands, since their initial size is 25, this is not so clear.

    Question 4: which is the best option:

    1. When using split, the intended size of each island is a fraction of the original (25 in the example), and when we use duplicate the intended size of each island is the same as the original (100 in the example). We keep track of the original intended size of the whole population and use that when we combine.
    2. When using split or duplicate, the intended size of the islands is a fraction of the original (25 in the example). In the case of duplicate, each island will initially be much larger than intended.
    3. Same as 2, but in the case of duplicate, we first use survive to cut the original population back to 25 before we duplicate.
    4. When using split or duplicate, the intended size of the islands is the same as the original (100). In the case of split each island will initially be much smaller than intended.
    5. Same as 4, but in the case of split we call breed immediately after the split such that all populations are of the right size.

    Of course we may allow the user to override the intended size at any point.
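
    To make the difference concrete, the two operations behave roughly like this on plain lists of individuals (an illustration of the semantics, not the proposed API):

    def split(individuals, n_islands):
        # each individual ends up in exactly one island: 100 individuals -> 4 islands of 25
        return [individuals[i::n_islands] for i in range(n_islands)]


    def duplicate(individuals, n_islands):
        # each individual ends up in every island: 100 individuals -> 4 islands of 100
        return [list(individuals) for _ in range(n_islands)]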

    @koaning I'd love to hear your opinion

    opened by rogiervandergeer 13
  • Thoughts on Parallelism

    We already apply some performance tricks with the .evaluate() mechanic, but we may be able to add some form of parallelism/queueing to make things even more performant.

    In terms of an easy win: it seems like .map (and thus mutate) can be run in parallel in general. The same would hold for .evaluate() in the BasePopulation.
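
    As a rough illustration of the idea (not the library's implementation), evaluation could be farmed out to a process pool, assuming each individual exposes chromosome and fitness attributes:

    from concurrent.futures import ProcessPoolExecutor

    def parallel_evaluate(individuals, eval_function, max_workers=4):
        # evaluate each chromosome in a separate process; result order matches input order
        with ProcessPoolExecutor(max_workers=max_workers) as pool:
            scores = list(pool.map(eval_function, [ind.chromosome for ind in individuals]))
        for individual, score in zip(individuals, scores):
            individual.fitness = score
        return individuals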

    Do we want to explore this?

    opened by koaning 11
  • Keeping track of the fittest individual

    In addition to logging (as discussed in #15) and checkpointing (as discussed in #33) I think we need to keep track of the best-performing individual. I think we need to do this separately from logging for two reasons:

    • the best individual is something you always want to find, regardless of whether you want to log,
    • if logging happens at an interval other than once every iteration, there is a chance that you do not log the best-performing individual because it mutates before the logging moment.

    We could implement this by storing a single individual inside the Population. I would suggest making this part of the evaluate call, and always (whether the call is lazy or not) replacing the current best by an individual when it is evaluated and its fitness score is better than the current best.
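
    A bare-bones sketch of that bookkeeping (the historical_best name is only a suggestion, see below):

    from copy import copy

    def update_best(population):
        # after evaluation, keep a running record of the fittest individual seen so far
        # (assumes maximization; the comparison flips for minimization)
        for individual in population:
            if population.historical_best is None or individual.fitness > population.historical_best.fitness:
                population.historical_best = copy(individual)
        return population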

    Naturally the result of this is nonsense for the ContestPopulation, as there the fitness depends on the rest of the population too. In this case it would only make sense to store the entire population together with the fittest individual, otherwise it would be impossible to recompute the result. Of course we cannot store populations inside populations, and this would be a typical case to be solved by logging.

    Even in a normal Population, for stochastic evaluation functions the 'best individual' may of course be a lucky shot; but I think this is not a problem for us to solve.

    Technically speaking, to me it feels like we want to store the best individual in a variable named something like best or historical_best. Currently we have min_individual and max_individual - and I don't recall why we didn't implement a current_best (or best). @koaning do you remember the reason?

    opened by rogiervandergeer 9
  • Cost functions

    Would we want problem instances to be featured in our library as well? It may help people to get started. Say, cost functions for the following (a small sketch of one is shown after the list):

    • easy/hard continuous search spaces
    • basic TSP problems
    • 8-queen-like problems
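
    For the TSP case, for example, a cost function can be as small as this (an illustrative sketch; the distance matrix would be supplied by the user):

    def tsp_cost(tour, distance_matrix):
        # total length of the tour, including the leg back to the starting city
        return sum(distance_matrix[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))
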
    opened by koaning 8
  • Logging

    @rogiervandergeer I am creating this pull request merely to get your attention and to shoot at some things. The point is that I think parts are correct, but a discussion may be helpful now.

    I've stumbled on some things that need discussion along with logging.

    1. I am proposing that we pass an evol.Logger object to the population upon initialisation. If you do not pass anything in, we will use the BaseLogger.
    2. If you call .log() via the population/evolution, the logger that was passed in at initialisation determines what and how to log.
    3. You can see in Travis what the output of a BaseLogger looks like; the current idea is to print to standard out if no file is given, and to log to a file instead if one is given.
    4. The evol.Logger object is split into two verbs: .log() and .handle(). The idea is that the former deals with what to log and the latter with how to handle it and where to put it. This should leave anybody free to write a custom logger, while allowing us to offer a few basic ones that are general enough. We can have a BaseLogger that just logs the population performance, and offer a PerformanceLogger that logs the time difference between two log moments (see the sketch after this list).
    5. I am currently logging the datetime, an id for the population, an id for the individual, the fitness of every individual and the chromosome. Is this enough?
    6. There is a todo at line 202 in population.py -> do we still want to keep track of a population's/individual's age? I think we don't need it; it may be nicer to keep track of a generation via the logging object (i.e. we keep track of how often .log() is called and refer to that as a generation).
    7. We could try to use Python's logging framework here, but I am wondering whether we need it. Opinions?
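
    A very rough sketch of the split between the two verbs (everything beyond the BaseLogger name is illustrative):

    import datetime
    import sys


    class BaseLogger:

        def __init__(self, file=None):
            self.file = file

        def log(self, population):
            # decides *what* to log: one row per individual
            rows = [f'{datetime.datetime.now()},{id(population)},{id(i)},{i.fitness},{i.chromosome}'
                    for i in population]
            self.handle(rows)

        def handle(self, rows):
            # decides *how and where* to log: standard out, or a file if one was given
            if self.file is None:
                print('\n'.join(rows), file=sys.stdout)
            else:
                with open(self.file, 'a') as f:
                    f.write('\n'.join(rows) + '\n')
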
    opened by koaning 7
  • Rebranding it as MapReproduce? ;)

    I know it's kind of arrogant for some random person to hint that a repo should be renamed... but I still think it is a good idea. (At PyData Warsaw I chuckled a few times. Yet, 3 days later, I still think it is worth doing.)

    As it is essentially a functional (map, filter, kind-of-reduce) approach to evolutionary algorithms, MapReproduce would capture its spirit in its name.

    (No, I don't mean renaming any parts of API, at least not for this reason.)

    opened by stared 7
  • Python version

    Currently we only support Python 3.6, since we use f'{string}' syntax. On many platforms Python 3.6 isn't readily available (e.g. the Raspberry Pi, on which I haven't even been able to compile it so far) - don't we want to support 3.4 too?

    opened by rogiervandergeer 7
  • examples won't run, import error

    I am having trouble making sense of this one.

    ➜  evol git:(master) ✗ pip uninstall evol; pip install .
    Uninstalling evol-0.1:
      /anaconda/lib/python3.6/site-packages/evol-0.1-py3.6.egg-info
    Proceed (y/n)? y
      Successfully uninstalled evol-0.1
    Processing /Users/code/Development/evol
    Requirement already satisfied: pytest in /anaconda/lib/python3.6/site-packages (from evol==0.1)
    Requirement already satisfied: py>=1.4.29 in /anaconda/lib/python3.6/site-packages (from pytest->evol==0.1)
    Requirement already satisfied: setuptools in /anaconda/lib/python3.6/site-packages (from pytest->evol==0.1)
    Installing collected packages: evol
      Running setup.py install for evol ... done
    Successfully installed evol-0.1
    ➜  evol git:(master) ✗ python examples/example_pheromone.py
    Traceback (most recent call last):
      File "examples/example_pheromone.py", line 13, in <module>
        from evol import Population, Evolution
      File "/anaconda/lib/python3.6/site-packages/evol/__init__.py", line 2, in <module>
        from .evolution import Evolution
      File "/anaconda/lib/python3.6/site-packages/evol/evolution.py", line 3, in <module>
        from .population import Population
      File "/anaconda/lib/python3.6/site-packages/evol/population.py", line 2, in <module>
        from evol.helpers.utils import select_arguments
      File "/anaconda/lib/python3.6/site-packages/evol/helpers.py", line 6, in <module>
        from evol.population import Population
    ImportError: cannot import name 'Population'
    

    Did you get something similar? I am wondering whether it is my system that is acting up.

    opened by koaning 6
  • how to do a grid search

    Depending on the problem you're dealing with, you may want to supply the population with a predefined list of chromosomes instead of a function that generates a single individual. This way the end user could specify a grid to start from.

    We could accommodate this flow by adding an init_chromosomes parameter to Population.__init__, where we would check that either init_func or init_chromosomes is used.

    Any thoughts for/against this? After thinking it over it seems like the only way to do this in our current API is to first generate a population and next overwrite all the individuals.
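
    For reference, with the chromosomes argument used in the README example above, a grid start would look roughly like this (a sketch, assuming tuple chromosomes and an eval_func defined for this purpose):

    from evol import Population

    def eval_func(xy):
        x, y = xy
        return -x**2 - y**2

    # a predefined grid of starting chromosomes instead of random starting points
    grid = [(x / 10, y / 10) for x in range(-10, 11) for y in range(-10, 11)]
    pop = Population(chromosomes=grid, eval_function=eval_func, maximize=True)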

    opened by koaning 6
  • how to deal with kwargs

    import random

    from evol import Population


    def init_func():
        return 1


    def eval_func(x):
        return x


    def pick_two_random_parents(population):
        return random.choices(population, k=2)


    def pick_n_random_parents(population, n_parents=2):
        return random.choices(population, k=n_parents)


    def combine_two_parents(mom, dad):
        return (mom + dad) / 2


    def general_combiner(*parents):
        return sum(parents) / len(parents)


    pop1 = Population(init_function=init_func, eval_function=eval_func, size=200)
    pop1.survive(n=50).breed(parent_picker=pick_two_random_parents, combiner=combine_two_parents)

    pop2 = Population(init_function=init_func, eval_function=eval_func, size=200)
    pop2.survive(n=50).breed(parent_picker=pick_n_random_parents, combiner=general_combiner)


    What is a nice way to change the n_parents parameter? Something like:

    pop2.survive(n=50).breed(parent_picker=pick_n_random_parents, combiner=general_combiner, n_parents=3)
    
    opened by koaning 6
  • Is it possible to implement elitism?

    I wrote this function and used it as a callback to attempt elitist selection, without success:

    from copy import deepcopy

    best_chromosome = None

    def save_best(population):
        global best_chromosome

        population_best = population.current_best

        if best_chromosome is None or population_best.fitness > best_chromosome.fitness:
            best_chromosome = deepcopy(population_best)
    

    Is there a way to do it?

    opened by ELC 16
  • CI Fix

    I've made the repo flake8-compliant again and I've changed the Makefile. Flake8 raised this warning without the change:

    WARNING: flake8 setuptools integration is deprecated and scheduled for removal in 4.x.  For more information, see https://gitlab.com/pycqa/flake8/issues/544
    
    opened by koaning 0
  • WIP: Change the signature of pick_random

    Parent pickers are no longer passed any kwargs. The pick_random picker must now be initialised before use, with the number of parents passed to it upon initialisation. In addition, pickers must always return a sequence of picked parents - even if there is only one.

    These changes make it much easier to implement more complex picking algorithms, and they also remove the requirement for the select_arguments() decorator, which hurts my eyes.
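
    In other words, a picker is configured first and used afterwards, roughly like this (an illustrative sketch, not the exact code in the pull request):

    import random

    def pick_random(n_parents=2):
        def picker(population):
            # always return a sequence, even when only one parent is picked
            return tuple(random.choice(population) for _ in range(n_parents))
        return picker

    # configured once, then passed along: breed(parent_picker=pick_random(3), combiner=...)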

    opened by rogiervandergeer 7
  • DiceDots: a problem for contest population

    I heard of a cool puzzle.

    Suppose you take a cube and draw 18 dots on it, so that you can make a custom die with a custom number of eyes on each side. You can determine which die beats which other dice, so the question is: what is the most balanced die?

    This might be a fun problem to explain the ContestPopulation with and might deserve to make an appearance as a problem instance.

    opened by koaning 0