Simple and flexible ML workflow engine.

Overview

Katana ML Skipper

PyPI - Python GitHub Stars GitHub Issues Current Version

This is a simple and flexible ML workflow engine. It helps to orchestrate events across a set of microservices and create executable flow to handle requests. Engine is designed to be configurable with any microservices. Enjoy!

Skipper

Author

Katana ML, Andrej Baranovskij

Instructions

Start/Stop

Docker Compose

Start:

docker-compose up --build -d

Stop:

docker-compose down

This will start RabbitMQ container. To run engine and services, navigate to related folders and follow instructions.

Web API FastAPI endpoint:

http://127.0.0.1:8080/api/v1/skipper/tasks/docs

Kubernetes

NGINX Ingress Controller:

If you are using local Kubernetes setup, install NGINX Ingress Controller

Build Docker images:

docker-compose -f docker-compose-kubernetes.yml build

Setup Kubernetes services:

./kubectl-setup.sh

Skipper API endpoint published through NGINX Ingress (you can setup your own host in /etc/hosts):

http://kubernetes.docker.internal/api/v1/skipper/tasks/docs

Check NGINX Ingress Controller pod name:

kubectl get pods -n ingress-nginx

Sample response, copy the name of 'Running' pod:

NAME                                       READY   STATUS      RESTARTS   AGE
ingress-nginx-admission-create-dhtcm       0/1     Completed   0          14m
ingress-nginx-admission-patch-x8zvw        0/1     Completed   0          14m
ingress-nginx-controller-fd7bb8d66-tnb9t   1/1     Running     0          14m

NGINX Ingress Controller logs:

kubectl logs -n ingress-nginx -f 
   

   

Skipper API logs:

kubectl logs -n katana-skipper -f -l app=skipper-api

Remove Kubernetes services:

./kubectl-remove.sh

Components

  • api - Web API implementation
  • workflow - workflow logic
  • services - a set of sample microservices, you should replace this with your own services. Update references in docker-compose.yml
  • rabbitmq - service for RabbitMQ broker
  • skipper-lib - reusable Python library to streamline event communication through RabbitMQ
  • logger - logger service

URLs

  • Web API
http://127.0.0.1:8080/api/v1/skipper/tasks/docs

If running on local Kubernetes with Docker Desktop:

http://kubernetes.docker.internal/api/v1/skipper/tasks/docs
  • RabbitMQ:
http://localhost:15672/ (skipper/welcome1)

If running on local Kubernets, make sure port forwarding is enabled:

kubectl -n rabbits port-forward rabbitmq-0 15672:15672
  • PyPI
https://pypi.org/project/skipper-lib/
  • OCI - deployment guide for Oracle Cloud

Usage

You can use Skipper engine to run Web API, workflow and communicate with a group of ML microservices implemented under services package.

Skipper can be deployed to any Cloud vendor with Kubernetes or Docker support. You can scale Skipper runtime on Cloud using Kubernetes commands.

License

Licensed under the Apache License, Version 2.0. Copyright 2020-2021 Katana ML, Andrej Baranovskij. Copy of the license.

Comments
  • Cache EventProducer

    Cache EventProducer

    I found that cache the EventProducer can improve performace 40%. I tried but it block may request when increase the speed test. Do you have suggest to fix that

    opened by manhtd98 7
  • Docker-compose up not working

    Docker-compose up not working

    Hi

    Thank you for the wonderful katana-skipper. I am trying to digest the library and execute the docker-compose.yml. But it seems like it is not working.

    Would appreciate it if you could take a look

    good first issue 
    opened by jamesee 6
  • Doc: How to add a new service with a new queue

    Doc: How to add a new service with a new queue

    How do we add a new service with a new queue called translator?

    1. I add a new router adding a new path for my new service defining a new prefix and tag named translator.
    2. I create a new request model for my new service in models.py containing task_type and expect a type translator and a payload
    3. I define a new service container with the correct variables and set my SERVICE=translator and QUEUE_NAME=skipper_translator

    I am able to call the new endpoint and it returns:

    task_id: "-", 
    task_status: "Success", 
    outcome: "<starlette.responses.JSONResponse object at 0x7ff2672dbed0>"
    

    However the container is never triggered.

    What am I missing?

    opened by ladrua 4
  • The difference between event_producer and exchange_producer

    The difference between event_producer and exchange_producer

    Hello, Thanks for sharing your ML workflow. I appreciate if you could explain the difference between event_producer and exchange_producer. event_producer is used to produce an event to rabbitmq, but exchange_producer is not clear to me. Can't we use event_producer in place of exchange_producer?

    good first issue 
    opened by fadishaar84 4
  • Encountering Authentication Issues

    Encountering Authentication Issues

    When I run the start command on docker I get the following error in the data-service container. Would greatly appreciate guidance on how to fix this issue. ` data-service katanaml/data-service RUNNING

    Traceback (most recent call last):

    File "main.py", line 19, in

    main()
    

    File "main.py", line 15, in main

    'http://127.0.0.1:5001/api/v1/skipper/logger/log_receiver'))
    

    File "/usr/local/lib/python3.7/site-packages/skipper_lib/events/event_receiver.py", line 16, in init

    credentials=credentials))
    

    File "/usr/local/lib/python3.7/site-packages/pika/adapters/blocking_connection.py", line 360, in init

    self._impl = self._create_connection(parameters, _impl_class)
    

    File "/usr/local/lib/python3.7/site-packages/pika/adapters/blocking_connection.py", line 451, in _create_connection

    raise self._reap_last_connection_workflow_error(error)
    

    pika.exceptions.AMQPConnectionError

    Traceback (most recent call last):

    File "main.py", line 19, in

    main()
    

    File "main.py", line 15, in main

    'http://127.0.0.1:5001/api/v1/skipper/logger/log_receiver'))
    

    File "/usr/local/lib/python3.7/site-packages/skipper_lib/events/event_receiver.py", line 16, in init

    credentials=credentials))
    

    File "/usr/local/lib/python3.7/site-packages/pika/adapters/blocking_connection.py", line 360, in init

    self._impl = self._create_connection(parameters, _impl_class)
    

    File "/usr/local/lib/python3.7/site-packages/pika/adapters/blocking_connection.py", line 451, in _create_connection

    raise self._reap_last_connection_workflow_error(error)
    

    pika.exceptions.ProbableAuthenticationError: ConnectionClosedByBroker: (403) 'ACCESS_REFUSED - Login was refused using authentication mechanism PLAIN. For details see the broker logfi`

    opened by LM-01 3
  • How can we move from docker compose to kubernetes?

    How can we move from docker compose to kubernetes?

    Hello Andrej, I would like to ask about how to move from docker-compose to Kubernetes, do we have to use some tools like kompose or other tools, I appreciate if you could guide me a little bit about how to perform this conversion to run our services on Skipper not using docker compose but kubernetes. Thank you.

    opened by fadishaar84 2
Releases(v1.1.0)
  • v1.1.0(Dec 11, 2021)

    This release of Katana ML Skipper includes:

    • Skipper Lib JS - support for Node.js containers
    • Error handling
    • Configurable FastAPI endpoints
    • Various improvements and bug fixes

    What's Changed

    • (README.md) Adding Andrej's profile url by @xandrade in https://github.com/katanaml/katana-skipper/pull/3

    New Contributors

    • @xandrade made their first contribution in https://github.com/katanaml/katana-skipper/pull/3

    Full Changelog: https://github.com/katanaml/katana-skipper/compare/v1.0.0...v1.1.0

    Source code(tar.gz)
    Source code(zip)
  • v1.0.0(Oct 9, 2021)

    First production release of Katana ML Skipper.

    Included:

    • Logger
    • Workflow
    • API async and sync
    • Services
    • Docker support
    • Kubernetes support
    • Tested on OCI Cloud

    Full Changelog: https://github.com/katanaml/katana-skipper/commits/v1.0.0

    Source code(tar.gz)
    Source code(zip)
Owner
Katana ML
Machine Learning for Business Automation
Katana ML
This is the code repository for LRM Stochastic watershed model.

LRM-Squannacook Input data for generating stochastic streamflows are observed and simulated timeseries of streamflow. their format needs to be CSV wit

1 Feb 14, 2022
Machine Learning approach for quantifying detector distortion fields

DistortionML Machine Learning approach for quantifying detector distortion fields. This project is a feasibility study for training a surrogate model

Joel Bernier 1 Nov 05, 2021
[DEPRECATED] Tensorflow wrapper for DataFrames on Apache Spark

TensorFrames (Deprecated) Note: TensorFrames is deprecated. You can use pandas UDF instead. Experimental TensorFlow binding for Scala and Apache Spark

Databricks 757 Dec 31, 2022
A Time Series Library for Apache Spark

Flint: A Time Series Library for Apache Spark The ability to analyze time series data at scale is critical for the success of finance and IoT applicat

Two Sigma 970 Jan 04, 2023
Projeto: Machine Learning: Linguagens de Programacao 2004-2001

Projeto: Machine Learning: Linguagens de Programacao 2004-2001 Projeto de Data Science e Machine Learning de análise de linguagens de programação de 2

Victor Hugo Negrisoli 0 Jun 29, 2021
Tutorials, examples, collections, and everything else that falls into the categories: pattern classification, machine learning, and data mining

**Tutorials, examples, collections, and everything else that falls into the categories: pattern classification, machine learning, and data mining.** S

Sebastian Raschka 4k Dec 30, 2022
TensorFlow implementation of an arbitrary order Factorization Machine

This is a TensorFlow implementation of an arbitrary order (=2) Factorization Machine based on paper Factorization Machines with libFM. It supports: d

Mikhail Trofimov 785 Dec 21, 2022
SIMD-accelerated bitwise hamming distance Python module for hexidecimal strings

hexhamming What does it do? This module performs a fast bitwise hamming distance of two hexadecimal strings. This looks like: DEADBEEF = 1101111010101

Michael Recachinas 12 Oct 14, 2022
Simple, fast, and parallelized symbolic regression in Python/Julia via regularized evolution and simulated annealing

Parallelized symbolic regression built on Julia, and interfaced by Python. Uses regularized evolution, simulated annealing, and gradient-free optimization.

Miles Cranmer 924 Jan 03, 2023
Extreme Learning Machine implementation in Python

Python-ELM v0.3 --- ARCHIVED March 2021 --- This is an implementation of the Extreme Learning Machine [1][2] in Python, based on scikit-learn. From

David C. Lambert 511 Dec 20, 2022
A unified framework for machine learning with time series

Welcome to sktime A unified framework for machine learning with time series We provide specialized time series algorithms and scikit-learn compatible

The Alan Turing Institute 6k Jan 06, 2023
Dive into Machine Learning

Dive into Machine Learning Hi there! You might find this guide helpful if: You know Python or you're learning it 🐍 You're new to Machine Learning You

Michael Floering 11.1k Jan 03, 2023
Merlion: A Machine Learning Framework for Time Series Intelligence

Merlion is a Python library for time series intelligence. It provides an end-to-end machine learning framework that includes loading and transforming data, building and training models, post-processi

Salesforce 2.8k Jan 05, 2023
TorchDrug is a PyTorch-based machine learning toolbox designed for drug discovery

A powerful and flexible machine learning platform for drug discovery

MilaGraph 1.1k Jan 08, 2023
Falken provides developers with a service that allows them to train AI that can play their games

Falken provides developers with a service that allows them to train AI that can play their games. Unlike traditional RL frameworks that learn through rewards or batches of offline training, Falken is

Google Research 223 Jan 03, 2023
Learning --> Numpy January 2022 - winter'22

Numerical-Python Numpy NumPy is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along

Shahzaneer Ahmed 0 Mar 12, 2022
Python based GBDT implementation

Py-boost: a research tool for exploring GBDTs Modern gradient boosting toolkits are very complex and are written in low-level programming languages. A

Sberbank AI Lab 20 Sep 21, 2022
customer churn prediction prevention in telecom industry using machine learning and survival analysis

Telco Customer Churn Prediction - Plotly Dash Application Description This dash application allows you to predict telco customer churn using machine l

Benaissa Mohamed Fayçal 3 Nov 20, 2021
Bayesian Additive Regression Trees For Python

BartPy Introduction BartPy is a pure python implementation of the Bayesian additive regressions trees model of Chipman et al [1]. Reasons to use BART

187 Dec 16, 2022
Azure MLOps (v2) solution accelerators.

Azure MLOps (v2) solution accelerator Welcome to the MLOps (v2) solution accelerator repository! This project is intended to serve as the starting poi

Microsoft Azure 233 Jan 01, 2023