A simple guide to MLOps through ZenML and its various integrations.

Last update: Dec 27, 2022

Overview

ZenBytes

Join our Slack Community and become part of the ZenML family.

Give the main ZenML repo a GitHub star to show your love.

ZenBytes is a series of practical lessons about MLOps through ZenML and its various integrations. It is intended for people looking to learn about MLOps in general, as well as practitioners who specifically want to learn more about ZenML.

🙏 About ZenML

ZenML is an extensible, open-source MLOps framework for creating production-ready machine learning pipelines. Built for data scientists, it has a simple, flexible syntax, is cloud- and tool-agnostic, and offers interfaces and abstractions tailored to ML workflows. The ZenML repository and docs have more details.

ZenML is a good tool for learning MLOps for two reasons:

🔹 ZenML is unopinionated about the underlying tooling and infrastructure across the MLOps stack.
🔹 ZenML presents itself as a pipeline tool, making all development in ZenML data-centric rather than model-centric.

🧱 Structure of Lessons

The lessons are structured in Chapters. Each chapter is a notebook that walks through and explains various concepts:

Chapter 0: Basics
Chapter 1: Building an ML(Ops) pipeline
Chapter 2: Transitioning across stacks
Coming soon: More chapters

💻 System Requirements

To run these lessons, you need a few packages installed on your machine. Currently, the lessons only run on UNIX systems. Only some parts of the codebase require these packages, and you might get away with just Python and pip install -r requirements.txt for the rest, but we recommend installing all of them:

Package	macOS installation	Linux installation
docker	Docker Desktop for Mac	Docker Engine for Linux
kubectl	kubectl for macOS	kubectl for Linux
k3d	Brew installation of k3d	k3d installation for Linux

You might also need to install Anaconda to get the MLflow deployment to work.
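
Before starting, it can help to confirm that the tools in the table are actually on your PATH. The following is a minimal Python sketch, not part of ZenBytes; the function name find_missing_tools is hypothetical, and it only checks that an executable exists, not its version or whether the Docker daemon is running.

```python
import shutil

# Tools listed in the system requirements table above.
REQUIRED_TOOLS = ("docker", "kubectl", "k3d")


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

If anything is reported as missing, install it using the links in the table and run the check again.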

🐍 Python Requirements

Once you've got the system requirements figured out, let's jump into the Python packages you need. Within the Python environment of your choice, run:

git clone https://github.com/zenml-io/zenbytes
cd zenbytes
pip install -r requirements.txt

If you are running the run.py script, you will also need to install some integrations using zenml:

zenml integration install sklearn -f
zenml integration install dash -f
zenml integration install evidently -f
zenml integration install mlflow -f
zenml integration install kubeflow -f
zenml integration install seldon -f
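
If you prefer to script these installs, the commands above can be generated in a loop. This is a small sketch rather than an official ZenML helper: the names build_install_commands and install_integrations are made up for illustration, the integration list is copied from the commands above, and nothing is executed unless you pass dry_run=False.

```python
import subprocess

# Integration names copied from the install commands above.
INTEGRATIONS = ("sklearn", "dash", "evidently", "mlflow", "kubeflow", "seldon")


def build_install_commands(integrations=INTEGRATIONS):
    """Return one `zenml integration install <name> -f` argv list per name."""
    return [["zenml", "integration", "install", name, "-f"] for name in integrations]


def install_integrations(integrations=INTEGRATIONS, dry_run=True):
    """Print each command and run it only when dry_run is False."""
    for command in build_install_commands(integrations):
        print(" ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError on the first failing install.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Dry run by default: only shows what would be executed.
    install_integrations(dry_run=True)
```

Calling install_integrations(dry_run=False) would run the same six commands in order and stop at the first failure.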

📓 Diving into the code

We're ready to go now. You can work through the notebooks step by step:

jupyter notebook

🏁 Cleaning up when you're done

Once you are done running all notebooks, you might want to stop all running processes. To do so, run the following commands. (This will tear down your k3d cluster and the local Docker registry.)

zenml stack set aws_kubeflow_stack
zenml stack down -f
zenml stack set local_kubeflow_stack
zenml stack down -f

❓ FAQ

macOS: When starting the container registry for Kubeflow, I get an error about port 5000 not being available: OSError: [Errno 48] Address already in use

Solution: For Kubeflow to run, the Docker container registry currently needs to be on port 5000. macOS, however, uses port 5000 for the AirPlay Receiver. The guide "Freeing up port 5000" explains how to fix this.
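
To see whether something is already holding port 5000 before starting the registry, a quick check like the one below may help. It is a sketch under assumptions: port_is_free is a hypothetical helper, it tests the loopback address 127.0.0.1 by default, and a successful bind here does not guarantee that the registry itself will start.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can bind to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "Address already in use": another process holds the port.
            return False
    return True


if __name__ == "__main__":
    if port_is_free(5000):
        print("Port 5000 looks free.")
    else:
        print("Port 5000 is in use; free it before starting the registry.")
```

The socket is closed again as soon as the check finishes, so the check itself does not keep the port occupied.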
