Free MLOps course from DataTalks.Club

MLOps Zoomcamp

Our MLOps Zoomcamp course

Overview

Objective

Teach practical aspects of productionizing ML services — from collecting requirements to model deployment and monitoring.

Target audience

Data scientists and ML engineers, as well as software engineers and data engineers interested in putting ML into production.

Pre-requisites

  • Python
  • Docker
  • Being comfortable with the command line
  • Prior exposure to machine learning (at work or from other courses, e.g. from ML Zoomcamp)
  • Prior programming experience (at least 1 year)

Timeline

Course start: May 16

Syllabus

This is a draft and will change.

Module 1: Introduction

  • What is MLOps
  • MLOps maturity model
  • Running example: NY Taxi trips dataset
  • Why do we need MLOps
  • Course overview
  • Environment preparation
  • Homework

Module 2: Experiment tracking and model management

  • Experiment tracking intro
  • Getting started with MLflow
  • Experiment tracking with MLflow
  • Saving and loading models with MLflow
  • Model registry
  • MLflow in practice
  • Homework

Module 3: Orchestration and ML Pipelines

  • ML Pipelines: introduction
  • Prefect
  • Turning a notebook into a pipeline
  • Kubeflow Pipelines
  • Homework

Module 4: Model Deployment

  • Batch vs online
  • For online: web services vs streaming
  • Serving models in batch mode
  • Web services
  • Streaming (Kinesis/SQS + AWS Lambda)
  • Homework

Module 5: Model Monitoring

  • ML monitoring vs software monitoring
  • Data quality monitoring
  • Data drift / concept drift
  • Batch vs real-time monitoring
  • Tools: Evidently, Prometheus and Grafana
  • Homework

Module 6: Best Practices

  • DevOps
  • Virtual environments and Docker
  • Python: logging, linting
  • Testing: unit, integration, regression
  • CI/CD (GitHub Actions)
  • Infrastructure as code (Terraform, CloudFormation)
  • Cookiecutter
  • Makefiles
  • Homework

Module 7: Processes

  • CRISP-DM, CRISP-ML
  • ML Canvas
  • Data Landscape canvas
  • MLOps Stack Canvas
  • Documentation practices in ML projects (Model Cards Toolkit)

Project

  • End-to-end project covering all the topics above

Running example

To make it easier to connect different modules together, we’d like to use the same running example throughout the course.

Possible candidates:

Instructors

  • Larysa Visengeriyeva
  • Cristian Martinez
  • Kevin Kho
  • Theofilos Papapanagiotou
  • Alexey Grigorev
  • Emeli Dral
  • Sejal Vaidya

Other courses from DataTalks.Club:

FAQ

I want to start preparing for the course. What can I do?

If you haven't used Flask or Docker

If you have no previous experience with ML

  • Check Module 1 from ML Zoomcamp for an overview
  • Module 3 of ML Zoomcamp will also be helpful if you want to learn Scikit-Learn (we'll use it in this course)
  • We'll also use XGBoost. You don't have to know it well, but if you want to learn more about it, refer to Module 6 of ML Zoomcamp

I registered but haven't received an invite link. Is it normal?

Yes, we haven't automated it. You'll get an email from us eventually, so don't worry.

If you want to make sure you don't miss anything:

Is it going to be live?

No and yes. There will be two parts:

  • Lectures: Pre-recorded, you can watch them when it's convenient for you.
  • Office hours: Live on Mondays (17:00 CET), but recorded, so you can watch later.

Supporters and partners

Thanks to the course sponsors for making it possible to create this course

Thanks to our friends for spreading the word about the course
