Machine Learning Systems Design

Read this booklet here.

This booklet covers four main steps of designing a machine learning system:

Project setup
Data pipeline
Modeling: selecting, training, and debugging
Serving: testing, deploying, and maintaining

It comes with links to practical resources that explain each aspect in more detail. It also suggests case studies written by machine learning engineers at major tech companies who have deployed machine learning systems to solve real-world problems.

At the end, the booklet contains 27 open-ended machine learning systems design questions that might come up in machine learning interviews. The answers to these questions will be published in the book Machine Learning Interviews. You can look at and contribute to community answers to these questions on GitHub here. You can read more about the book and sign up for the book's mailing list here.

Contribute

This is a work in progress, so contributions of any kind are very much appreciated. Here are a few ways you can contribute:

Improve the text by fixing any lexical, grammatical, or technical errors
Add more relevant resources to each aspect of the machine learning project flow
Add/edit questions
Add/edit answers
Other

This book was created using the wonderful magicbook package. For detailed instructions on how to use the package, see its GitHub repo. The package requires Node.js. If you're on a Mac, you can install it with Homebrew:

brew install node

Install magicbook with:

npm install magicbook

Clone this repository:

git clone https://github.com/chiphuyen/machine-learning-systems-design.git
cd machine-learning-systems-design

After you've made changes to the content in the content folder, you can build the booklet with:

magicbook build

You'll find the generated HTML and PDF files in the build folder.
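If the build command fails, a missing prerequisite is a likely cause. The sketch below checks that the tools mentioned above are on your PATH before it builds. The function names missing_tools and build_booklet are hypothetical helpers, not part of this repository or of magicbook.

```shell
#!/bin/sh
# Hypothetical helpers; not part of this repository or of magicbook.

# Print the name of each given command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Build only when every prerequisite is present; otherwise report and fail.
build_booklet() {
    missing=$(missing_tools node npm magicbook)
    if [ -n "$missing" ]; then
        # $missing is left unquoted so the names print on one line.
        echo "Missing prerequisites:" $missing >&2
        return 1
    fi
    magicbook build
}
```

Source the file and call build_booklet from the repository root. Note that a package installed locally with npm places its commands in node_modules/.bin rather than on your PATH, so the check may report magicbook as missing unless it was installed globally.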

Acknowledgment

I'd like to thank Ben Krause for being a great friend and helping me with this draft!

A booklet on machine learning systems design with exercises

Owner

Chip Huyen
