A demo of Prometheus+Grafana for monitoring an ML model served with FastAPI.

ml-monitoring

Jeremy Jordan

This repository provides an example setup for monitoring an ML system deployed on Kubernetes.

Blog post: https://www.jeremyjordan.me/ml-monitoring/

Components:

  • ML model served via FastAPI
  • Export server metrics via prometheus-fastapi-instrumentator
  • Simulate production traffic via locust
  • Monitor and store metrics via Prometheus
  • Visualize metrics via Grafana

Setup

  1. Ensure you can connect to a Kubernetes cluster and have kubectl and helm installed.
    • You can spin up a Kubernetes cluster on your local machine using minikube.
minikube start --driver=docker --memory 4g --nodes 2
  2. Deploy Prometheus and Grafana onto the cluster using the community Helm chart. If the prometheus-community chart repository is not yet registered with your Helm client, the helm repo commands below add it.
kubectl create namespace monitoring
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prometheus-stack prometheus-community/kube-prometheus-stack -n monitoring
  3. Verify the resources were deployed successfully.
kubectl get all -n monitoring
  4. Connect to the Grafana dashboard.
kubectl port-forward svc/prometheus-stack-grafana 8000:80 -n monitoring
    • Go to http://127.0.0.1:8000/
    • Log in with the credentials:
      • Username: admin
      • Password: prom-operator
      • (This password can be configured in the Helm chart values.yaml file)
  5. Import the model dashboard.
    • On the left sidebar, click the "+" and select "Import".
    • Copy and paste the JSON defined in dashboards/model.json into the text area.

Deploy a model

This repository includes an example REST service which exposes an ML model trained on the UCI Wine Quality dataset.

You can launch the service on Kubernetes by running:

kubectl apply -f kubernetes/models/

You can also build and run the Docker container locally.

docker build -t wine-quality-model -f model/Dockerfile .
docker run -d -p 3000:80 -e ENABLE_METRICS=true wine-quality-model
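
With metrics enabled, the service exposes its measurements in the Prometheus text exposition format, which is what Prometheus scrapes. The sketch below shows one way to read that format using only the Python standard library. The sample text, the metric and label names, and the /predict handler are illustrative assumptions, not output copied from the wine-quality service, and the parser deliberately ignores optional sample timestamps.

```python
"""Parse Prometheus text-format metrics into (name, labels, value) samples."""
import re

# Illustrative /metrics output. Metric and label names are assumptions,
# not copied from the running wine-quality service.
SAMPLE_METRICS = '''\
# HELP http_requests_total Total number of requests.
# TYPE http_requests_total counter
http_requests_total{handler="/predict",method="POST",status="2xx"} 42.0
http_requests_total{handler="/predict",method="POST",status="4xx"} 3.0
# HELP process_cpu_seconds_total Total user and system CPU time.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 1.5
'''

# A sample line is: metric name, optional {label="value",...} block, value.
SAMPLE_LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL_PAIR = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples, skipping comments."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and HELP/TYPE comments carry no samples
        match = SAMPLE_LINE.match(line)
        if match is None:
            continue  # lines with timestamps or other extras are skipped here
        labels = dict(LABEL_PAIR.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def total_requests(samples, metric="http_requests_total"):
    """Sum one metric across all of its label combinations."""
    return sum(value for name, _labels, value in samples if name == metric)


samples = parse_metrics(SAMPLE_METRICS)
print(total_requests(samples))
```

To try this against the local container, you could fetch the text with urllib.request.urlopen("http://localhost:3000/metrics") and pass the decoded body to parse_metrics. The /metrics path is an assumption about how the service is configured.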

Note: In order for Prometheus to scrape metrics from this service, we need to define a ServiceMonitor resource. This resource must have the label release: prometheus-stack in order to be discovered. This is configured in the Prometheus resource spec via the serviceMonitorSelector attribute.

You can verify the required label by running:

kubectl get prometheuses.monitoring.coreos.com prometheus-stack-kube-prom-prometheus -n monitoring -o yaml
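
As a rough illustration of the note above, the following sketch builds a ServiceMonitor manifest carrying the release: prometheus-stack label and prints it as JSON, which kubectl also accepts. It is not the manifest shipped in kubernetes/models/. The resource name, the app selector label, the port name, the namespace, and the scrape interval are all hypothetical values chosen for the example.

```python
"""Build a ServiceMonitor manifest that the Prometheus stack can discover."""
import json


def build_service_monitor(name, app_label, port_name="http",
                          release="prometheus-stack", namespace="default"):
    """Return a ServiceMonitor as a plain dict.

    The `release` label is what serviceMonitorSelector matches on; every
    other value here is a placeholder for illustration.
    """
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "ServiceMonitor",
        "metadata": {
            "name": name,
            "namespace": namespace,
            # Without this label the Prometheus resource ignores the monitor.
            "labels": {"release": release},
        },
        "spec": {
            # Selects the Kubernetes Service(s) to scrape (assumed label).
            "selector": {"matchLabels": {"app": app_label}},
            # Named Service port and path where metrics are exposed (assumed).
            "endpoints": [
                {"port": port_name, "path": "/metrics", "interval": "15s"}
            ],
        },
    }


manifest = build_service_monitor("wine-quality-model", "wine-quality-model")
print(json.dumps(manifest, indent=2))
```

If saved under a hypothetical name such as make_service_monitor.py, the output could be piped straight to the cluster with python make_service_monitor.py | kubectl apply -f -.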

Simulate production traffic

We can simulate production traffic using a Python load testing tool called locust. This will make HTTP requests to our model server and provide us with data to view in the monitoring dashboard.

You can begin the load test by running:

kubectl apply -f kubernetes/load_tests/

By default, production traffic is simulated for 5 minutes. To change this, update the image arguments in the kubernetes/load_tests/locust_master.yaml manifest.

You can also modify the community Helm chart instead of using the manifests defined in this repo.
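
To give a sense of what each simulated request involves, here is a minimal standard-library sketch that generates a random wine sample and posts it to the model server. It is not the repository's locust code. The /predict path, the field names, and the value ranges are assumptions made for illustration; check the service's API schema before relying on them.

```python
"""Generate random wine samples and send them to a model server."""
import json
import random
import urllib.request

# The 11 physicochemical inputs of the UCI Wine Quality dataset.
# Field spellings and value ranges are illustrative assumptions.
FEATURE_RANGES = {
    "fixed_acidity": (4.0, 16.0),
    "volatile_acidity": (0.1, 1.6),
    "citric_acid": (0.0, 1.0),
    "residual_sugar": (0.5, 20.0),
    "chlorides": (0.01, 0.6),
    "free_sulfur_dioxide": (1.0, 100.0),
    "total_sulfur_dioxide": (5.0, 300.0),
    "density": (0.987, 1.004),
    "pH": (2.7, 4.0),
    "sulphates": (0.2, 2.0),
    "alcohol": (8.0, 15.0),
}


def random_payload(rng):
    """Draw one value per feature, uniformly within its assumed range."""
    return {
        name: round(rng.uniform(low, high), 3)
        for name, (low, high) in FEATURE_RANGES.items()
    }


def send_prediction(base_url, payload, path="/predict", timeout=5.0):
    """POST one JSON payload and return the HTTP status code.

    `path` is a hypothetical endpoint name, not taken from the repository.
    """
    request = urllib.request.Request(
        base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Seeded generator so the printed sample is reproducible.
rng = random.Random(0)
print(json.dumps(random_payload(rng), indent=2))
```

Against the locally running container, calling send_prediction("http://localhost:3000", random_payload(rng)) in a loop would produce a small stream of traffic to watch in the dashboard. A tool like locust adds concurrency, pacing, and reporting on top of this basic idea.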

Uploading new images

This process could eventually be automated with a GitHub Action, but remains manual for now.

  1. Obtain a personal access token to connect with the GitHub container registry.
echo "INSERT_TOKEN_HERE" > ~/.github/cr_token
  2. Authenticate with the GitHub container registry.
cat ~/.github/cr_token | docker login ghcr.io -u jeremyjordan --password-stdin
  3. Build and tag new Docker images.
docker build -t wine-quality-model:0.3 -f model/Dockerfile .
docker tag wine-quality-model:0.3 ghcr.io/jeremyjordan/wine-quality-model:0.3
docker build -t locust-load-test:0.2 -f load_test/Dockerfile .
docker tag locust-load-test:0.2 ghcr.io/jeremyjordan/locust-load-test:0.2
  4. Push the Docker images to the container registry.
docker push ghcr.io/jeremyjordan/wine-quality-model:0.3
docker push ghcr.io/jeremyjordan/locust-load-test:0.2
  5. Update the Kubernetes manifests to use the new image tags.

Teardown instructions

To stop the model REST server, run:

kubectl delete -f kubernetes/models/

To stop the load tests, run:

kubectl delete -f kubernetes/load_tests/

To remove the Prometheus stack, run:

helm uninstall prometheus-stack -n monitoring