Repository for the Demo of using DVC with PyCaret & MLOps (DVC Office Hours - 20th Jan, 2022)

Overview

Using DVC with PyCaret & FastAPI (Demo)

This repo contains all the resources for my demo explaining how to use DVC along with other interesting tools & frameworks like PyCaret & FastAPI for data & model versioning, experimentation with ML models & finally deploying these models quickly for inferencing.

This demo was presented at the DVC Office Hours on 20th Jan 2022.

Note: We will use Azure Blob Storage as our remote storage for this demo. To follow along, it is advised to either create an Azure account or use a different remote for storage.


Steps Followed for the Demo

0. Preliminaries

Create a virtual environment named dvc-demo & install required packages

python3 -m venv dvc-demo
source dvc-demo/bin/activate

pip install dvc[azure] pycaret fastapi uvicorn python-multipart

Initialize the repo with DVC tracking & create a data/ folder

mkdir dvc-pycaret-fastapi-demo
cd dvc-pycaret-fastapi-demo
git init
dvc init

git remote add origin https://github.com/tezansahu/dvc-pycaret-fastapi-demo.git

mkdir data

1. Tracking Data with DVC

We use the Heart Failure Prediction Dataset for this demo.

First, we download the heart.csv file & retain ~800 rows from this file in the data/ folder. (We will use the file with all the rows later - this is to simulate the change/increase in data that an ML workflow sees during its lifetime)

Track this data/heart.csv using DVC

dvc add data/heart.csv
git add data/heart.csv.dvc
git commit -m "add data - phase 1"

2. Setup the Remote for Storing Tracked Data & Models

  • Go to the Azure Portal & create a Storage Account (here, we name it dvcdemo) Creating a Storage Account on Azure

  • Within the storage account, create a Container (here, we name it demo20jan2022)

  • Obtain the Connection String from the storage account as follows: Obtaining the Connection String for a Storage Account on Azure

  • Install the Azure CLI from here & log into Azure from within the terminal using az login

Now, we store the tracked data in Azure:

dvc remote add -d storage azure://demo20jan2022/dvcstore
dvc remote modify --local storage connection_string <connection-string>

dvc push
git push origin main

3. ML Experimentation with PyCaret

Create the notebooks/ folders using mkdir notebook & download the notebooks/experimentation_with_pycaret.ipynb notebook from this repo into this notebooks/ folder.

Track this notebook with Git:

git add notebooks/
git commit -m "add ml training notebook"

Run all the cells mentioned under Phase 1 in the notebook. This involves basics of PyCaret:

  • Setting up a vanilla experiment with setup()
  • Comparing various classification models with compare_models()
  • Evaluating the preformance a model with evaluate_model()
  • Making predictions on the held-out eval data using predict_model()
  • Finalizing the model by training on the full training + eval data using finalize_model()
  • Saving the model pipeline using save_model()

This will create a model.pkl file in the models/ folder

4. Tracking Models with DVC

Now, we track the ML model using DVC & store it in our remote storage

dvc add models/model.pkl
git add models/model.pkl.dvc
git commit -m "add model - phase 1"

dvc push
git push origin main

5. Deploy the Model with FastAPI

First, delete the .dvc/cache/ & models/model.pkl (simulate production env). Then, pull the changes from the DVC remote storage.

dvc pull

Check that the model.pkl file is now present in models/ folder.

Now, create a server/ folder & place the main.py file in it after downloaidng the server/main.py file from this repo. This RESTful API server has 2 POST endpoints:

  • Inferencing on an individual record
  • Batch inferencing on a CSV file

We commit this to our repo:

git add server/
git commit -m "create basic fastapi server"

Now, we can run our local server on port 8000

cd server
uvicorn main:app --port=8000

Go to http://localhost:8000/docs & play with the endpoints present in the interactive documentation.

Swagger Interactive API Documentation for our Server

For the individual inference, you could use teh following data:

{
  "Age": 61,
  "Sex": "M",
  "ChestPainType": "ASY",
  "RestingBP": 148,
  "Cholesterol": 203,
  "FastingBS": 0,
  "RestingECG": "Normal",
  "MaxHR": 161,
  "ExerciseAngina": "N",
  "Oldpeak": 0,
  "ST_Slope": "Up"
}

6. Simulating the arrival of New Data

Now, we use the full heart.csv file to simulate the arrival of new data with time. We place it within data/ folder & upload it to DVC remote.

dvc add data/heart.csv
git add data/heart.csv.dvc
git commit -m "add data - phase 2"

dvc push
git push origin main

7. More Experimentation with PyCaret

Now, we run the experiment in Phase 2 of the notebooks/experimentation_with_pycaret.ipynb notebook. This involves:

  • Feature engineering while setting up teh experient
  • Fine-tuning of models with tune_model()
  • Creating an ensemble of models with blend_models()

The blended model is saved as models/modl.pkl

We upload it to our DVC remote.

dvc add models/model.pkl
git add models/model.pkl.dvc
git commit -m "add model - phase 2"

dvc push
git push origin main

8. Redeploying the New Model using FastAPI

Now, we again start the server (no code changes required, because the model file has same name) & perform inference.

cd server
uvicorn main:app --port=8000

With this, we demonstrate how DVC can be used in conjunction with PyCaret & FastAPI for iterating & experimenting efficiently with ML models & deploying them with minimal effort.


Additional Resources


Created with ❤️ by Tezan Sahu

Owner
Tezan Sahu
Data & Applied Scientist at Microsoft with a keen interest in NLP, Deep Learning, Blockchain Technologies & Data Analytics.
Tezan Sahu
Flask-vs-FastAPI - Understanding Flask vs FastAPI Web Framework. A comparison of two different RestAPI frameworks.

Flask-vs-FastAPI Understanding Flask vs FastAPI Web Framework. A comparison of two different RestAPI frameworks. IntroductionIn Flask is a popular mic

Mithlesh Navlakhe 1 Jan 01, 2022
Instrument your FastAPI app

Prometheus FastAPI Instrumentator A configurable and modular Prometheus Instrumentator for your FastAPI. Install prometheus-fastapi-instrumentator fro

Tim Schwenke 441 Jan 05, 2023
Simple notes app backend using Python's FastAPI framework.

my-notes-app Simple notes app backend using Python's FastAPI framework. Route "/": User login (GET): return 200, list of all of their notes; User sign

José Gabriel Mourão Bezerra 2 Sep 17, 2022
Basic fastapi blockchain - An api based blockchain with full functionality

Basic fastapi blockchain - An api based blockchain with full functionality

1 Nov 27, 2021
fastapi-cache is a tool to cache fastapi response and function result, with backends support redis and memcached.

fastapi-cache Introduction fastapi-cache is a tool to cache fastapi response and function result, with backends support redis, memcache, and dynamodb.

long2ice 551 Jan 08, 2023
cookiecutter template for web API with python

Python project template for Web API with cookiecutter What's this This provides the project template including minimum test/lint/typechecking package

Hitoshi Manabe 4 Jan 28, 2021
Keepalive - Discord Bot to keep threads from expiring

keepalive Discord Bot to keep threads from expiring Installation Create a new Di

Francesco Pierfederici 5 Mar 14, 2022
Simple example of FastAPI + Celery + Triton for benchmarking

You can see the previous work from: https://github.com/Curt-Park/producer-consumer-fastapi-celery https://github.com/Curt-Park/triton-inference-server

Jinwoo Park (Curt) 37 Dec 29, 2022
fastapi-mqtt is extension for MQTT protocol

fastapi-mqtt MQTT is a lightweight publish/subscribe messaging protocol designed for M2M (machine to machine) telemetry in low bandwidth environments.

Sabuhi 144 Dec 28, 2022
FastAPI IPyKernel Sandbox

FastAPI IPyKernel Sandbox This repository is a light-weight FastAPI project that is meant to provide a wrapper around IPyKernel interactions. It is in

Nick Wold 2 Oct 25, 2021
FastAPI Socket.io with first-class documentation using AsyncAPI

fastapi-sio Socket.io FastAPI integration library with first-class documentation using AsyncAPI The usage of the library is very familiar to the exper

Marián Hlaváč 9 Jan 02, 2023
API Simples com python utilizando a biblioteca FastApi

api-fastapi-python API Simples com python utilizando a biblioteca FastApi Para rodar esse script são necessárias duas bibliotecas: Fastapi: Comando de

Leonardo Grava 0 Apr 29, 2022
Lazy package to start your project using FastAPI✨

Fastapi-lazy 🦥 Utilities that you use in various projects made in FastAPI. Source Code: https://github.com/yezz123/fastapi-lazy Install the project:

Yasser Tahiri 95 Dec 29, 2022
FastAPI pagination

FastAPI Pagination Installation # Basic version pip install fastapi-pagination # All available integrations pip install fastapi-pagination[all] Avail

Yurii Karabas 561 Jan 07, 2023
FastAPI-Amis-Admin is a high-performance, efficient and easily extensible FastAPI admin framework. Inspired by django-admin, and has as many powerful functions as django-admin.

简体中文 | English 项目介绍 FastAPI-Amis-Admin fastapi-amis-admin是一个拥有高性能,高效率,易拓展的fastapi管理后台框架. 启发自Django-Admin,并且拥有不逊色于Django-Admin的强大功能. 源码 · 在线演示 · 文档 · 文

AmisAdmin 318 Dec 31, 2022
FastAPI CRUD template using Deta Base

Deta Base FastAPI CRUD FastAPI CRUD template using Deta Base Setup Install the requirements for the CRUD: pip3 install -r requirements.txt Add your D

Sebastian Ponce 2 Dec 15, 2021
Sample-fastapi - A sample app using Fastapi that you can deploy on App Platform

Getting Started We provide a sample app using Fastapi that you can deploy on App

Erhan BÜTE 2 Jan 17, 2022
Learn to deploy a FastAPI application into production DigitalOcean App Platform

Learn to deploy a FastAPI application into production DigitalOcean App Platform. This is a microservice for our Try Django 3.2 project. The goal is to extract any and all text from images using a tec

Coding For Entrepreneurs 59 Nov 29, 2022
Prometheus exporter for Starlette and FastAPI

starlette_exporter Prometheus exporter for Starlette and FastAPI. The middleware collects basic metrics: Counter: starlette_requests_total Histogram:

Steve Hillier 225 Jan 05, 2023
Dead simple CSRF security middleware for Starlette ⭐ and Fast API ⚡

csrf-starlette-fastapi Dead simple CSRF security middleware for Starlette ⭐ and Fast API ⚡ Will work with either a input type="hidden" field or ajax

Nathaniel Sabanski 9 Nov 20, 2022