Repository for DCA0305, an undergraduate course about Machine Learning Workflows and Pipelines

Related tags

Machine Learningmlops
Overview

Federal University of Rio Grande do Norte

Technology Center

Department of Computer Engineering and Automation

Machine Learning Based Systems Design

References

  • πŸ“š Noah Gift, Alfredo Deza. Practical MLOps: Operationalizing Machine Learning Models [Link]
  • πŸ“š Chip Huyen. Designing Machine Learning Systems: An Iterative Process for Production-Ready Applications. [Link]
  • πŸ“š Hannes Hapke, Catherine Nelson. Building Machine Learning Pipelines. [Link]
  • πŸ“š Mariano Anaya. Clean Code in Python [Link]
  • πŸ“š AurΓ©lien GΓ©ron. Hands on Machine Learning with Scikit-Learn, Keras and TensorFlow. [Link]
  • 🀜 Dataquest Academic Program [Link]
  • πŸ˜ƒ CS329S - ML Systems Design [Link]
  • 🎯 Machine Learning Operations [Link]

Lessons

Week 01: Course Outline Open in PDF

  • Git and Version Control Open in Dataquest
    • You'll learn how to: a) organize your code using version control, b) resolve conflicts in version control, c) employ Git and Github to collaborate with others.
    • πŸ‘Š U1T1: guided project + getting a git repository.

Week 02: CLI fundamentals

  • Elements of the Command Line Open in Dataquest
    • You'll learn how to: a) employ the command line for Data Science, b) modify the behavior of commands with options, c) employ glob patterns and wildcards, d) define Important command line concepts, e) navigate he filesystem, f) manage users and permissions.
  • Text Processing in the Command Line Open in Dataquest
    • You'll learn how to: a) read and explore documentation, b) perform basic text processing, c) redirect and pipe output, d) inspect files, e) define different kinds of output, f) employ streams and file descriptors.
  • πŸ”  U1T2: working with command line.

Week 03 - Clean Code Principles for Data Science and Machine Learning Open in PDF

  • Outline Open in Loom
  • Coding Best Practices Open in Loom
  • Writing Clean Code Open in Loom
  • Refactoring Code Open in Loom
  • Efficient Code Open in Loom
  • Documentation Open in Loom
  • Python Code Quality Authority (PCQA) - pycodestyle Open in Loom
  • PCQA - pylint Open in Loom
  • PCQA - autopep8 Open in Loom
  • PCQA - nbQA Open in Loom
  • ▢️ Hands on
    • πŸ’Ύ Datasets [Link]
    • Writting Clean Code Jupyter
    • Exercise 01 Jupyter
    • Exercise 02 Jupyter
    • Exercise 03 Jupyter
    • Using pycodestyle Jupyter
    • Using pylint - script Python refactored script Python
    • Functions: Advanced - Best practices for writing functions Open in Dataquest

Week 04 Production Ready Code Open in PDF

  • Outline Open in Loom
  • Catching Errors Open in Loom
  • Testing and Data Science Open in Loom
  • A brief introduction about pytest Open in Loom
  • Logging Open in Loom
  • Case study: testing and logging Open in Loom
  • Model Drift Open in Loom
  • Hands on
    • Production ready code Jupyter
    • Data Visualization Fundamentals Open in Dataquest
      • You will learn how to: a) how to use data visualization to explore data and b) how and when to use the most common plots.
    • Storytelling Data Visualization and Information Design Open in Dataquest
      • You will learn how to: a) Create graphs using information design principles, b) create narrative data visualizations using Matplotlib, c) create visual patterns using Gestalt principles, d) control attention using pre-attentive attributes and e) employ Matplotlib's built-in styles.
Owner
Ivanovitch Silva
I'm an experimenter by design, and very interested in technologies related to Data Science & Machine Learning, Vehicles and Complex Networks.
Ivanovitch Silva
A machine learning web application for binary classification using streamlit

Machine Learning web App This is a machine learning web application for binary classification using streamlit options this application contains 3 clas

abdelhak mokri 1 Dec 20, 2021
A Python package to preprocess time series

Disclaimer: This package is WIP. Do not take any APIs for granted. tspreprocess Time series can contain noise, may be sampled under a non fitting rate

Maximilian Christ 57 Dec 17, 2022
using Machine Learning Algorithm to classification AppleStore application

AppleStore-classification-with-Machine-learning-Algo- using Machine Learning Algorithm to classification AppleStore application. the first step : 1: p

Mohammed Hussien 2 May 02, 2022
pure-predict: Machine learning prediction in pure Python

pure-predict speeds up and slims down machine learning prediction applications. It is a foundational tool for serverless inference or small batch prediction with popular machine learning frameworks l

Ibotta 84 Dec 29, 2022
A simple python program which predicts the success of a movie based on it's type, actor, actress and director

Movie-Success-Prediction A simple python program which predicts the success of a movie based on it's type, actor, actress and director. The program us

Mahalinga Prasad R N 1 Dec 17, 2021
inding a method to objectively quantify skill versus chance in games, using reinforcement learning

Skill-vs-chance-games-analysis - Finding a method to objectively quantify skill versus chance in games, using reinforcement learning

Marcus Chiam 4 Nov 19, 2022
Land Cover Classification Random Forest

You can perform Land Cover Classification on Satellite Images using Random Forest and visualize the result using Earthpy package. Make sure to install the required packages and such as

Dr. Sander Ali Khowaja 1 Jan 21, 2022
A high-performance topological machine learning toolbox in Python

giotto-tda is a high-performance topological machine learning toolbox in Python built on top of scikit-learn and is distributed under the G

giotto.ai 632 Dec 29, 2022
Stats, linear algebra and einops for xarray

xarray-einstats Stats, linear algebra and einops for xarray ⚠️ Caution: This project is still in a very early development stage Installation To instal

ArviZ 30 Dec 28, 2022
Used Logistic Regression, Random Forest, and XGBoost to predict the outcome of Search & Destroy games from the Call of Duty World League for the 2018 and 2019 seasons.

Call of Duty World League: Search & Destroy Outcome Predictions Growing up as an avid Call of Duty player, I was always curious about what factors led

Brett Vogelsang 2 Jan 18, 2022
Skoot is a lightweight python library of machine learning transformer classes that interact with scikit-learn and pandas.

Skoot is a lightweight python library of machine learning transformer classes that interact with scikit-learn and pandas. Its objective is to ex

Taylor G Smith 54 Aug 20, 2022
Built on python (Mathematical straight fit line coordinates error predictor machine learning foundational model)

Sum-Square_Error-Business-Analytical-Tool- Built on python (Mathematical straight fit line coordinates error predictor machine learning foundational m

om Podey 1 Dec 03, 2021
Python bindings for MPI

MPI for Python Overview Welcome to MPI for Python. This package provides Python bindings for the Message Passing Interface (MPI) standard. It is imple

MPI for Python 604 Dec 29, 2022
Retrieve annotated intron sequences and classify them as minor (U12-type) or major (U2-type)

(intron I nterrogator and C lassifier) intronIC is a program that can be used to classify intron sequences as minor (U12-type) or major (U2-type), usi

Graham Larue 4 Jul 26, 2022
a distributed deep learning platform

Apache SINGA Distributed deep learning system http://singa.apache.org Quick Start Installation Examples Issues JIRA tickets Code Analysis: Mailing Lis

The Apache Software Foundation 2.7k Jan 05, 2023
Factorization machines in python

Factorization Machines in Python This is a python implementation of Factorization Machines [1]. This uses stochastic gradient descent with adaptive re

Corey Lynch 892 Jan 03, 2023
Responsible AI Workshop: a series of tutorials & walkthroughs to illustrate how put responsible AI into practice

Responsible AI Workshop Responsible innovation is top of mind. As such, the tech industry as well as a growing number of organizations of all kinds in

Microsoft 9 Sep 14, 2022
MICOM is a Python package for metabolic modeling of microbial communities

Welcome MICOM is a Python package for metabolic modeling of microbial communities currently developed in the Gibbons Lab at the Institute for Systems

57 Dec 21, 2022
Kaggle Tweet Sentiment Extraction Competition: 1st place solution (Dark of the Moon team)

Kaggle Tweet Sentiment Extraction Competition: 1st place solution (Dark of the Moon team)

Artsem Zhyvalkouski 64 Nov 30, 2022
In this Repo a simple Sklearn Model will be trained and pushed to MLFlow

SKlearn_to_MLFLow In this Repo a simple Sklearn Model will be trained and pushed to MLFlow Install This Repo is based on poetry python3 -m venv .venv

1 Dec 13, 2021