Image Segmentation with U-Net Algorithm on Carvana Dataset using AWS Sagemaker

Overview

Image Segmentation with U-Net Algorithm on Carvana Dataset using AWS Sagemaker

This is a full project of image segmentation using the model built with U-Net Algorithm on Carvana competition Dataset from Kaggle using Sagemaker as Udacity's ML Nanodegree Capstone Project.

Image Segmentation with U-Net Algorithm

Use AWS Sagemaker to train the model built with U-Net algorithm/architecture that can perform image segmentation on Carvana Dataset from Kaggle Competition.

Project Set Up and Installation

Enter AWS through the gateway and create a Sagemaker notebook instance of your choice, ml.t2.medium is a sweet spot for this project as we will not use the GPU in the notebook and will use the Sagemaker Container to train the model. Wait for the instance to launch and then create a jupyter notebook with conda_pytorch_latest_p36 kernel, this comes preinstalled with the needed modules related to pytorch we will use along the project. Set up your sagemaker roles and regions.

Dataset

We use the Carvana Dataset from Kaggle Competition to use as data for the model training job. To get the Dataset. Register or Login to your Kaggle account, create new api in the user setting and get the api key and put it in the root of your sagemaker environment root location. After that !kaggle competitions download carvana-image-masking-challenge -f train.zip and !kaggle competitions download carvana-image-masking-challenge -f train_masks.zip will download the necessary files to your notebook environment. We will then unzip the data, upload it to S3 bucket with !aws s3 sync command.

Script Files used

  1. hpo.py for hyperparameter tuning jobs where we train the model for multiple time with different hyperparameters and search for the best combination based on loss metrics.
  2. training.py for the final training of the model with the best parameters getting from the previous tuning jobs, and put debug and profiler hooks for debugging purpose and get the tensors emits during training.
  3. inference.py for using the trained model as inference and pre-processing and serializing the data before it passes to the model for segmentaion. Now this can be used locally and user friendly
  4. Note at this time, the sagemaker endpoint has an error and can't make prediction, so I have managed to create a new instance in sagemaker(ml.g4dn.xlarge to utilize the GPU) and used endpoint_local.ipynb notebook to get the inference result.
  5. requirements.txt is use to install the dependencies in the training container, these include Albumentations, higher version of torch dependencies to utilize in the training script.

Hyperparameter Tuning

I used U-Net Algorithm to create an image segmentation model. The hyperparameter searchspaces are learning-rate, number of epochs and batchsize. Note The batch size over 128(inclusive) can't be used as the GPU memory may run out during the training. Deploy a hyperparameter tuning job on sagemaker and wait for the combination of hyperparameters turn out with best metric.

hyperparameter tuning job

We pick the hyperparameters from the best training job to train the final model.

best job's hyperparameters

Debugging and Profiling

The Debugger Hook is set to record the Loss Criterion of the process in both training and validation/testing. The Plot of the Dice Coefficient is shown below.

Dice Coefficient

we can see that the validation plot is high and this means that our model had entered a state of overtraining. We can reduce this by adding dropout or L1 L2 regularization, or added more different training data, or can early stop the model before it overfit. by adding the metric definition, I could also managed to get the average accuracy and loss dat during the validation phase in AWS Cloudwatch(a powerful too to monitor your metrics of any kind). Metrics

Results

Result is pretty good, as I was using ml.g4dn.xlarge to utilize the GPU of the instance, both the hpo jobs and training job did't take too much time.

Inferenceing your data

Sagemaker Endpoint got an 500 status code error so I tried using another sagemaker instance with GPU(ml.g4dn.xlarge) and running the endpoint_local.ipynb will get you the desired output of your choice. Result

Thank You So Much For Your Time! Please don't hesitate to contribute.

Ref: Github repo of neirinzaralwin

Owner
Htin Aung Lu
I am a Machine Learning enginner. I like to work on various machine learning projects. I have more experience on @AWS @Sagemaker platform than other.
Htin Aung Lu
Content shared at DS-OX Meetup

Streamlit-Projects Streamlit projects available in this repo: An introduction to Streamlit presented at DS-OX (Feb 26, 2020) meetup Streamlit 101 - Ja

Arvindra 69 Dec 23, 2022
This repo contains source code and materials for the TEmporally COherent GAN SIGGRAPH project.

TecoGAN This repository contains source code and materials for the TecoGAN project, i.e. code for a TEmporally COherent GAN for video super-resolution

Nils Thuerey 5.2k Jan 02, 2023
Collections for the lasted paper about multi-view clustering methods (papers, codes)

Multi-View Clustering Papers Collections for the lasted paper about multi-view clustering methods (papers, codes). There also exists some repositories

Andrew Guan 10 Sep 20, 2022
PoolFormer: MetaFormer is Actually What You Need for Vision

PoolFormer: MetaFormer is Actually What You Need for Vision (arXiv) This is a PyTorch implementation of PoolFormer proposed by our paper "MetaFormer i

Sea AI Lab 1k Dec 30, 2022
Simulations for Turring patterns on an apically expanding domain. T

Turing patterns on expanding domain Simulations for Turring patterns on an apically expanding domain. The details about the models and numerical imple

Yue Liu 0 Aug 03, 2021
TCNN Temporal convolutional neural network for real-time speech enhancement in the time domain

TCNN Pandey A, Wang D L. TCNN: Temporal convolutional neural network for real-time speech enhancement in the time domain[C]//ICASSP 2019-2019 IEEE Int

凌逆战 16 Dec 30, 2022
Code for testing convergence rates of Lipschitz learning on graphs

📈 LipschitzLearningRates The code in this repository reproduces the experimental results on convergence rates for k-nearest neighbor graph infinity L

2 Dec 20, 2021
NVIDIA Deep Learning Examples for Tensor Cores

NVIDIA Deep Learning Examples for Tensor Cores Introduction This repository provides State-of-the-Art Deep Learning examples that are easy to train an

NVIDIA Corporation 10k Dec 31, 2022
OneFlow is a performance-centered and open-source deep learning framework.

OneFlow OneFlow is a performance-centered and open-source deep learning framework. Latest News Version 0.5.0 is out! First class support for eager exe

OneFlow 4.2k Jan 07, 2023
In this work, we will implement some basic but important algorithm of machine learning step by step.

WoRkS continued English 中文 Français Probability Density Estimation-Non-Parametric Methods(概率密度估计-非参数方法) 1. Kernel / k-Nearest Neighborhood Density Est

liziyu0104 1 Dec 30, 2021
PyTorch 1.5 implementation for paper DECOR-GAN: 3D Shape Detailization by Conditional Refinement.

DECOR-GAN PyTorch 1.5 implementation for paper DECOR-GAN: 3D Shape Detailization by Conditional Refinement, Zhiqin Chen, Vladimir G. Kim, Matthew Fish

Zhiqin Chen 72 Dec 31, 2022
Lua-parser-lark - An out-of-box Lua parser written in Lark

An out-of-box Lua parser written in Lark Such parser handles a relaxed version o

Taine Zhao 2 Jul 19, 2022
Official implementation of VaxNeRF (Voxel-Accelearated NeRF).

VaxNeRF Paper | Google Colab This is the official implementation of VaxNeRF (Voxel-Accelearated NeRF). VaxNeRF provides very fast training and slightl

naruya 132 Nov 21, 2022
Monitor your ML jobs on mobile devices📱, especially for Google Colab / Kaggle

TF Watcher TF Watcher is a simple to use Python package and web app which allows you to monitor 👀 your Machine Learning training or testing process o

Rishit Dagli 54 Nov 01, 2022
A PyTorch implementation of "Capsule Graph Neural Network" (ICLR 2019).

CapsGNN ⠀⠀ A PyTorch implementation of Capsule Graph Neural Network (ICLR 2019). Abstract The high-quality node embeddings learned from the Graph Neur

Benedek Rozemberczki 1.2k Jan 02, 2023
Conformer: Local Features Coupling Global Representations for Visual Recognition

Conformer: Local Features Coupling Global Representations for Visual Recognition (arxiv) This repository is built upon DeiT and timm Usage First, inst

Zhiliang Peng 378 Jan 08, 2023
Dieser Scanner findet Websites, die nicht direkt in Suchmaschinen auftauchen, aber trotzdem erreichbar sind.

Deep Web Scanner Dieses Script findet Websites, die per IPv4-Adresse erreichbar sind und speichert deren Metadaten. Die Ausgabe im Terminal wird nach

Alex K. 30 Nov 18, 2022
Large-scale open domain KNOwledge grounded conVERsation system based on PaddlePaddle

Knover Knover is a toolkit for knowledge grounded dialogue generation based on PaddlePaddle. Knover allows researchers and developers to carry out eff

607 Dec 31, 2022
Boostcamp CV Serving For Python

Boostcamp-CV-Serving Prerequisites MySQL GCP Cloud Storage GCP key file Sentry Streamlit Cloud Secrets: .streamlit/secrets.toml #DO NOT SHARE THIS I

Jungwon Seo 19 Feb 22, 2022
A Deep Reinforcement Learning Framework for Stock Market Trading

DQN-Trading This is a framework based on deep reinforcement learning for stock market trading. This project is the implementation code for the two pap

61 Jan 01, 2023