Complete U-net Implementation with keras

Overview

U Net Lowered with Keras

Original Paper Link : https://arxiv.org/abs/1505.04597

Special Implementations :


The model follows the architecture of the original paper, except that the number of filters in each layer is reduced to 25% of the original (for example, 16 filters instead of 64 in the first block).
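As a rough illustration of this reduction, the sketch below scales the encoder filter counts of the original U-Net (64 up to 1024) down to 25% and compares the parameter count of a single 3x3 convolution before and after. The helper name `conv2d_params` is hypothetical and not part of this repository.

```python
# Encoder filter counts per resolution level in the original U-Net paper.
ORIGINAL_FILTERS = [64, 128, 256, 512, 1024]

# Keep 25% of the filters at every level.
LOWERED_FILTERS = [f // 4 for f in ORIGINAL_FILTERS]


def conv2d_params(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Trainable parameters of a square Conv2D layer with bias (hypothetical helper)."""
    weights = kernel_size * kernel_size * in_channels * out_channels
    return weights + out_channels  # one bias per output channel


if __name__ == "__main__":
    print(LOWERED_FILTERS)           # [16, 32, 64, 128, 256]
    print(conv2d_params(3, 64, 64))  # 36928, original width
    print(conv2d_params(3, 16, 16))  # 2320, lowered width
```

Because both the input and output channels shrink by a factor of 4, each convolution ends up with roughly 16 times fewer weights.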

Original Model Architecture :

Dataset :


The dataset was taken from Kaggle. Its original directory tree made it hard to build the dataset directly, so I prepared a more usable data directory.

Link : https://www.kaggle.com/azkihimmawan/chest-xray-masks-and-defect-detection

Primary Directory Tree :

.
└── root/
    ├── train_images/
    │   └── id/
    │       ├── images/
    │       │   └── id.png
    │       └── masks/
    │           └── id.png
    └── test_images/
        └── id/
            └── id.png

Given Images :

Image Mask

Supporting Libraries :

NumPy, OpenCV, Matplotlib

Library Versions :

All library versions were up to date as of 14 June 2021.

Dataset Directory Generation :


The data directory was restructured to look like this :

              .
              └── root/
                  ├── train/
                  │   ├── images/
                  │   │   └── id.png
                  │   └── masks/
                  │       └── id.png
                  └── test/
                      └── id.png
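One way to perform this restructuring is sketched below. It is not the author's script: the function name `flatten_dataset` and the choice to copy (rather than move) files are assumptions, and only the folder names are taken from the two directory trees above.

```python
import shutil
from pathlib import Path


def flatten_dataset(src_root: Path, dst_root: Path) -> int:
    """Copy the per-id Kaggle layout into flat train/ and test/ folders.

    Returns the number of files copied.
    """
    # (glob pattern relative to src_root, target folder relative to dst_root)
    rules = [
        ("train_images/*/images/*.png", "train/images"),
        ("train_images/*/masks/*.png", "train/masks"),
        ("test_images/*/*.png", "test"),
    ]
    copied = 0
    for pattern, target in rules:
        out_dir = dst_root / target
        out_dir.mkdir(parents=True, exist_ok=True)
        for src in sorted(src_root.glob(pattern)):
            # Image and mask share the same file name (id.png), so they
            # stay paired after flattening.
            shutil.copy2(src, out_dir / src.name)
            copied += 1
    return copied
```

A call such as `flatten_dataset(Path("root"), Path("data"))` (both paths hypothetical) would leave the original download untouched and write the flat layout next to it.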

Model Architecture ( U-Net Lowered ):

Model: “UNet-Lowered”

Layer (Type)                    Output Shape           Param #  Connected to
input_1 (InputLayer)            [(None, 512, 512, 1)]  0
conv2d (Conv2D)                 (None, 512, 512, 16)   160      input_1[0][0]
conv2d_1 (Conv2D)               (None, 512, 512, 16)   2320     conv2d[0][0]
max_pooling2d (MaxPooling2D)    (None, 256, 256, 16)   0        conv2d_1[0][0]
conv2d_2 (Conv2D)               (None, 256, 256, 32)   4640     max_pooling2d[0][0]
conv2d_3 (Conv2D)               (None, 256, 256, 32)   9248     conv2d_2[0][0]
max_pooling2d_1 (MaxPooling2D)  (None, 128, 128, 32)   0        conv2d_3[0][0]
conv2d_4 (Conv2D)               (None, 128, 128, 64)   18496    max_pooling2d_1[0][0]
conv2d_5 (Conv2D)               (None, 128, 128, 64)   36928    conv2d_4[0][0]
max_pooling2d_2 (MaxPooling2D)  (None, 64, 64, 64)     0        conv2d_5[0][0]
conv2d_6 (Conv2D)               (None, 64, 64, 128)    73856    max_pooling2d_2[0][0]
conv2d_7 (Conv2D)               (None, 64, 64, 128)    147584   conv2d_6[0][0]
dropout (Dropout)               (None, 64, 64, 128)    0        conv2d_7[0][0]
max_pooling2d_3 (MaxPooling2D)  (None, 32, 32, 128)    0        dropout[0][0]
conv2d_8 (Conv2D)               (None, 32, 32, 256)    295168   max_pooling2d_3[0][0]
conv2d_9 (Conv2D)               (None, 32, 32, 256)    590080   conv2d_8[0][0]
dropout_1 (Dropout)             (None, 32, 32, 256)    0        conv2d_9[0][0]
up_sampling2d (UpSampling2D)    (None, 64, 64, 256)    0        dropout_1[0][0]
conv2d_10 (Conv2D)              (None, 64, 64, 128)    131200   up_sampling2d[0][0]
concatenate (Concatenate)       (None, 64, 64, 256)    0        dropout[0][0] & conv2d_10[0][0]
conv2d_11 (Conv2D)              (None, 64, 64, 128)    295040   concatenate[0][0]
conv2d_12 (Conv2D)              (None, 64, 64, 128)    147584   conv2d_11[0][0]
up_sampling2d_1 (UpSampling2D)  (None, 128, 128, 128)  0        conv2d_12[0][0]
conv2d_13 (Conv2D)              (None, 128, 128, 64)   32832    up_sampling2d_1[0][0]
concatenate_1 (Concatenate)     (None, 128, 128, 128)  0        conv2d_5[0][0] & conv2d_13[0][0]
conv2d_14 (Conv2D)              (None, 128, 128, 64)   73792    concatenate_1[0][0]
conv2d_15 (Conv2D)              (None, 128, 128, 64)   36928    conv2d_14[0][0]
up_sampling2d_2 (UpSampling2D)  (None, 256, 256, 64)   0        conv2d_15[0][0]
conv2d_16 (Conv2D)              (None, 256, 256, 32)   8224     up_sampling2d_2[0][0]
concatenate_2 (Concatenate)     (None, 256, 256, 64)   0        conv2d_3[0][0] & conv2d_16[0][0]
conv2d_17 (Conv2D)              (None, 256, 256, 32)   18464    concatenate_2[0][0]
conv2d_18 (Conv2D)              (None, 256, 256, 32)   9248     conv2d_17[0][0]
up_sampling2d_3 (UpSampling2D)  (None, 512, 512, 32)   0        conv2d_18[0][0]
conv2d_19 (Conv2D)              (None, 512, 512, 16)   2064     up_sampling2d_3[0][0]
concatenate_3 (Concatenate)     (None, 512, 512, 32)   0        conv2d_1[0][0] & conv2d_19[0][0]
conv2d_20 (Conv2D)              (None, 512, 512, 16)   4624     concatenate_3[0][0]
conv2d_21 (Conv2D)              (None, 512, 512, 16)   2320     conv2d_20[0][0]
conv2d_22 (Conv2D)              (None, 512, 512, 2)    290      conv2d_21[0][0]
conv2d_23 (Conv2D)              (None, 512, 512, 1)    3        conv2d_22[0][0]
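The layer table can be reproduced with a Keras functional model along the following lines. This is a sketch reconstructed from the table, not the repository's code: the kernel sizes (3x3, and 2x2 after each upsampling) are inferred from the parameter counts, while the ReLU/sigmoid activations and the dropout rate are assumptions the table does not state.

```python
def build_unet_lowered(input_shape=(512, 512, 1), dropout_rate=0.5):
    """Build a U-Net with 16..256 filters, following the layer table.

    Activations and dropout_rate are assumptions, not taken from the source.
    """
    # Imported here so this file can be imported without TensorFlow installed.
    from tensorflow import keras

    layers = keras.layers

    def double_conv(x, filters):
        # Two 3x3 convolutions; "same" padding keeps the spatial size.
        x = layers.Conv2D(filters, 3, activation="relu", padding="same")(x)
        return layers.Conv2D(filters, 3, activation="relu", padding="same")(x)

    inputs = keras.Input(shape=input_shape)
    x, skips = inputs, []

    # Contracting path: 16, 32, 64, 128 filters, each followed by 2x2 pooling.
    for filters in (16, 32, 64, 128):
        x = double_conv(x, filters)
        if filters == 128:
            # The table shows dropout after the deepest encoder block.
            x = layers.Dropout(dropout_rate)(x)
        skips.append(x)
        x = layers.MaxPooling2D(pool_size=2)(x)

    # Bottleneck at 32x32 with 256 filters, followed by dropout.
    x = layers.Dropout(dropout_rate)(double_conv(x, 256))

    # Expanding path: upsample, 2x2 conv, concatenate with the skip, double conv.
    for filters, skip in zip((128, 64, 32, 16), reversed(skips)):
        x = layers.UpSampling2D(size=2)(x)
        x = layers.Conv2D(filters, 2, activation="relu", padding="same")(x)
        x = layers.Concatenate(axis=-1)([skip, x])
        x = double_conv(x, filters)

    # Output head: 3x3 conv to 2 channels, then 1x1 conv to a 1-channel mask.
    x = layers.Conv2D(2, 3, activation="relu", padding="same")(x)
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(x)
    return keras.Model(inputs, outputs, name="UNet-Lowered")
```

With TensorFlow installed, `build_unet_lowered().summary()` should list the same layer sequence, and the parameter counts in the table sum to 1,941,093.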

Data Preparation :

A single channel of both the image and the mask is used for training.
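A minimal NumPy sketch of this step is shown below. It assumes 8-bit inputs, keeps the first channel, and scales values to [0, 1]; which channel the author kept and whether the data was normalised are not stated in the source, and the function name `to_single_channel` is hypothetical.

```python
import numpy as np


def to_single_channel(array: np.ndarray) -> np.ndarray:
    """Return a float32 array of shape (H, W, 1) scaled to [0, 1].

    Accepts an 8-bit image of shape (H, W) or (H, W, C); for multi-channel
    input only the first channel is kept (an assumption).
    """
    if array.ndim == 3:
        array = array[..., 0]
    scaled = array.astype(np.float32) / 255.0
    # Add the trailing channel axis expected by a (512, 512, 1) model input.
    return scaled[..., np.newaxis]
```

The same function can be applied to both images and masks, so that each training pair has matching shapes.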

Hyperparameters :

      Image Shape : (512 , 512 , 1)
      Optimizer : Adam ( Learning Rate : 1e-4 )
      Loss : Binary Cross Entropy 
      Metrics : Accuracy
      Epochs on Training : 100
      Train Validation Ratio : ( 85%-15% )
      Batch Size : 10
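The settings above could be wired up roughly as follows. Both helper names are hypothetical, the shuffling seed is an assumption, and the source does not say how the 85%-15% split was actually drawn.

```python
import random


def train_val_split(ids, val_fraction=0.15, seed=0):
    """Shuffle ids and return (train_ids, val_ids); 85%-15% by default."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    n_val = round(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


def compile_model(model):
    """Apply the optimizer, loss and metric listed above to a Keras model."""
    # Imported lazily so the split helper works without TensorFlow installed.
    from tensorflow import keras

    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=1e-4),
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Training would then be a call such as `model.fit(x_train, y_train, batch_size=10, epochs=100, validation_data=(x_val, y_val))`, where the four arrays are built from the two id lists.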

Model Evaluation Metrics :

Model Performance on Train Data :

Model Performance on Validation Data :

One task left : Will update the tutorial notebooks soon ;)

Conclusion :

The full model overfitted badly on the simplified single-channel images and gave poor accuracy. The reduced structure fits the data better and more efficiently.

STAR the repository if this was helpful :) Also follow me on Kaggle and LinkedIn.

THANK YOU for visiting :)

Owner
Sagnik Roy
Kaggle Expert exploring Computer Vision as no one did!