Unet

this is an implemetion of Unet in Pytorch and it's architecture is as follows which is the same with paper of Unet

component of Unet

Convolution and downsampling

two layers of convolution which consists of BatchNorm and Relu and then downsampling

class TwoConv(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(TwoConv, self).__init__()
        self.twoconv = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size=3),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_channels, out_channels, kernel_size=3),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.twoconv(x)

class TwoConvDown(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(TwoConvDown, self).__init__()
        self.twoconvdown = nn.Sequential(
            nn.MaxPool2d(2),
            TwoConv(in_channels, out_channels),
        )

    def forward(self, x):
        return self.twoconvdown(x)

Upsampling and Concatation

there are two modes, "pad" and "crop" to deal with the different size of two parts in Unet. "crop" is the same operation with paper of Unet.

class UpCat(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(UpCat, self).__init__()
        self.up = nn.ConvTranspose2d(in_channels, in_channels // 2, kernel_size=2, stride=2)
        self.twoconv = TwoConv(in_channels=in_channels, out_channels=out_channels)

    def forward(self, x1, x2, mode="pad"):
        '''
        :param x1: Unet right part, size is samller
        :param x2: Unet left part，size is larger
        :return:
        '''
        x1 = self.up(x1)
        diffY = x2.size()[2] - x1.size()[2]
        diffX = x2.size()[3] - x1.size()[3]

        if mode == "pad":
            x1 = nn.functional.pad(x1, (diffX // 2, diffX - diffX // 2, diffY // 2, diffY - diffY // 2))
        elif mode == "crop":
            left = diffX // 2
            right = diffX - left
            up = diffY // 2
            down = diffY - up

            x2 = x2[:, :, left:x2.size()[2]-right, up:x2.size()[3]-down]

        x = torch.cat([x2, x1], dim=1)
        x = self.twoconv(x)
        return x

main part of Unet

class Unet(nn.Module):
    def __init__(self, in_channels,
                 channel_list: list = [64, 128, 256, 512, 1024],
                 length = 5,
                 mode = "pad")

This is a file about Unet implemented in Pytorch

Related tags

Overview

Unet

component of Unet

Convolution and downsampling

Upsampling and Concatation

main part of Unet

Owner

Dragon

Code for Overinterpretation paper Overinterpretation reveals image classification model pathologies

VACA: Designing Variational Graph Autoencoders for Interventional and Counterfactual Queries

Paper: Cross-View Kernel Similarity Metric Learning Using Pairwise Constraints for Person Re-identification

For medical image segmentation

NL-Augmenter 🦎 → 🐍 A Collaborative Repository of Natural Language Transformations

AFLNet: A Greybox Fuzzer for Network Protocols

Official implementation for CVPR 2021 paper: Adaptive Class Suppression Loss for Long-Tail Object Detection

Official PyTorch Implementation of Convolutional Hough Matching Networks, CVPR 2021 (oral)

Code and Datasets from the paper "Self-supervised contrastive learning for volcanic unrest detection from InSAR data"

CLIP (Contrastive Language–Image Pre-training) trained on Indonesian data

The code for the CVPR 2021 paper Neural Deformation Graphs, a novel approach for globally-consistent deformation tracking and 3D reconstruction of non-rigid objects.

[TOG 2021] PyTorch implementation for the paper: SofGAN: A Portrait Image Generator with Dynamic Styling.

zeus is a Python implementation of the Ensemble Slice Sampling method.

Measures input lag without dedicated hardware, performing motion detection on recorded or live video

IA for recognising Traffic Signs using Keras [Tensorflow]

[Preprint] "Chasing Sparsity in Vision Transformers: An End-to-End Exploration" by Tianlong Chen, Yu Cheng, Zhe Gan, Lu Yuan, Lei Zhang, Zhangyang Wang

Code for CVPR 2018 paper --- Texture Mapping for 3D Reconstruction with RGB-D Sensor

Eff video representation - Efficient video representation through neural fields

Code associated with the paper "Deep Optics for Single-shot High-dynamic-range Imaging"

This project deploys a yolo fastest model in the form of tflite on raspberry 3b+. The model is from another repository of mine called -Trash-Classification-Car