SemTorch

Overview

This repository contains definitions of several deep learning architectures that can be applied to image segmentation.

All the architectures are implemented in PyTorch and can be trained easily with FastAI 2.

The Deep-Tumour-Spheroid repository contains an example of how to apply the package to a custom dataset; in that case, brain tumour images are used.

These architectures are classified as:

  • Semantic Segmentation: each pixel of an image is linked to a class label.
  • Instance Segmentation: similar to semantic segmentation, but it goes a step further and identifies, for each pixel, the object instance it belongs to.
  • Salient Object Detection (binary classes only): detection of the most noticeable or important object in an image.
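The difference between the three task types is easiest to see in the shape of their outputs. The following is a minimal sketch on a made-up 3x4 image, using plain Python lists; the masks and the `unique_labels` helper are illustrative only and are not part of SemTorch.

```python
# Toy 3x4 "image" with two objects of the same class (class 1) on background (0).
# All values are made up for illustration.

# Semantic segmentation: one class label per pixel.
semantic_mask = [
    [0, 1, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 0, 0],
]

# Instance segmentation: one instance id per pixel (0 = background),
# so the two objects of class 1 are told apart.
instance_mask = [
    [0, 1, 0, 2],
    [0, 1, 0, 2],
    [0, 0, 0, 0],
]

# Salient object detection: a binary mask marking only the most noticeable object.
salient_mask = [
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]


def unique_labels(mask):
    """Return the sorted set of values that appear in a 2D mask."""
    return sorted({value for row in mask for value in row})


print(unique_labels(semantic_mask))  # [0, 1]
print(unique_labels(instance_mask))  # [0, 1, 2]
print(unique_labels(salient_mask))   # [0, 1]
```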

🚀 Getting Started

To start using this package, install it with pip. For example, on Ubuntu:

pip3 install SemTorch
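After installing, it can be useful to confirm that the package is visible to the interpreter you intend to use. This is a small sketch using only the standard library; the `is_installed` helper is hypothetical, and `semtorch` is the import name used in the usage example of this README.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found by the import system."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "semtorch" is the import name shown in the usage section.
    print("semtorch installed:", is_installed("semtorch"))
```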

👩‍💻 Usage

This package provides an abstract API for building segmentation models with different architectures. The get_segmentation_learner function returns a FastAI 2 learner that can be combined with all of fastai's functionality.
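The usage example in this section expects a `dls` object to exist already. As a rough sketch of one way to build it with fastai's `SegmentationDataLoaders.from_label_func`, assuming a hypothetical folder layout of `root/images/<name>.png` and `root/masks/<name>.png`; the helper names, class codes, image size and batch size are all assumptions, not values taken from SemTorch.

```python
from pathlib import Path


def mask_path_for(image_path, masks_dir="masks"):
    """Map <root>/images/<name> to <root>/masks/<name> (hypothetical layout)."""
    image_path = Path(image_path)
    return image_path.parent.parent / masks_dir / image_path.name


def build_dataloaders(root, codes=("background", "tumour"), image_size=224, batch_size=4):
    """Build fastai DataLoaders for segmentation from root/images and root/masks."""
    # Imported lazily so that the path helper above works without fastai installed.
    from fastai.vision.all import Resize, SegmentationDataLoaders, get_image_files

    root = Path(root)
    return SegmentationDataLoaders.from_label_func(
        root,
        fnames=get_image_files(root / "images"),
        label_func=mask_path_for,      # returns the mask file for each image file
        codes=list(codes),             # class names, indexed by mask pixel value
        item_tfms=Resize(image_size),  # resize every image and mask to the same size
        valid_pct=0.2,                 # hold out 20% of the files for validation
        bs=batch_size,
    )
```

With such a helper, `dls = build_dataloaders("path/to/data")` would produce the object passed as `dls` to get_segmentation_learner.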

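# SemTorch
from fastai.vision.all import Dice, JaccardCoeff
from semtorch import get_segmentation_learner

# dls (DataLoaders), tumour (custom metric) and segmentron_splitter (splitter function)
# are assumed to be defined earlier by the user.
learn = get_segmentation_learner(dls=dls, number_classes=2, segmentation_type="Semantic Segmentation",
                                 architecture_name="deeplabv3+", backbone_name="resnet50",
                                 metrics=[tumour, Dice(), JaccardCoeff()], wd=1e-2,
                                 splitter=segmentron_splitter).to_fp16()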

A more complete example is available in the Deep-Tumour-Spheroid repository, where the package is used to segment brain tumours.
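Because the returned object is a regular fastai learner, the usual fastai training calls apply to it. The sketch below shows one common pattern (train with the backbone frozen, then fine-tune everything with lower learning rates). The `train_two_stage` helper, the epoch counts and the learning rates are assumptions for illustration, not settings recommended by SemTorch; how `freeze` behaves depends on the splitter passed to the learner.

```python
def train_two_stage(learn, frozen_epochs=5, unfrozen_epochs=5, lr=1e-3):
    """Train the last parameter group first, then fine-tune the whole model."""
    learn.freeze()    # only the last parameter group is trainable
    learn.fit_one_cycle(frozen_epochs, lr)
    learn.unfreeze()  # make every parameter group trainable
    # Discriminative learning rates: earlier groups get smaller rates.
    learn.fit_one_cycle(unfrozen_epochs, slice(lr / 100, lr / 10))
    return learn
```

For example, `train_two_stage(learn)` could be called on the learner created in the usage snippet above.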

def get_segmentation_learner(dls, number_classes, segmentation_type, architecture_name, backbone_name,
                             loss_func=None, opt_func=Adam, lr=defaults.lr, splitter=trainable_params, 
                             cbs=None, pretrained=True, normalize=True, image_size=None, metrics=None, 
                             path=None, model_dir='models', wd=None, wd_bn_bias=False, train_bn=True,
                             moms=(0.95,0.85,0.95)):

This function returns a learner for the provided architecture and backbone.

Parameters:

  • dls (DataLoaders): the fastai DataLoaders to use with the learner
  • number_classes (int): the number of classes in the project. It should be >= 2
  • segmentation_type (str): only "Semantic Segmentation" is accepted for now
  • architecture_name (str): name of the architecture. The following ones are supported: unet, deeplabv3+, hrnet, maskrcnn and u2^net
  • backbone_name (str): name of the backbone
  • loss_func (): loss function
  • opt_func (): optimizer function
  • lr (): learning rate
  • splitter (): splitter function used for freezing the learner
  • cbs (List[cb]): list of callbacks
  • pretrained (bool): whether a pretrained backbone is used
  • normalize (bool): whether normalization is applied
  • image_size (int): REQUIRED for MaskRCNN. It indicates the desired size of the image
  • metrics (List[metric]): list of metrics
  • path (): path parameter
  • model_dir (str): the path in which models are saved
  • wd (float): weight decay
  • wd_bn_bias (bool): whether weight decay is also applied to batch-norm and bias parameters (as in the fastai Learner)
  • train_bn (bool): whether batch-norm layers are trained even in frozen layer groups (as in the fastai Learner)
  • moms (Tuple(float)): tuple of momentum values

Returns:

  • learner: value containing the learner object
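The parameter list above states a few constraints (at least two classes, only one accepted segmentation type, a fixed set of architectures, and `image_size` being required for maskrcnn). A minimal sketch of checking them before calling get_segmentation_learner could look as follows; `check_learner_args` is a hypothetical helper, not part of the SemTorch API, and SemTorch may validate its arguments differently.

```python
SUPPORTED_ARCHITECTURES = ("unet", "deeplabv3+", "hrnet", "maskrcnn", "u2^net")


def check_learner_args(number_classes, segmentation_type, architecture_name, image_size=None):
    """Raise ValueError if the arguments break a constraint documented above."""
    if number_classes < 2:
        raise ValueError("number_classes should be >= 2")
    if segmentation_type != "Semantic Segmentation":
        raise ValueError("only 'Semantic Segmentation' is accepted for now")
    if architecture_name not in SUPPORTED_ARCHITECTURES:
        raise ValueError(f"unsupported architecture: {architecture_name!r}")
    if architecture_name == "maskrcnn" and image_size is None:
        raise ValueError("image_size is required for maskrcnn")
```

Calling `check_learner_args(2, "Semantic Segmentation", "deeplabv3+")` passes silently, while omitting `image_size` for `"maskrcnn"` raises a `ValueError`.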

Supported configs

| Architecture | Supported config | Backbones |
|---|---|---|
| unet | Semantic Segmentation (binary); Semantic Segmentation (multiple) | resnet18, resnet34, resnet50, resnet101, resnet152, xresnet18, xresnet34, xresnet50, xresnet101, xresnet152, squeezenet1_0, squeezenet1_1, densenet121, densenet169, densenet201, densenet161, vgg11_bn, vgg13_bn, vgg16_bn, vgg19_bn, alexnet |
| deeplabv3+ | Semantic Segmentation (binary); Semantic Segmentation (multiple) | resnet18, resnet34, resnet50, resnet101, resnet152, resnet50c, resnet101c, resnet152c, xception65, mobilenet_v2 |
| hrnet | Semantic Segmentation (binary); Semantic Segmentation (multiple) | hrnet_w18_small_model_v1, hrnet_w18_small_model_v2, hrnet_w18, hrnet_w30, hrnet_w32, hrnet_w48 |
| maskrcnn | Semantic Segmentation (binary) | resnet50 |
| u2^net | Semantic Segmentation (binary) | small, normal |
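The supported configurations above can also be written down as a lookup table, which makes it easy to check an architecture/backbone pair before building a learner. This is only a sketch that transcribes the list above; the names `SUPPORTED_BACKBONES`, `BINARY_ONLY`, `is_supported` and `supports_multiclass` are hypothetical and do not come from SemTorch.

```python
SUPPORTED_BACKBONES = {
    "unet": [
        "resnet18", "resnet34", "resnet50", "resnet101", "resnet152",
        "xresnet18", "xresnet34", "xresnet50", "xresnet101", "xresnet152",
        "squeezenet1_0", "squeezenet1_1",
        "densenet121", "densenet169", "densenet201", "densenet161",
        "vgg11_bn", "vgg13_bn", "vgg16_bn", "vgg19_bn", "alexnet",
    ],
    "deeplabv3+": [
        "resnet18", "resnet34", "resnet50", "resnet101", "resnet152",
        "resnet50c", "resnet101c", "resnet152c", "xception65", "mobilenet_v2",
    ],
    "hrnet": [
        "hrnet_w18_small_model_v1", "hrnet_w18_small_model_v2",
        "hrnet_w18", "hrnet_w30", "hrnet_w32", "hrnet_w48",
    ],
    "maskrcnn": ["resnet50"],
    "u2^net": ["small", "normal"],
}

# Architectures listed with the binary configuration only.
BINARY_ONLY = {"maskrcnn", "u2^net"}


def is_supported(architecture_name, backbone_name):
    """Return True if the backbone is listed for the given architecture."""
    return backbone_name in SUPPORTED_BACKBONES.get(architecture_name, [])


def supports_multiclass(architecture_name):
    """Return True if the architecture is listed with the 'multiple' configuration."""
    return architecture_name in SUPPORTED_BACKBONES and architecture_name not in BINARY_ONLY
```

For instance, `is_supported("deeplabv3+", "resnet50")` is true for the pair used in the usage example, while `supports_multiclass("maskrcnn")` is false.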

📩 Contact

📧 [email protected]

💼 LinkedIn: David Lacalle Castillo

Owner
David Lacalle Castillo
Machine Learning Engineer