Source code for unet-pytorch, which can be used to train your own models

Last update: Jan 05, 2023

Overview

Unet: a PyTorch implementation of the semantic segmentation model from U-Net: Convolutional Networks for Biomedical Image Segmentation

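Performance

U-Net is not well suited to datasets like VOC. It is a better fit for datasets such as medical images, which have few features and depend on shallow features.

Training dataset	Weights file	Test dataset	Input image size	mIOU
VOC12+SBD	unet_voc.pth	VOC-Val12	512x512	55.11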

Requirements

torch==1.2.0
torchvision==0.4.0
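
Before running anything, it can help to confirm that the installed packages match the two pins above. The sketch below is not part of the repository; the helper names (release_tuple, find_mismatches, installed_versions) are hypothetical, and it assumes that a local build suffix such as +cu92 should be ignored when comparing versions.

```python
from importlib import metadata

# Pins copied from the requirements listed above.
PINNED = {"torch": "1.2.0", "torchvision": "0.4.0"}


def release_tuple(version):
    """Turn '1.2.0+cu92' into (1, 2, 0), ignoring any local build suffix."""
    public = version.split("+", 1)[0]
    parts = []
    for piece in public.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def find_mismatches(installed, pinned=PINNED):
    """Return {package: (wanted, found)} for every pin that is not met."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = installed.get(name)
        if found is None or release_tuple(found) != release_tuple(wanted):
            mismatches[name] = (wanted, found)
    return mismatches


def installed_versions(names):
    """Look up installed versions; packages that are missing map to None."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    found = installed_versions(PINNED)
    for name, (wanted, got) in find_mismatches(found).items():
        print(f"{name}: pinned {wanted}, found {got}")
```

A mismatch does not necessarily mean the code will fail; the check only reports where the environment differs from the versions listed here.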

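Notes

unet_voc.pth was trained on the extended VOC dataset.
unet_medical.pth was trained on the example cell segmentation dataset.
Take care not to mix them up when using them.

Downloads

The unet_voc.pth and unet_medical.pth weights needed for training can be downloaded from Baidu Netdisk.
Link: https://pan.baidu.com/s/1AUBpqsSgamoQGEYpNjJg7A Extraction code: i3ck

The extended VOC dataset is available on Baidu Netdisk:
Link: https://pan.baidu.com/s/1BrR7AUM1XJvPWjKMIy2uEw Extraction code: vszf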

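Prediction steps

1. Using the pretrained weights

a. VOC pretrained weights

After downloading and unzipping the repository, to predict with the weights trained on VOC, download unet_voc.pth from Baidu Netdisk or the releases page, put it in model_data, then run predict.py and enter the image path, for example:

img/street.jpg

video.py can be used to run prediction on a camera feed.

b. Medical pretrained weights

After downloading and unzipping the repository, to predict with the weights trained on the medical dataset, download unet_medical.pth from Baidu Netdisk or the releases page, put it in model_data, and change model_path and num_classes in unet.py: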

_defaults = {
    "model_path"        : 'model_data/unet_voc.pth',
    "model_image_size"  : (512, 512, 3),
    "num_classes"       : 21,
    "cuda"              : True,
    #--------------------------------#
    #   blend controls whether the
    #   prediction is blended with
    #   the original image
    #--------------------------------#
    "blend"             : True
}

Then run predict.py and enter the image path, for example:

img/cell.png
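
As a rough illustration of the edit described in step b, the sketch below derives the medical settings from the VOC defaults instead of editing the values by hand. The with_overrides helper is hypothetical and not part of unet.py, and num_classes=2 (background plus cell) is an assumption about the example cell dataset that this page does not state; check it against your copy of the repository.

```python
# Defaults as shown in unet.py above (VOC weights, 21 classes).
VOC_DEFAULTS = {
    "model_path": "model_data/unet_voc.pth",
    "model_image_size": (512, 512, 3),
    "num_classes": 21,
    "cuda": True,
    "blend": True,
}


def with_overrides(defaults, **overrides):
    """Return a copy of defaults with overrides applied.

    Unknown keys raise KeyError so that a typo such as 'num_class'
    is caught instead of being silently ignored.
    """
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise KeyError(f"unknown setting(s): {sorted(unknown)}")
    merged = dict(defaults)
    merged.update(overrides)
    return merged


# ASSUMPTION: 2 classes (background + cell) for the example medical dataset.
MEDICAL_DEFAULTS = with_overrides(
    VOC_DEFAULTS,
    model_path="model_data/unet_medical.pth",
    num_classes=2,
)
```

Both model_path and num_classes have to change together, because the weights file and the class count must describe the same trained network.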

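2. Using your own trained weights

Train a model by following the training steps.
In unet.py, modify model_path, backbone, and num_classes in the section below so that they match your trained file; model_path points to a weights file under the logs folder.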

_defaults = {
    "model_path"        : 'model_data/unet_voc.pth',
    "model_image_size"  : (512, 512, 3),
    "num_classes"       : 21,
    "cuda"              : True,
    #--------------------------------#
    #   blend controls whether the
    #   prediction is blended with
    #   the original image
    #--------------------------------#
    "blend"             : True
}

Run predict.py and enter:

img/street.jpg

video.py can be used to run prediction on a camera feed.
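
Since model_path has to point at a file under logs, a small helper can pick one for you. This is only a sketch: newest_weights is a hypothetical function, the *.pth pattern is an assumption about how checkpoints are named, and the most recently written file is not necessarily the best one (you may prefer the checkpoint with the lowest validation loss).

```python
from pathlib import Path


def newest_weights(log_dir="logs", pattern="*.pth"):
    """Return the most recently modified weights file in log_dir."""
    candidates = sorted(
        Path(log_dir).glob(pattern),
        key=lambda path: path.stat().st_mtime,
    )
    if not candidates:
        raise FileNotFoundError(f"no files matching {pattern} in {log_dir}")
    return candidates[-1]
```

The returned path can then be copied into the model_path entry of _defaults.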

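Training steps

1. Training on the VOC dataset

Put the VOC dataset I provide into VOCdevkit (there is no need to run voc2unet.py).
Set the parameters in train.py. The defaults already match the VOC dataset, so only backbone and model_path need to be changed.
Run train.py to start training.

2. Training on your own dataset

This project trains on data in VOC format.
Before training, put the label files in the SegmentationClass folder under VOCdevkit/VOC2007.
Before training, put the image files in the JPEGImages folder under VOCdevkit/VOC2007.
Before training, run voc2unet.py to generate the corresponding txt files.
Remember to set num_classes in train.py to the number of classes + 1.
Run train.py to start training.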
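
The preparation steps above can be sketched roughly as follows. This is not the code in voc2unet.py; it is a minimal illustration under stated assumptions: labels are .png files, images are .jpg files, the lists are written to ImageSets/Segmentation as train.txt and val.txt, and the split is 90/10. All function names here are hypothetical.

```python
import random
from pathlib import Path


def num_classes_with_background(class_names):
    """num_classes for train.py: your classes plus one for background."""
    return len(class_names) + 1


def split_stems(stems, train_ratio=0.9, seed=0):
    """Shuffle file stems reproducibly and split them into (train, val)."""
    ordered = sorted(stems)
    random.Random(seed).shuffle(ordered)
    cut = int(len(ordered) * train_ratio)
    return sorted(ordered[:cut]), sorted(ordered[cut:])


def write_split_lists(voc_root, train_ratio=0.9, seed=0):
    """Write train.txt and val.txt listing samples that have a label file."""
    root = Path(voc_root) / "VOC2007"
    labels = {p.stem for p in (root / "SegmentationClass").glob("*.png")}
    images = {p.stem for p in (root / "JPEGImages").glob("*.jpg")}

    # Every label needs an image with the same file stem.
    missing = sorted(labels - images)
    if missing:
        raise FileNotFoundError(f"labels without images: {missing[:5]}")

    train, val = split_stems(labels, train_ratio, seed)

    # ASSUMPTION: output location follows the usual VOC layout.
    out_dir = root / "ImageSets" / "Segmentation"
    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / "train.txt").write_text("".join(s + "\n" for s in train))
    (out_dir / "val.txt").write_text("".join(s + "\n" for s in val))
    return train, val
```

For example, a dataset that labels cat and dog pixels would use num_classes_with_background(["cat", "dog"]), which is 3.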

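3. Training on the medical dataset

Download the VGG pretrained weights into model_data.
Run train_medical.py with the default parameters to start training.

mIoU calculation

See the mIoU calculation video and blog post.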
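
For readers without access to the video or blog post, here is a minimal sketch of how mIoU is commonly computed from a confusion matrix. It is not the repository's evaluation code. It works on flat lists of integer labels, and it assumes that ground-truth labels outside the class range (such as a 255 "void" label) are ignored and that predictions are always valid class indices.

```python
def confusion_matrix(true_labels, pred_labels, num_classes):
    """Count pixels as matrix[true_class][predicted_class]."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, pred_labels):
        # ASSUMPTION: out-of-range ground truth (e.g. 255) is skipped.
        if 0 <= t < num_classes:
            matrix[t][p] += 1
    return matrix


def per_class_iou(matrix):
    """IoU = TP / (TP + FP + FN); None for classes that never appear."""
    n = len(matrix)
    ious = []
    for c in range(n):
        true_positive = matrix[c][c]
        ground_truth = sum(matrix[c])
        predicted = sum(matrix[r][c] for r in range(n))
        union = ground_truth + predicted - true_positive
        ious.append(true_positive / union if union else None)
    return ious


def mean_iou(matrix):
    """Average the IoU over the classes that have a defined value."""
    valid = [iou for iou in per_class_iou(matrix) if iou is not None]
    return sum(valid) / len(valid) if valid else 0.0
```

With ground truth [0, 0, 1, 1] and predictions [0, 1, 1, 1], class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mIoU is 7/12.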

Reference

https://github.com/ggyyzm/pytorch_segmentation
https://github.com/bonlime/keras-deeplab-v3-plus

Comments

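A question about pretraining

Hello, sorry to bother you. Does the backbone refer to the VGG used in the downsampling path? If I leave the upsampling path unchanged, can I skip ImageNet pretraining, comment out model_path = '' and the if model_path != '' block, and then train on my own dataset? Thank you! Also, are your VOC weights effectively the result of a second round of pretraining? Sorry, I am not good at expressing myself, so I am not sure whether this is clear.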

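opened by Nine9844 5

CE loss becomes NaN after training for a while

Hello, after following your tutorial I built my own U-Net model and used Dice + CE loss as the loss function. After a few dozen epochs, my CE loss returned NaN, with the error 'Function 'LogSoftmaxBackward' returned nan values in its 0th output.' The same data runs on your source code without this problem. Do you know of a fix?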

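opened by Breeze-Zero 2

Why does the array converted at line 40 of the dataloader have a different shape from cv2's?

After converting the masks with json_to_dataset.py, I tried checking the shape with this code:

import cv2
import numpy as np
from PIL import Image

file = '/home/fut/Downloads/unet-pytorch-main/mydata/masks/ID_1110_json.png'
img = cv2.imread(file, cv2.IMREAD_UNCHANGED)
print(img.shape)

pil = Image.open(file)
img2 = np.array(pil)
print(img2.shape)

The output is:

(800, 800, 3)
(800, 800)

Why does the channel dimension disappear when the file is read with PIL? It is precisely because of this that your project runs fine.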

opened by futureflsl 1

from tqdm import tqdm raises an error

import os
import time

import numpy as np
import torch
import torch.backends.cudnn as cudnn
import torch.optim as optim
from torch.utils.data import DataLoader
from tqdm import tqdm

opened by Luke-Wei 1

Releases(v3.0)

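v3.0(Apr 22, 2022)
Major updates

Added support for step and cos learning-rate decay.

Added support for choosing between the adam and sgd optimizers.

Added support for selecting a prediction mode: single-image prediction, folder prediction, video prediction, or image cropping.

Updated summary.py for viewing the network structure.

Added multi-GPU training.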
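
The step and cos learning-rate decay modes named in these notes can be illustrated with the two textbook formulas below. This is a generic sketch, not the repository's scheduler; the function names, the step_size and gamma defaults, and the absence of a warmup phase are all assumptions.

```python
import math


def step_lr(base_lr, epoch, step_size=10, gamma=0.5):
    """Multiply the learning rate by gamma once every step_size epochs."""
    return base_lr * gamma ** (epoch // step_size)


def cosine_lr(base_lr, epoch, total_epochs, min_lr=0.0):
    """Decay from base_lr to min_lr along half a cosine wave."""
    progress = min(max(epoch / total_epochs, 0.0), 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

Step decay drops the rate in discrete jumps, while cosine decay lowers it smoothly and slows down near both ends of training.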

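v2.2(Mar 4, 2022)
Major updates

Updated train.py with extensive comments and several more adjustable parameters.

Updated predict.py with extensive comments and added FPS measurement, video prediction, batch prediction, and other features.

Updated unet.py with extensive comments and added parameters such as anchor box selection, confidence, and non-maximum suppression.

Merged get_dr_txt.py, get_gt_txt.py, and get_map.py so that dataset evaluation runs from a single file.

Updated voc_annotation.py with several more adjustable parameters.

Updated callback.py to prevent multithreading errors.

Updated summary.py for viewing the network structure.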

v1.0(Mar 12, 2021)

unet_resnet_medical.pth(167.85 MB)
unet_resnet_voc.pth(167.85 MB)
unet_vgg_medical.pth(94.96 MB)
unet_vgg_voc.pth(94.96 MB)

Owner

Bubbliiiing
