Use of Keras Sparse Categorical Crossentropy for pixel-wise multi-class classification

Problem Description

I'll start by disclosing that I'm a machine learning and Keras novice and don't know much beyond general CNN binary classifiers. I'm trying to perform pixelwise multi-class classification using a U-Net architecture (TF backend) on many 256x256 images. In other words, I input a 256x256 image, and I want it to output a 256x256 "mask" (or label image) where the values are integers from 0-30 (each integer represents a unique class). I'm training on 2 1080Ti NVIDIA GPUs.

When I attempt to perform one-hot encoding, I get an OOM error, which is why I'm using sparse categorical cross entropy as my loss function instead of regular categorical cross entropy. However, when training my U-Net, my loss value is "nan" from start to finish (it initializes as nan and never changes). When I normalize my "masks" by dividing all values by 30 (so they go from 0-1), I get ~0.97 accuracy, which I'm guessing is because most of the labels in my image are 0 (and it's just outputting a bunch of 0s).
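
For a rough sense of the memory gap, here is an illustrative back-of-the-envelope sketch (not from the original post) of why one-hot labels can blow up while sparse integer labels stay small; the dataset size is taken from the training log further down (2308 + 577 images), and the 31 classes come from the 0-30 label range:

import numpy as np

# Back-of-the-envelope label memory, assuming ~2885 images of 256x256 pixels
# and 31 classes (integer labels 0-30). Figures are illustrative only.
n_images, h, w, n_classes = 2885, 256, 256, 31

sparse_bytes = n_images * h * w * np.dtype('uint8').itemsize                 # one small int per pixel
onehot_bytes = n_images * h * w * n_classes * np.dtype('float32').itemsize   # 31 floats per pixel

print(f"sparse integer labels: {sparse_bytes / 1e9:.2f} GB")   # ~0.19 GB
print(f"one-hot labels:        {onehot_bytes / 1e9:.2f} GB")   # ~23 GB, an easy OOM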

Here's the U-Net I'm using:

# Imports needed for this snippet (standalone Keras with a TF backend)
import keras
from keras.models import Model
from keras.layers import (Conv2D, MaxPooling2D, UpSampling2D, SpatialDropout2D,
                          Reshape, concatenate)
from keras.optimizers import Adam

def unet(pretrained_weights = None, input_size = (256,256,1)):
    inputs = keras.engine.input_layer.Input(input_size)

    # Contracting path
    conv1 = Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(inputs)
    conv1 = Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv1)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
    conv2 = Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool1)
    conv2 = Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv2)
    pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
    conv3 = Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool2)
    conv3 = Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv3)
    pool3 = MaxPooling2D(pool_size=(2, 2))(conv3)
    conv4 = Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool3)
    conv4 = Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv4)
    #drop4 = Dropout(0.5)(conv4)
    drop4 = SpatialDropout2D(0.5)(conv4)
    pool4 = MaxPooling2D(pool_size=(2, 2))(drop4)

    conv5 = Conv2D(1024, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool4)
    conv5 = Conv2D(1024, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv5)
    #drop5 = Dropout(0.5)(conv5)
    drop5 = SpatialDropout2D(0.5)(conv5)

    # Expansive path with skip connections
    up6 = Conv2D(512, 2, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(UpSampling2D(size = (2,2))(drop5))
    merge6 = concatenate([drop4,up6], axis = 3)
    conv6 = Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(merge6)
    conv6 = Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv6)

    up7 = Conv2D(256, 2, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(UpSampling2D(size = (2,2))(conv6))
    merge7 = concatenate([conv3,up7], axis = 3)
    conv7 = Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(merge7)
    conv7 = Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv7)

    up8 = Conv2D(128, 2, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(UpSampling2D(size = (2,2))(conv7))
    merge8 = concatenate([conv2,up8], axis = 3)
    conv8 = Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(merge8)
    conv8 = Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv8)

    up9 = Conv2D(64, 2, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(UpSampling2D(size = (2,2))(conv8))
    merge9 = concatenate([conv1,up9], axis = 3)
    conv9 = Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(merge9)
    conv9 = Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv9)
    conv9 = Conv2D(32, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv9)
    conv10 = Conv2D(1, 1, activation = 'softmax')(conv9)
    #conv10 = Flatten()(conv10)
    #conv10 = Dense(65536, activation = 'softmax')(conv10)
    flat10 = Reshape((65536,1))(conv10)
    #conv10 = Conv1D(1, 1, activation='linear')(conv10)

    model = Model(inputs = inputs, outputs = flat10)

    opt = Adam(lr=1e-6, clipvalue=0.01)
    model.compile(optimizer = opt, loss = 'sparse_categorical_crossentropy', metrics = ['sparse_categorical_accuracy'])
    #model.compile(optimizer = Adam(lr = 1e-6), loss = 'sparse_categorical_crossentropy', metrics = ['accuracy'])
    #model.compile(optimizer = Adam(lr = 1e-4),

    #model.summary()

    if(pretrained_weights):
        model.load_weights(pretrained_weights)

    return model

Note that I needed to flatten the output just to get sparse categorical cross entropy to function properly (it didn't like my 2D matrix for some reason).
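
For reference, a minimal, standalone shape check of the contract sparse categorical crossentropy expects (per-position class probabilities of shape (..., num_classes) against integer labels of shape (...)); this sketch targets recent tf.keras, so whether the flattening described above is strictly required may depend on the Keras version in use:

import numpy as np
import tensorflow as tf

# Hypothetical shapes mirroring the use case: 2 images, 256x256 pixels, 31 classes.
y_true = np.random.randint(0, 31, size=(2, 256, 256))    # integer class ids per pixel
logits = tf.random.uniform((2, 256, 256, 31))
y_pred = tf.nn.softmax(logits, axis=-1)                   # per-pixel class probabilities

loss = tf.keras.losses.SparseCategoricalCrossentropy()(y_true, y_pred)
print(float(loss))                                        # a finite scalar, so the shapes line up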

And here's an example of a training run (just 1 epoch, since the result is the same no matter how many I run):

model = unet()
model.fit(x=x_train, y=y_train, batch_size=1, epochs=1, verbose=1, validation_split=0.2, shuffle=True)

Train on 2308 samples, validate on 577 samples
Epoch 1/1
2308/2308 [==============================] - 191s 83ms/step - loss: nan - sparse_categorical_accuracy: 0.9672 - val_loss: nan - val_sparse_categorical_accuracy: 0.9667
Out[18]:

Let me know if more information is needed to diagnose the problem. Thanks in advance!

Recommended Answer

The problem is that for multiclass classification, you need to output a vector with one dimension per category, which represents the confidence in that category. If you want to identify 30 different classes, then your final layer should be a 3D tensor, (256, 256, 30).

conv10 = Conv2D(30, 1, activation = 'softmax')(conv9)   # 30 class scores per pixel
flat10 = Reshape((256*256, 30))(conv10)                 # flatten spatial dims, keep the 30 class scores

opt = Adam(lr=1e-6, clipvalue=0.01)
model.compile(optimizer = opt, loss = 'sparse_categorical_crossentropy',
              metrics = ['sparse_categorical_accuracy'])

I'm assuming that your input is a (256, 256, 1) float tensor with values between 0 and 1, and your target is a (256*256) Int tensor.
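
Under that assumption, here is a minimal sketch of how x_train and y_train could be shaped to match the reshaped (256*256, 30) output above; it assumes the unet() definition has been updated with the two lines from this answer, and the array names and tiny dataset size are purely illustrative:

import numpy as np

n = 8                                                        # hypothetical number of images
x_train = np.random.rand(n, 256, 256, 1).astype('float32')   # inputs scaled to [0, 1]
masks   = np.random.randint(0, 30, size=(n, 256, 256))       # integer class labels per pixel

# Flatten each mask so it lines up with the flattened per-pixel predictions:
y_train = masks.reshape(n, 256 * 256, 1).astype('int32')

model = unet()
model.fit(x=x_train, y=y_train, batch_size=1, epochs=1, validation_split=0.2, shuffle=True)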

Does that help?
