ResNet: 100% accuracy during training, but 33% prediction accuracy with the same data


Problem Description

I am new to machine learning and deep learning, and for learning purposes I tried to play with ResNet. I tried to overfit on small data (3 different images) and see if I could get almost 0 loss and 1.0 accuracy - and I did.

The problem is that predictions on the training images (i.e. the same 3 images used for training) are not correct.

Training images (not shown)

Image labels:

[1,0,0], [0,1,0], [0,0,1]

My Python code:

import os
import numpy as np
from PIL import Image
from keras.applications.resnet50 import ResNet50

# loading 3 grayscale images and resizing them
imgs = np.array([np.array(Image.open("./Images/train/" + fname)
                          .resize((197, 197), Image.ANTIALIAS)) for fname in
                 os.listdir("./Images/train/")]).reshape(-1, 197, 197, 1)
# creating labels
y = np.array([[1,0,0],[0,1,0],[0,0,1]])
# create resnet model
model = ResNet50(input_shape=(197, 197,1),classes=3,weights=None)

# compile & fit model
model.compile(loss='categorical_crossentropy', optimizer='adam',metrics=['acc'])

model.fit(imgs,y,epochs=5,shuffle=True)

# predict on training data
print(model.predict(imgs))

The model does overfit the data:

Epoch 1/5
3/3 [==============================] - 22s - loss: 1.3229 - acc: 0.0000e+00
Epoch 2/5
3/3 [==============================] - 0s - loss: 0.1474 - acc: 1.0000
Epoch 3/5
3/3 [==============================] - 0s - loss: 0.0057 - acc: 1.0000
Epoch 4/5
3/3 [==============================] - 0s - loss: 0.0107 - acc: 1.0000
Epoch 5/5
3/3 [==============================] - 0s - loss: 1.3815e-04 - acc: 1.0000

But the predictions are:

 [[  1.05677405e-08   9.99999642e-01   3.95520459e-07]
 [  1.11955103e-08   9.99999642e-01   4.14905685e-07]
 [  1.02637095e-07   9.99997497e-01   2.43751242e-06]]

which means that all images got label=[0,1,0].

Why? And how can that happen?

Answer

It's because of the batch normalization layers.

In the training phase, the batch is normalized w.r.t. its own mean and variance. However, in the testing phase, the batch is normalized w.r.t. the moving average of previously observed means and variances.
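The difference can be sketched in a few lines of numpy (an illustrative sketch, not Keras's actual implementation; eps and the function names here are assumptions):

import numpy as np

def batchnorm_train(x, eps=1e-3):
    # training mode: normalize with the current batch's own statistics
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

def batchnorm_infer(x, moving_mean, moving_var, eps=1e-3):
    # inference mode: normalize with the accumulated moving statistics
    return (x - moving_mean) / np.sqrt(moving_var + eps)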

Now this is a problem when the number of observed batches is small (e.g., 5 in your example), because in the BatchNormalization layer, by default moving_mean is initialized to 0 and moving_variance is initialized to 1.

Given also that the default momentum is 0.99, you'll need to update the moving averages quite a lot of times before they converge to the "real" mean and variance.
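To see how slowly they converge, here is the exponential-moving-average update with the default momentum of 0.99 (a minimal sketch; the batch statistics 5.0 and 4.0 are made-up numbers):

momentum = 0.99                      # Keras default
moving_mean, moving_var = 0.0, 1.0   # BatchNormalization's initial values

# suppose every training batch has mean 5.0 and variance 4.0
for _ in range(5):                   # only 5 updates, as in the question
    moving_mean = momentum * moving_mean + (1 - momentum) * 5.0
    moving_var = momentum * moving_var + (1 - momentum) * 4.0

print(moving_mean, moving_var)       # ~0.245, ~1.147 - still far from 5.0 and 4.0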

That's why the prediction is wrong in the early stage, but is correct after 1000 epochs.

You can verify it by forcing the BatchNormalization layers to operate in "training mode".
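For example (a sketch assuming the Keras 2 backend API; K.set_learning_phase must be called before the model is built):

import keras.backend as K

# force "training mode" globally, so BatchNormalization always uses batch statistics
K.set_learning_phase(1)

model = ResNet50(input_shape=(197, 197, 1), classes=3, weights=None)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['acc'])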

During training, the accuracy is 1 and the loss is close to zero:

model.fit(imgs,y,epochs=5,shuffle=True)
Epoch 1/5
3/3 [==============================] - 19s 6s/step - loss: 1.4624 - acc: 0.3333
Epoch 2/5
3/3 [==============================] - 0s 63ms/step - loss: 0.6051 - acc: 0.6667
Epoch 3/5
3/3 [==============================] - 0s 57ms/step - loss: 0.2168 - acc: 1.0000
Epoch 4/5
3/3 [==============================] - 0s 56ms/step - loss: 1.1921e-07 - acc: 1.0000
Epoch 5/5
3/3 [==============================] - 0s 53ms/step - loss: 1.1921e-07 - acc: 1.0000

Now if we evaluate the model, we'll observe high loss and low accuracy, because after 5 updates the moving averages are still pretty close to the initial values:

model.evaluate(imgs,y)
3/3 [==============================] - 3s 890ms/step
[10.745396614074707, 0.3333333432674408]

However, if we manually specify the "learning phase" variable and let the BatchNormalization layers use the "real" batch mean and variance, the result becomes the same as what's observed in fit().

sample_weights = np.ones(3)
learning_phase = 1  # 1 means "training"
ins = [imgs, y, sample_weights, learning_phase]
model.test_function(ins)
[1.192093e-07, 1.0]


It's also possible to verify it by changing the momentum to a smaller value.

For example, by adding momentum=0.01 to all the batch norm layers in ResNet50, the prediction after 20 epochs is:

model.predict(imgs)
array([[  1.00000000e+00,   1.34882026e-08,   3.92139575e-22],
       [  0.00000000e+00,   1.00000000e+00,   0.00000000e+00],
       [  8.70998792e-06,   5.31159838e-10,   9.99991298e-01]], dtype=float32)
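ResNet50 from keras.applications doesn't expose the batch norm momentum as an argument, so one way to apply that change is to mutate each layer's momentum and rebuild the model from its config (a sketch of one possible approach, not the answer's original code; rebuilding discards the weights, which is fine here since training starts from scratch):

from keras.layers import BatchNormalization
from keras.models import clone_model

for layer in model.layers:
    if isinstance(layer, BatchNormalization):
        layer.momentum = 0.01        # default is 0.99

# clone_model rebuilds the layers from their (updated) configs
model = clone_model(model)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['acc'])
model.fit(imgs, y, epochs=20, shuffle=True)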

