Why does setting training=True in tf.keras.layers.Dropout during testing lead to lower training loss values and higher prediction accuracy?


Problem description

I'm using dropout layers in my model implemented in TensorFlow (tf.keras.layers.Dropout). I set training=True during training and training=False during testing, and the performance was poor. I then accidentally left training=True during testing as well, and the results got much better. I'm wondering what is happening, and why it affects the training loss values: I'm not making any changes to the training, and the whole testing process happens after training. Yet setting training=True during testing seems to push the training loss close to zero, and then the test results are better. Is there any possible explanation?
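For reference, here is a minimal sketch (the layer sizes and names are assumptions, not the asker's code) of how the training flag is usually threaded through a subclassed Keras model so the Dropout layer can be switched between its two modes:

```python
import tensorflow as tf

# Minimal sketch of a subclassed model that forwards an explicit `training`
# flag to its Dropout layer. Layer sizes and names are illustrative only.
class SmallNet(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.dense1 = tf.keras.layers.Dense(64, activation="relu")
        self.dropout = tf.keras.layers.Dropout(0.5)
        self.dense2 = tf.keras.layers.Dense(10)

    def call(self, inputs, training=False):
        x = self.dense1(inputs)
        # Dropout is only active when training=True is passed through here.
        x = self.dropout(x, training=training)
        return self.dense2(x)

model = SmallNet()
x = tf.random.normal((4, 32))
train_out = model(x, training=True)   # dropout applied
test_out = model(x, training=False)   # dropout skipped (inference mode)
```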

Thanks,

Answer

Sorry for the late response, but the answer from Celius is not quite correct.

The training argument of the Dropout layer (and of the BatchNormalization layer as well) defines whether the layer should behave in training mode or in inference mode. You can read this in the official documentation.

However, the documentation is a bit unclear about how this affects the execution of your network. Setting training=False does not mean that the Dropout layer is not part of your network; it is by no means ignored, as Celius explained, it just behaves in inference mode. For Dropout, this means that no dropout is applied. For BatchNormalization, it means the layer uses the statistics estimated during training instead of computing new statistics for every mini-batch. That's all there is to it. Conversely, if you set training=True, the layer behaves in training mode and dropout is applied.
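A quick way to see the two modes is to call a bare Dropout layer directly: in inference mode the output equals the input, while in training mode roughly a `rate` fraction of the units are zeroed and the remaining ones are scaled by 1/(1-rate):

```python
import tensorflow as tf

# Compare the two modes on a bare Dropout layer.
layer = tf.keras.layers.Dropout(rate=0.5)
x = tf.ones((1, 8))

print(layer(x, training=False).numpy())  # identical to x: no units dropped
print(layer(x, training=True).numpy())   # ~half the units zeroed, the rest scaled by 1/(1-0.5) = 2.0
```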

Now to your question: the behavior of your network does not make sense. If dropout is applied to unseen data, there is nothing to learn from that; you only throw away information, so your results should be worse. But I think your problem is not related to the Dropout layer anyway. Does your network also use BatchNormalization layers? If BN is applied in the wrong mode, it can ruin your final results. I haven't seen any code, though, so it is hard to fully answer your question as it stands.
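As a point of comparison, here is a minimal sketch (the architecture is an assumption) of the usual pattern where Keras sets the flag itself: model.fit() runs the layers in training mode and model.predict()/model.evaluate() run them in inference mode, so neither Dropout nor BatchNormalization needs a hard-coded training=True anywhere:

```python
import tensorflow as tf

# Let Keras manage the `training` flag: fit() uses training mode
# (dropout active, BN uses batch statistics), predict()/evaluate() use
# inference mode (dropout off, BN uses its moving averages).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(32,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

x = tf.random.normal((256, 32))
y = tf.random.normal((256, 1))
model.fit(x, y, epochs=2, verbose=0)   # training mode
preds = model.predict(x, verbose=0)    # inference mode
```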
