Ordering of batch normalization and dropout?

Problem Description

The original question was specifically about the TensorFlow implementation. However, the answer applies to implementations in general, and this general answer is also the correct answer for TensorFlow.

When using batch normalization and dropout in TensorFlow (specifically using contrib.layers), do I need to be worried about the ordering?

It seems possible that if I use dropout followed immediately by batch normalization there might be trouble. For example, if the shift in batch normalization is trained on the larger-scale numbers of the training outputs, but that same shift is then applied to the smaller-scale numbers (smaller because of the compensation for having more active outputs) seen without dropout during testing, then that shift may be off. Does the TensorFlow batch normalization layer automatically compensate for this? Or does this not happen for some reason I'm missing?
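
For what it's worth, the mismatch being described here can be sketched numerically. Below is a rough NumPy illustration of the idea, not TensorFlow's actual internals; the use of inverted dropout scaling and the chosen keep probability are assumptions. It compares the statistics a batch norm layer would see during training, when dropout sits right before it, with the un-dropped activations it sees at test time.

import numpy as np

# Rough illustration only: statistics of a batch-norm input when inverted
# dropout (kept units scaled by 1/keep_prob) is applied right before it.
rng = np.random.default_rng(0)
acts = rng.normal(loc=1.0, scale=1.0, size=100_000)  # hypothetical layer outputs
keep_prob = 0.5

train_view = acts * rng.binomial(1, keep_prob, acts.shape) / keep_prob  # BN input during training
test_view = acts                                                        # BN input at test time

print("train mean/std:", train_view.mean(), train_view.std())
print("test  mean/std:", test_view.mean(), test_view.std())
# The means roughly agree, but the training-time variance is noticeably larger,
# so normalizing test activations with statistics gathered during training
# changes their scale.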

Also, are there other pitfalls to look out for when using these two together? For example, assuming I'm using them in the correct order with regard to the above (assuming there is a correct order), could there be trouble with using both batch normalization and dropout on multiple successive layers? I don't immediately see a problem with that, but I might be missing something.

Thanks so much!

Update:

An experimental test seems to suggest that ordering does matter. I ran the same network twice, with only the batch norm and dropout order reversed. When dropout comes before the batch norm, validation loss seems to go up as training loss goes down. They both go down in the other case. But in my case the movements are slow, so things may change after more training and it was just a single test. A more definitive and informed answer would still be appreciated.

Recommended Answer

In Ioffe and Szegedy 2015, the authors state that "we would like to ensure that for any parameter values, the network always produces activations with the desired distribution". So the batch normalization layer is actually inserted right after a Conv layer/Fully Connected layer, but before feeding into the ReLU (or any other kind of) activation. See this video at around the 53-minute mark for more details.

As far as dropout goes, I believe dropout is applied after the activation layer. In the dropout paper, figure 3b, the dropout factor/probability matrix r(l) for hidden layer l is applied to y(l), where y(l) is the result of applying the activation function f.
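
To make that concrete, here is a small sketch of that per-layer computation. The shapes, the keep probability p, and ReLU as the activation f are illustrative assumptions; the paper's classical dropout scales by p at test time rather than using inverted dropout.

import numpy as np

def dense_dropout_layer(x, W, b, p=0.5, training=True, rng=np.random.default_rng(0)):
    y = np.maximum(0.0, x @ W + b)            # y(l) = f(W x + b), with f = ReLU here
    if training:
        r = rng.binomial(1, p, size=y.shape)  # r(l) ~ Bernoulli(p)
        return r * y                          # dropout is applied to y(l), i.e. after the activation
    return p * y                              # classical test-time scaling by p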

So in summary, the order of using batch normalization and dropout is:

-> CONV/FC -> BatchNorm -> ReLU (or other activation) -> Dropout -> CONV/FC ->
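
As a minimal sketch of this ordering in code (written with tf.keras layers rather than the now-deprecated contrib.layers the question mentions; the layer sizes, dropout rate, and input shape are arbitrary):

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, padding="same", input_shape=(28, 28, 1)),  # CONV
    tf.keras.layers.BatchNormalization(),                                    # BatchNorm
    tf.keras.layers.ReLU(),                                                  # ReLU (or other activation)
    tf.keras.layers.Dropout(0.5),                                            # Dropout
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10),                                               # FC
])
model.summary()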
