Passing `training=True` when doing TensorFlow training


Question

TensorFlow's official tutorial says that we should pass `base_model(training=False)` during training so that the BatchNormalization (BN) layers do not update their mean and variance. My question is: why? Why don't we need to update the mean and variance? I mean, BN has ImageNet's mean and variance, so why is it useful to keep ImageNet's statistics instead of updating them on the new data? Even during fine-tuning, when the whole model updates its weights, the BN layers will still have the ImageNet mean and variance. Edit: I am using this tutorial: https://www.tensorflow.org/tutorials/images/transfer_learning

Answer

When a model is trained from initialization, batch norm should be enabled so that its mean and variance are tuned, as you mentioned. Fine-tuning or transfer learning is a bit different: you already have a model that can do more than you need, and you want to specialize that pre-trained model to your particular task on your data set. In this case, part of the weights are frozen and only some layers closest to the output are changed. Since BN layers are used throughout the model, you should freeze them as well. Check this explanation again:

Important note about BatchNormalization layers: Many models contain tf.keras.layers.BatchNormalization layers. This layer is a special case, and precautions should be taken in the context of fine-tuning, as shown later in this tutorial.

When you set layer.trainable = False, the BatchNormalization layer will run in inference mode and will not update its mean and variance statistics.

When you unfreeze a model that contains BatchNormalization layers in order to do fine-tuning, you should keep the BatchNormalization layers in inference mode by passing training=False when calling the base model. Otherwise, the updates applied to the non-trainable weights will destroy what the model has learned.
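To see what `training` actually controls inside a BN layer, here is a minimal NumPy sketch of batch normalization (the scale/offset parameters gamma and beta are omitted; `momentum` and `eps` mirror Keras-style defaults). It is a simplified model of the mechanism, not the Keras implementation itself.

```python
import numpy as np

def batch_norm(x, moving_mean, moving_var, momentum=0.99, eps=1e-3, training=True):
    """Simplified batch norm showing how `training` selects the statistics."""
    if training:
        # Training mode: normalize with THIS batch's statistics
        # and fold them into the moving averages.
        mean, var = x.mean(axis=0), x.var(axis=0)
        moving_mean = momentum * moving_mean + (1 - momentum) * mean
        moving_var = momentum * moving_var + (1 - momentum) * var
    else:
        # Inference mode: use the stored (e.g. ImageNet) statistics, untouched.
        mean, var = moving_mean, moving_var
    y = (x - mean) / np.sqrt(var + eps)
    return y, moving_mean, moving_var

x = np.random.randn(32, 4) * 5 + 10       # new data with a very different distribution
mm, mv = np.zeros(4), np.ones(4)          # pretend these came from pre-training
_, mm_train, mv_train = batch_norm(x, mm, mv, training=True)
_, mm_infer, mv_infer = batch_norm(x, mm, mv, training=False)
```

With `training=True` the pre-trained statistics drift toward the new data's distribution; with `training=False` they are left exactly as the pre-trained model learned them, which is what the tutorial relies on.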

Source: Transfer learning tutorial, the section on freezing.

