How to use Tensorflow BatchNormalization with GradientTape?


Problem description

Suppose we have a simple Keras model that uses BatchNormalization:

import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(1,)),
    tf.keras.layers.BatchNormalization()
])

How to actually use it with GradientTape? The following doesn't seem to work, as it doesn't update the moving averages:

# model training... we want the output values to be close to 150
opt = tf.keras.optimizers.Adam()  # assumed optimizer; `opt` was not defined in the question
for i in range(1000):
  x = np.random.randint(100, 110, 10).astype(np.float32)
  with tf.GradientTape() as tape:
    y = model(np.expand_dims(x, axis=1))
    loss = tf.reduce_mean(tf.square(y - 150))
  grads = tape.gradient(loss, model.variables)
  opt.apply_gradients(zip(grads, model.variables))

In particular, if you inspect the moving averages, they remain the same (inspect model.variables; the averages are always 0 and 1). I know one can use .fit() and .predict(), but I would like to use the GradientTape and I'm not sure how to do this. Some versions of the documentation suggest updating update_ops, but that doesn't seem to work in eager mode.

In particular, the following code will not output anything close to 150 after the above training.

x = np.random.randint(200, 210, 100).astype(np.float32)
print(model(np.expand_dims(x, axis=1)))
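
For context, the update_ops recipe mentioned above is the TF1 graph-mode pattern: the moving-average update ops are registered in the tf.GraphKeys.UPDATE_OPS collection and must be attached to the train op explicitly. A minimal sketch of that pattern (an assumed reconstruction, for reference only; it relies on graph collections, which is why it does nothing under eager execution):

import tensorflow as tf

# TF1-style graph-mode recipe; inapplicable in eager mode
tf.compat.v1.disable_eager_execution()

x_ph = tf.compat.v1.placeholder(tf.float32, shape=(None, 1))
y = tf.compat.v1.layers.batch_normalization(x_ph, training=True)
loss = tf.reduce_mean(tf.square(y - 150))

# the moving-mean/variance update ops live in a graph collection and must
# be run alongside the train op via control dependencies
update_ops = tf.compat.v1.get_collection(tf.compat.v1.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_op = tf.compat.v1.train.GradientDescentOptimizer(0.01).minimize(loss)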

Recommended answer

In gradient tape mode, the BatchNormalization layer should be called with the argument training=True.

Example:

from tensorflow import keras

KL = keras.layers  # layer/model aliases as used in the original snippet
KM = keras.models

inp = KL.Input((64, 64, 3))
x = inp
x = KL.Conv2D(3, kernel_size=3, padding='same')(x)
x = KL.BatchNormalization()(x, training=True)
model = KM.Model(inp, x)

Then the moving variables are properly updated:

>>> model.layers[2].weights[2]
<tf.Variable 'batch_normalization/moving_mean:0' shape=(3,) dtype=float32, numpy=array([-0.00062087,  0.00015137, -0.00013239], dtype=float32)>
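
Putting this together with the question's loop: pass training=True on the forward call so the layer updates its statistics, and take gradients only with respect to model.trainable_variables, since the moving mean and variance are non-trainable and are updated by the layer itself rather than by the optimizer. A minimal sketch of the corrected loop (the optimizer and learning rate are assumptions, not from the original post):

import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(1,)),
    tf.keras.layers.BatchNormalization()
])
opt = tf.keras.optimizers.Adam(learning_rate=0.1)  # assumed optimizer choice

for i in range(1000):
    x = np.random.randint(100, 110, 10).astype(np.float32)
    with tf.GradientTape() as tape:
        # training=True normalizes with batch statistics and, as a side
        # effect, updates the layer's moving mean and variance
        y = model(np.expand_dims(x, axis=1), training=True)
        loss = tf.reduce_mean(tf.square(y - 150))
    # gamma and beta are the only trainable variables; the moving
    # statistics must not receive gradient updates
    grads = tape.gradient(loss, model.trainable_variables)
    opt.apply_gradients(zip(grads, model.trainable_variables))

# at inference time, call with training=False so the accumulated moving
# statistics are used instead of the batch statistics
x = np.random.randint(200, 210, 100).astype(np.float32)
print(model(np.expand_dims(x, axis=1), training=False))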
