What is the default kernel initializer in tf.layers.conv2d and tf.layers.dense?


Problem description

The official TensorFlow API docs claim that the parameter kernel_initializer defaults to None for tf.layers.conv2d and tf.layers.dense.

However, reading the layers tutorial (https://www.tensorflow.org/tutorials/layers), I noted that this parameter is not set in the code. For example:

# Convolutional Layer #1
conv1 = tf.layers.conv2d(
    inputs=input_layer,
    filters=32,
    kernel_size=[5, 5],
    padding="same",
    activation=tf.nn.relu)

The example code from the tutorial runs without any errors, so I think the default kernel_initializer is not None. So, which initializer is used?

In another piece of code, I did not set the kernel_initializer of the conv2d and dense layers, and everything was fine. However, when I tried to set the kernel_initializer to tf.truncated_normal_initializer(stddev=0.1, dtype=tf.float32), I got NaN errors. What is going on here? Can anyone help?
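One plausible explanation for the NaNs (an assumption, not something the question confirms) is that stddev=0.1 is much larger than the Glorot-style scale for wide layers, so activations can grow until the loss overflows. A quick numeric comparison, using hypothetical layer sizes from the MNIST layers tutorial (7*7*64 = 3136 inputs into 1024 dense units):

```python
import math

# Hypothetical dense-layer fan counts (assumed, for illustration only).
fan_in, fan_out = 3136, 1024

# Glorot uniform draws from U(-limit, limit), limit = sqrt(6 / (fan_in + fan_out)).
limit = math.sqrt(6.0 / (fan_in + fan_out))
glorot_std = limit / math.sqrt(3.0)  # std of U(-a, a) is a / sqrt(3)

print(round(limit, 4))             # scale of the default initializer
print(round(glorot_std, 4))        # its standard deviation
print(round(0.1 / glorot_std, 1))  # how much larger stddev=0.1 is
```

For these shapes the hand-set stddev=0.1 is several times larger than the default's effective standard deviation, which is consistent with (though does not prove) an exploding-activation story.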

Recommended answer

Great question! It takes quite a bit of digging to find out!

  • As you can see, it is not documented in tf.layers.conv2d
  • If you look at the definition of the function you see that the function calls variable_scope.get_variable:

In the code:

self.kernel = vs.get_variable('kernel',
                              shape=kernel_shape,
                              initializer=self.kernel_initializer,
                              regularizer=self.kernel_regularizer,
                              trainable=True,
                              dtype=self.dtype)

Next step: what does the variable scope do when the initializer is None?

Here it says:

If initializer is None (the default), the default initializer passed in the constructor is used. If that one is None too, we use a new glorot_uniform_initializer.

So the answer is: it uses the glorot_uniform_initializer.

For completeness, the definition of this initializer:

The Glorot uniform initializer, also called the Xavier uniform initializer. It draws samples from a uniform distribution within [-limit, limit], where limit is sqrt(6 / (fan_in + fan_out)), fan_in is the number of input units in the weight tensor, and fan_out is the number of output units in the weight tensor. Reference: http://jmlr.org/proceedings/papers/v9/glorot10a/glorot10a.pdf
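As a concrete illustration for the tutorial's first conv layer (5x5 kernel, 1 input channel, 32 filters), the limit can be computed by hand. The fan-in/fan-out convention for conv kernels used here (receptive field size times channel count) is an assumption based on the standard Glorot definition:

```python
import math

# Assumed fan convention for conv kernels: kernel_h * kernel_w * channels.
receptive = 5 * 5
fan_in = receptive * 1    # 5x5 kernel over 1 input channel
fan_out = receptive * 32  # 5x5 kernel producing 32 filters

limit = math.sqrt(6.0 / (fan_in + fan_out))
print(round(limit, 4))  # every initial weight lies in [-limit, limit]
```

So under these assumptions the initial conv1 weights would all fall in roughly [-0.085, 0.085].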

This is what I found in the code and documentation. Perhaps you could verify that the initialization looks like this by running eval on the weights!
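A sketch of what such a check might look like. Since running a TensorFlow session is out of scope here, np.random.uniform stands in for the weight tensor you would actually fetch (e.g. via sess.run on the kernel variable); the two assertions are the properties a Glorot-uniform tensor should exhibit:

```python
import numpy as np

np.random.seed(0)

# Simulated Glorot-uniform draw for a [5, 5, 1, 32] conv kernel
# (in a real session you would fetch the variable instead).
fan_in, fan_out = 5 * 5 * 1, 5 * 5 * 32
limit = np.sqrt(6.0 / (fan_in + fan_out))
weights = np.random.uniform(-limit, limit, size=(5, 5, 1, 32))

# Property 1: all values bounded by the limit.
assert np.abs(weights).max() <= limit
# Property 2: sample std close to limit / sqrt(3), the std of U(-a, a).
assert abs(weights.std() - limit / np.sqrt(3)) < 0.01
print("consistent with glorot_uniform")
```

If the fetched weights instead had values far outside [-limit, limit], or a much larger spread, that would suggest a different initializer is in play.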

