What is the default kernel initializer in tf.layers.conv2d and tf.layers.dense?
Question
The official TensorFlow API docs claim that the kernel_initializer parameter of tf.layers.conv2d and tf.layers.dense defaults to None.
However, reading the layers tutorial (https://www.tensorflow.org/tutorials/layers), I noticed that this parameter is not set in the code. For example:
# Convolutional Layer #1
conv1 = tf.layers.conv2d(
    inputs=input_layer,
    filters=32,
    kernel_size=[5, 5],
    padding="same",
    activation=tf.nn.relu)
The example code from the tutorial runs without any errors, so I think the default kernel_initializer is not None. So, which initializer is used?
In other code, I did not set the kernel_initializer of the conv2d and dense layers, and everything was fine. However, when I tried to set the kernel_initializer to tf.truncated_normal_initializer(stddev=0.1, dtype=tf.float32), I got NaN errors. What is going on here? Can anyone help?
Answer
Great question! It is quite a trick to find out!
- As you can see, it is not documented in tf.layers.conv2d.
- If you look at the definition of the function, you see that it calls variable_scope.get_variable:
In the code:
self.kernel = vs.get_variable('kernel',
                              shape=kernel_shape,
                              initializer=self.kernel_initializer,
                              regularizer=self.kernel_regularizer,
                              trainable=True,
                              dtype=self.dtype)
Next question: what does the variable scope do when the initializer is None? Here it says:
If initializer is None (the default), the default initializer passed in the constructor is used. If that one is None too, we use a new glorot_uniform_initializer.
So the answer is: it uses the glorot_uniform_initializer.
For completeness, the definition of this initializer:
The Glorot uniform initializer, also called the Xavier uniform initializer. It draws samples from a uniform distribution within [-limit, limit], where limit is sqrt(6 / (fan_in + fan_out)), fan_in is the number of input units in the weight tensor, and fan_out is the number of output units in the weight tensor. Reference: http://jmlr.org/proceedings/papers/v9/glorot10a/glorot10a.pdf
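To make the formula concrete, here is a small plain-Python sketch computing the Glorot bound for the tutorial's conv1 layer above (a 5x5 kernel, 1 input channel, 32 filters). It assumes the fan computation TensorFlow uses for conv kernels, namely kernel area times channel count:

```python
import math

# For a conv2d kernel of shape [5, 5, in_channels=1, out_channels=32]:
#   fan_in  = kernel_h * kernel_w * in_channels
#   fan_out = kernel_h * kernel_w * out_channels
fan_in = 5 * 5 * 1     # 25
fan_out = 5 * 5 * 32   # 800

limit = math.sqrt(6.0 / (fan_in + fan_out))
print(limit)  # roughly 0.0853
```

So the initial conv1 weights should all lie within about ±0.085, which is noticeably smaller than the stddev=0.1 truncated normal from the question.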
This is what I found in the code and documentation. Perhaps you could verify that the initialization looks like this by running eval on the weights!
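As a sketch of that check: the bound test itself is plain Python, and the commented lines show how you might fetch the kernel in a TF 1.x session. The tensor name 'conv2d/kernel:0' is an assumption and depends on your variable scopes:

```python
import math
import random

def within_glorot_bounds(weights, fan_in, fan_out):
    """True if every weight lies inside the Glorot-uniform interval."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return all(abs(w) <= limit for w in weights)

# In a TF 1.x session you would fetch the kernel first; the tensor
# name below is hypothetical and depends on your variable scopes:
#   kernel = sess.run(tf.get_default_graph()
#                       .get_tensor_by_name('conv2d/kernel:0'))
#   print(within_glorot_bounds(kernel.ravel(), fan_in=25, fan_out=800))

# Stand-in demonstration: draw from the same uniform distribution.
limit = math.sqrt(6.0 / (25 + 800))
fake_kernel = [random.uniform(-limit, limit) for _ in range(5 * 5 * 32)]
print(within_glorot_bounds(fake_kernel, 25, 800))  # True
```

A truncated normal with stddev=0.1 would regularly produce weights outside this interval, which is consistent with the default behaving differently from the explicit initializer in the question.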