What is num_units in tensorflow BasicLSTMCell?


Question

In the MNIST LSTM examples, I don't understand what "hidden layer" means. Is it the imaginary layer formed when you represent an unrolled RNN over time?

Why is num_units = 128 in most cases?

Answer

The number of hidden units is a direct representation of the learning capacity of a neural network -- it reflects the number of learned parameters. The value 128 was likely chosen arbitrarily or empirically. You can change that value experimentally and rerun the program to see how it affects the training accuracy (you can get better than 90% test accuracy with far fewer hidden units). Using more units makes it more likely that the network will perfectly memorize the complete training set (although that takes longer, and you run the risk of over-fitting).
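As a rough illustration, here is a minimal sketch of how num_units determines the size of the learned weights. It assumes the TF 1.x tf.nn.rnn_cell API; the input size of 28 (one MNIST image row per time step) and the batch size are illustrative choices, not fixed by the example:

    # Minimal sketch (TF 1.x API): num_units sets the parameter count.
    import tensorflow as tf

    num_units = 128   # the value under discussion; try smaller values too
    input_size = 28   # assumption: one MNIST image row per time step
    batch_size = 32   # arbitrary

    cell = tf.nn.rnn_cell.BasicLSTMCell(num_units)
    inputs = tf.placeholder(tf.float32, [batch_size, input_size])
    state = cell.zero_state(batch_size, dtype=tf.float32)
    output, new_state = cell(inputs, state)  # builds the cell's variables

    for v in cell.trainable_variables:
        print(v.name, v.shape)
    # kernel: (input_size + num_units, 4 * num_units) = (156, 512)
    # bias:   (4 * num_units,)                        = (512,)

Halving num_units roughly halves the kernel, which is one way to see the capacity trade-off described above.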

The key thing to understand, which is somewhat subtle in the famous Colah's blog post (search for "each line carries an entire vector"), is that X is an array of data (nowadays usually called a tensor) -- it is not meant to be a scalar value. Where, for example, the tanh function is shown, it is meant to imply that the function is broadcast across the entire array (an implicit for loop) -- not simply performed once per time step.
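To make the broadcasting point concrete, here is a tiny NumPy sketch (the array values are arbitrary):

    # tanh over an array is elementwise -- an implicit for loop,
    # not one function call per time step.
    import numpy as np

    X = np.array([[0.5, -1.0, 2.0],
                  [1.5,  0.0, -0.3]])   # a batch of vectors, not a scalar

    vectorized = np.tanh(X)             # broadcast across the whole array

    # The explicit loops that the notation leaves implicit:
    looped = np.empty_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            looped[i, j] = np.tanh(X[i, j])

    assert np.allclose(vectorized, looped)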

As such, the hidden units represent tangible storage within the network, manifested primarily in the size of the weights array. And because an LSTM actually does have a bit of its own internal storage, separate from the learned model parameters, it has to know how many units there are -- a number that ultimately needs to agree with the size of the weights. In the simplest case, an RNN has no internal storage -- so it doesn't even need to know in advance how many "hidden units" it is being applied to.
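You can see that internal storage directly in the cell's state, whose width must match num_units (again a TF 1.x sketch; the batch size of 32 is arbitrary):

    # The LSTM's internal memory (c) and output state (h) are each
    # num_units wide -- the "tangible storage" described above.
    import tensorflow as tf

    cell = tf.nn.rnn_cell.BasicLSTMCell(num_units=128)
    state = cell.zero_state(batch_size=32, dtype=tf.float32)
    print(state.c.shape)   # (32, 128) -- cell state, one slot per unit
    print(state.h.shape)   # (32, 128) -- hidden/output state, same width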

  • A good answer to a similar question can be found here.
  • You can look at the source for BasicLSTMCell in TensorFlow to see exactly how this is used (a condensed sketch of that computation follows this list).
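For reference, here is a condensed paraphrase of what that source computes per time step (not a verbatim copy; the gate names and the default forget_bias of 1.0 follow the TensorFlow code):

    import tensorflow as tf

    def basic_lstm_step(x, c, h, kernel, bias, forget_bias=1.0):
        """One BasicLSTMCell step, paraphrased from the TF 1.x source.

        kernel: [input_size + num_units, 4 * num_units]
        bias:   [4 * num_units]
        """
        gates = tf.matmul(tf.concat([x, h], axis=1), kernel) + bias
        # i = input gate, j = candidate input, f = forget gate, o = output gate
        i, j, f, o = tf.split(gates, num_or_size_splits=4, axis=1)
        new_c = c * tf.sigmoid(f + forget_bias) + tf.sigmoid(i) * tf.tanh(j)
        new_h = tf.tanh(new_c) * tf.sigmoid(o)
        return new_h, new_c

Note how every gate is num_units wide: the four gates are packed side by side into one kernel, which is why its second dimension is 4 * num_units.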

Side note: This notation is very common in statistics and machine learning, and in other fields that process large batches of data with a common formula (3D graphics is another example). It takes a bit of getting used to for people who expect to see their for loops written out explicitly.
