Setting initial state in dynamic RNN

Question

Based on this link:

https://www.tensorflow.org/api_docs/python/tf/nn/dynamic_rnn

In the example, the "initial state" is defined in the first example but not in the second. Could anyone please explain what the purpose of the initial state is? What's the difference if I don't set it vs. if I set it? Is it only required in a single RNN cell and not in a stacked cell like in the example provided in the link?

I'm currently debugging my RNN model, as it seems to classify different questions into the same category, which is strange. I suspect that it might have to do with me not setting the initial state of the cell.

Answer

Could anyone please explain what the purpose of the initial state is?

As we know, the state matrix connects the hidden neurons of timestep 1 and timestep 2. It joins the hidden neurons of both time steps, and hence carries temporal data from the layers at previous time steps.

Providing an initially trained state matrix through the initial_state= argument gives the RNN cell a trained memory of its previous activations.
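For reference, the first example in the linked docs passes the state explicitly. A minimal TF 1.x sketch along those lines (the batch size, hidden size, and input shape below are illustrative, not taken from the question):

    import tensorflow as tf

    batch_size, hidden_size, input_dim = 32, 128, 50   # illustrative sizes
    inputs = tf.placeholder(tf.float32, [batch_size, None, input_dim])

    cell = tf.nn.rnn_cell.BasicRNNCell(hidden_size)
    # Build the initial state explicitly: an all-zero tensor of shape [batch_size, hidden_size].
    initial_state = cell.zero_state(batch_size, dtype=tf.float32)
    outputs, final_state = tf.nn.dynamic_rnn(cell, inputs, initial_state=initial_state)

Any tensor of the right shape can be fed in place of that zero tensor, which is what makes the argument useful.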

What's the difference if I don't set it vs. if I set it?

If we set the initial state to one that was produced by some other model, or by a previous run of the same model, we are restoring the memory of the RNN cell so that it does not have to start from scratch.
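One common way to do that, sketched here under the assumption of a BasicRNNCell (whose state is a single [batch, hidden] tensor; an LSTM's state is a tuple and would need slightly different handling), is to feed the final state returned by one run back in as the initial state of the next. The name sequence_batches and the sizes are hypothetical:

    import numpy as np
    import tensorflow as tf

    batch_size, hidden_size, input_dim = 32, 128, 50        # hypothetical sizes
    inputs = tf.placeholder(tf.float32, [batch_size, None, input_dim])
    state_in = tf.placeholder(tf.float32, [batch_size, hidden_size])

    cell = tf.nn.rnn_cell.BasicRNNCell(hidden_size)
    outputs, state_out = tf.nn.dynamic_rnn(cell, inputs, initial_state=state_in)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        carried_state = np.zeros((batch_size, hidden_size), dtype=np.float32)
        for batch in sequence_batches:   # hypothetical iterable of [batch, time, input_dim] arrays
            # The state computed on this chunk becomes the starting point of the next one.
            carried_state, _ = sess.run([state_out, outputs],
                                        feed_dict={inputs: batch, state_in: carried_state})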

In the TF docs, they initialize initial_state as a zero_state matrix.

If you don't set initial_state, it will start from scratch, just like the other weight matrices do.
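For reference, when initial_state is omitted, tf.nn.dynamic_rnn requires a dtype argument instead and constructs the cell's zero state internally, so the call below starts the recurrence from the same all-zero state as the explicit version above (sizes are illustrative):

    import tensorflow as tf

    batch_size, hidden_size, input_dim = 32, 128, 50   # illustrative sizes
    inputs = tf.placeholder(tf.float32, [batch_size, None, input_dim])
    cell = tf.nn.rnn_cell.BasicRNNCell(hidden_size)

    # With no initial_state, dtype must be given; dynamic_rnn then builds
    # cell.zero_state(batch_size, tf.float32) internally and starts from it.
    outputs, final_state = tf.nn.dynamic_rnn(cell, inputs, dtype=tf.float32)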

Is it only required in a single RNN cell and not in a stacked cell like in the example provided in the link?

I don't know exactly why they haven't set the initial_state in the stacked RNN example, but an initial state is present in every type of RNN, since it is what preserves the temporal features across time steps.

Perhaps the stacked RNN was the point of interest in the docs, rather than the setting of initial_state.
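For what it's worth, a stacked cell accepts an initial_state in exactly the same way; its zero_state is simply a tuple with one entry per layer. A minimal sketch (layer sizes are illustrative):

    import tensorflow as tf

    batch_size, input_dim = 32, 50                      # illustrative sizes
    inputs = tf.placeholder(tf.float32, [batch_size, None, input_dim])

    cells = [tf.nn.rnn_cell.GRUCell(n) for n in [128, 64]]
    stacked_cell = tf.nn.rnn_cell.MultiRNNCell(cells)

    # One state per layer, bundled into a tuple.
    initial_state = stacked_cell.zero_state(batch_size, dtype=tf.float32)
    outputs, states = tf.nn.dynamic_rnn(stacked_cell, inputs, initial_state=initial_state)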

Tip:

In most cases, you will not need to set the initial_state for an RNN; TensorFlow handles this efficiently for us. In the case of a seq2seq RNN, this argument can be useful.
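A typical seq2seq use, sketched here with GRU cells and hypothetical placeholder shapes, is to start the decoder from the encoder's final state rather than from zeros:

    import tensorflow as tf

    batch_size, hidden_size, input_dim = 32, 128, 50    # hypothetical sizes
    encoder_inputs = tf.placeholder(tf.float32, [batch_size, None, input_dim])
    decoder_inputs = tf.placeholder(tf.float32, [batch_size, None, input_dim])

    with tf.variable_scope("encoder"):
        encoder_cell = tf.nn.rnn_cell.GRUCell(hidden_size)
        _, encoder_final_state = tf.nn.dynamic_rnn(encoder_cell, encoder_inputs,
                                                   dtype=tf.float32)

    with tf.variable_scope("decoder"):
        decoder_cell = tf.nn.rnn_cell.GRUCell(hidden_size)
        # The encoder's summary of the source sequence becomes the decoder's starting memory.
        decoder_outputs, _ = tf.nn.dynamic_rnn(decoder_cell, decoder_inputs,
                                               initial_state=encoder_final_state)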

Your RNN may be facing some other issue. An RNN builds up its own memory and doesn't need to be primed.
