What are c_state and m_state in Tensorflow LSTM?


Question

Tensorflow r0.12's documentation for tf.nn.rnn_cell.LSTMCell describes this as the init:

tf.nn.rnn_cell.LSTMCell.__call__(inputs, state, scope=None)

where state is described as follows:

state: if state_is_tuple is False, this must be a state Tensor, 2-D, batch x state_size. If state_is_tuple is True, this must be a tuple of state Tensors, both 2-D, with column sizes c_state and m_state.

What are c_state and m_state and how do they fit into LSTMs? I cannot find a reference to them anywhere in the documentation.

Here is the link to that page in the documentation.

Answer

I've stumbled upon the same question, here's how I understand it! Minimalistic LSTM example:

import tensorflow as tf

sample_input = tf.constant([[1,2,3]], dtype=tf.float32)

LSTM_CELL_SIZE = 2

lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(LSTM_CELL_SIZE, state_is_tuple=True)
# state_is_tuple=True: the initial state is a pair of tensors, each [1, LSTM_CELL_SIZE]
state = (tf.zeros([1,LSTM_CELL_SIZE]),)*2

output, state_new = lstm_cell(sample_input, state)

init_op = tf.global_variables_initializer()

sess = tf.Session()
sess.run(init_op)
print(sess.run(output))

Notice that state_is_tuple=True, so when passing state to this cell, it needs to be in tuple form. c_state and m_state are probably "Memory State" and "Cell State", though I honestly am NOT sure, as these terms are only mentioned in the docs. In the code and papers about LSTMs, the letters h and c are commonly used to denote the "output value" and "cell state" (see http://colah.github.io/posts/2015-08-Understanding-LSTMs/). Those tensors represent the combined internal state of the cell and should be passed together. The old way to do it was to simply concatenate them; the new way is to use tuples.
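To actually look at the two parts, you can also run the returned state. A small continuation of the example above (a sketch, assuming the r0.12 behaviour where the cell returns an LSTMStateTuple with fields c and h when state_is_tuple=True):

# Continuing in the same session as above:
# state_new is an LSTMStateTuple(c=..., h=...) when state_is_tuple=True
c_val, h_val = sess.run([state_new.c, state_new.h])
print(c_val)  # cell state, shape [1, LSTM_CELL_SIZE]
print(h_val)  # hidden/output state, shape [1, LSTM_CELL_SIZE]

(For BasicLSTMCell, the output returned by the call should be the same tensor as the h part of the state, as far as I can tell.)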

Old way:

lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(LSTM_CELL_SIZE, state_is_tuple=False)
state = tf.zeros([1,LSTM_CELL_SIZE*2])

output, state_new = lstm_cell(sample_input, state)
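(Continuing that snippet: with the concatenated form, c_state and m_state are just the two column halves of the single state tensor. A quick sketch of pulling them apart, assuming the r0.12 tf.split(split_dim, num_split, value) signature:)

# Split the [1, 2*LSTM_CELL_SIZE] state into two [1, LSTM_CELL_SIZE] halves
c_state, m_state = tf.split(1, 2, state)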

New way:

lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(LSTM_CELL_SIZE, state_is_tuple=True)
state = (tf.zeros([1,LSTM_CELL_SIZE]),)*2

output, state_new = lstm_cell(sample_input, state)

So, basically all we did is change state from being one tensor of length 4 into two tensors of length 2. The content remained the same: [0,0,0,0] becomes ([0,0],[0,0]). (This is supposed to make it faster.)
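Going the other way, concatenating the tuple's two parts column-wise gives back the old single-tensor layout. A quick sketch, assuming the r0.12 tf.concat(concat_dim, values) signature (in TF 1.0+ the argument order is reversed):

c_state, m_state = state  # the (c, m) tuple from the "new way" above
concat_state = tf.concat(1, [c_state, m_state])  # shape [1, 2*LSTM_CELL_SIZE]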
