What are c_state and m_state in Tensorflow LSTM?
Question
Tensorflow r0.12's documentation for tf.nn.rnn_cell.LSTMCell describes this as the init:
tf.nn.rnn_cell.LSTMCell.__call__(inputs, state, scope=None)
where state is as follows:

state: if state_is_tuple is False, this must be a state Tensor, 2-D, batch x state_size. If state_is_tuple is True, this must be a tuple of state Tensors, both 2-D, with column sizes c_state and m_state.
What are c_state and m_state, and how do they fit into LSTMs? I cannot find a reference to them anywhere in the documentation.
Answer
I've stumbled upon the same question; here's how I understand it. A minimalistic LSTM example:

import tensorflow as tf
sample_input = tf.constant([[1,2,3]],dtype=tf.float32)
LSTM_CELL_SIZE = 2
lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(LSTM_CELL_SIZE, state_is_tuple=True)
state = (tf.zeros([1,LSTM_CELL_SIZE]),)*2
output, state_new = lstm_cell(sample_input, state)
init_op = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init_op)
print(sess.run(output))
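To make the roles of the two state tensors concrete, here is a minimal NumPy sketch of a single LSTM step, not TF's exact BasicLSTMCell implementation (the gate layout and split order are assumptions for illustration). It returns the output h together with a (c, h) state pair, mirroring the (output, state_new) pair returned by the cell above:

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def lstm_step(x, c_prev, h_prev, W, b):
    """One LSTM step. Gate order (i, g, f, o) is an illustrative assumption."""
    z = np.concatenate([x, h_prev]) @ W + b            # all four gates in one matmul
    i, g, f, o = np.split(z, 4)                        # input, candidate, forget, output
    c = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)  # new cell state ("c_state")
    h = sigmoid(o) * np.tanh(c)                        # new output ("m_state" / h)
    return h, (c, h)

LSTM_CELL_SIZE = 2
rng = np.random.default_rng(0)
W = rng.standard_normal((3 + LSTM_CELL_SIZE, 4 * LSTM_CELL_SIZE))
b = np.zeros(4 * LSTM_CELL_SIZE)

x = np.array([1.0, 2.0, 3.0])
c0 = h0 = np.zeros(LSTM_CELL_SIZE)
output, (c1, h1) = lstm_step(x, c0, h0, W, b)
print(output.shape, c1.shape)  # (2,) (2,)
```

Note that the output and the second element of the state tuple are the same h, which is exactly what BasicLSTMCell does as well.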
Notice that state_is_tuple=True, so when passing state to this cell, it needs to be in tuple form. c_state and m_state are probably "Memory State" and "Cell State", though I honestly am not sure, as these terms are only mentioned in the docs. In the code and papers about LSTMs, the letters h and c are commonly used to denote "output value" and "cell state":

http://colah.github.io/posts/2015-08-Understanding-LSTMs/

Those tensors represent the combined internal state of the cell and should be passed together. The old way to do it was to simply concatenate them; the new way is to use a tuple.

Old way:

lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(LSTM_CELL_SIZE, state_is_tuple=False)
state = tf.zeros([1,LSTM_CELL_SIZE*2])
output, state_new = lstm_cell(sample_input, state)
New way:

lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(LSTM_CELL_SIZE, state_is_tuple=True)
state = (tf.zeros([1,LSTM_CELL_SIZE]),)*2
output, state_new = lstm_cell(sample_input, state)
So basically, all we did was change state from being one tensor of length 4 into two tensors of length 2. The content remained the same: [0,0,0,0] becomes ([0,0],[0,0]). (This is supposed to make it faster.)
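The two state layouts can be converted into each other mechanically; a small NumPy sketch of that equivalence, using the shapes from the example above:

```python
import numpy as np

LSTM_CELL_SIZE = 2

# Old-style state: one tensor of shape [batch, 2 * cell_size], i.e. [0,0,0,0]
state_concat = np.zeros((1, 2 * LSTM_CELL_SIZE))

# New-style state: a tuple (c, h), each [batch, cell_size], i.e. ([0,0],[0,0])
c, h = np.split(state_concat, 2, axis=1)
state_tuple = (c, h)

print([t.shape for t in state_tuple])  # [(1, 2), (1, 2)]

# Same contents either way: concatenating the tuple recovers the old layout
assert np.array_equal(np.concatenate(state_tuple, axis=1), state_concat)
```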