What are c_state and m_state in Tensorflow LSTM?


Question


Tensorflow r0.12's documentation for tf.nn.rnn_cell.LSTMCell describes this as the init:

tf.nn.rnn_cell.LSTMCell.__call__(inputs, state, scope=None)

where state is described as follows:


state: if state_is_tuple is False, this must be a state Tensor, 2-D, batch x state_size. If state_is_tuple is True, this must be a tuple of state Tensors, both 2-D, with column sizes c_state and m_state.


What are c_state and m_state, and how do they fit into LSTMs? I cannot find a reference to them anywhere in the documentation.

Answer


I've stumbled upon the same question; here's how I understand it! A minimalistic LSTM example:

import tensorflow as tf

sample_input = tf.constant([[1, 2, 3]], dtype=tf.float32)

LSTM_CELL_SIZE = 2

lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(LSTM_CELL_SIZE, state_is_tuple=True)
state = (tf.zeros([1, LSTM_CELL_SIZE]),) * 2  # the (c_state, m_state) tuple, both zero

output, state_new = lstm_cell(sample_input, state)

init_op = tf.global_variables_initializer()

sess = tf.Session()
sess.run(init_op)
print(sess.run(output))


Notice that state_is_tuple=True, so when passing state to this cell it needs to be in tuple form. c_state and m_state are probably "Memory State" and "Cell State", though I honestly am NOT sure, as these terms are only mentioned in the docs. In code and papers about LSTMs, the letters h and c are commonly used to denote the "output value" and the "cell state": http://colah.github.io/posts/2015-08-Understanding-LSTMs/ These tensors represent the combined internal state of the cell and should be passed together. The old way to do this was to simply concatenate them; the new way is to use a tuple.
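To see why the cell needs both tensors, here is a minimal NumPy sketch of one step of the standard LSTM equations (the weight layout, gate ordering, and names here are illustrative assumptions, not TensorFlow's internals): c is updated by the forget and input gates, and h is computed from the new c through the output gate.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, c, h, W, b):
    """One LSTM step. x: input (1, input_dim); (c, h): the state tuple.
    W stacks the four gate weight matrices: shape (input_dim + n, 4 * n)."""
    n = h.shape[-1]
    z = np.concatenate([x, h], axis=-1) @ W + b
    i, g, f, o = z[:, :n], z[:, n:2*n], z[:, 2*n:3*n], z[:, 3*n:]
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # new cell state
    h_new = sigmoid(o) * np.tanh(c_new)               # new output / hidden state
    return h_new, (c_new, h_new)                      # output equals h_new

# Sizes matching the example above: input of length 3, cell size 2
rng = np.random.default_rng(0)
x = np.array([[1., 2., 3.]])
c = np.zeros((1, 2))
h = np.zeros((1, 2))
W = rng.standard_normal((5, 8)) * 0.1
b = np.zeros(8)
out, (c_new, h_new) = lstm_step(x, c, h, W, b)
print(out.shape, c_new.shape)
```

Note that the returned output and the h half of the state tuple are the same tensor, which matches how BasicLSTMCell returns both output and state_new.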

Old way:

lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(LSTM_CELL_SIZE, state_is_tuple=False)
state = tf.zeros([1,LSTM_CELL_SIZE*2])

output, state_new = lstm_cell(sample_input, state)

New way:

lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(LSTM_CELL_SIZE, state_is_tuple=True)
state = (tf.zeros([1,LSTM_CELL_SIZE]),)*2

output, state_new = lstm_cell(sample_input, state)


So, basically all we did was change state from being one tensor of length 4 into two tensors of length 2. The content remained the same: [0,0,0,0] becomes ([0,0],[0,0]). (This is supposed to make it faster.)
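The equivalence of the two layouts can be sketched in plain NumPy (an illustration, not TensorFlow code): splitting the concatenated state in half along the feature axis recovers the (c, h) tuple form.

```python
import numpy as np

LSTM_CELL_SIZE = 2

# Old form: one tensor of width 2 * cell_size
state_concat = np.zeros((1, LSTM_CELL_SIZE * 2))

# New form: a tuple (c, h), each of width cell_size
state_tuple = (np.zeros((1, LSTM_CELL_SIZE)), np.zeros((1, LSTM_CELL_SIZE)))

# Splitting the concatenated form in half recovers the tuple form
c, h = np.split(state_concat, 2, axis=1)
assert np.array_equal(c, state_tuple[0])
assert np.array_equal(h, state_tuple[1])
```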

