How to create and execute a basic LSTM network in TensorFlow?
Question

I want to create a basic LSTM network that accepts sequences of 5-dimensional vectors (for example, as N x 5 arrays) and returns the corresponding sequences of 4-dimensional hidden and cell vectors (N x 4 arrays), where N is the number of time steps. How can I do it in TensorFlow?

ADDED

So far, I got the following code working:
import numpy as np
import tensorflow as tf

num_units = 4
lstm = tf.nn.rnn_cell.LSTMCell(num_units=num_units)
timesteps = 18
num_input = 5
X = tf.placeholder("float", [None, timesteps, num_input])
x = tf.unstack(X, timesteps, 1)
outputs, states = tf.contrib.rnn.static_rnn(lstm, x, dtype=tf.float32)
sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)
x_val = np.random.normal(size=(12, 18, 5))
res = sess.run(outputs, feed_dict={X: x_val})
sess.close()
However, there are many open questions:
- Why is the number of time steps preset? Shouldn't an LSTM be able to accept sequences of arbitrary length?
- Why do we split the data by time steps (using unstack)?
- How should the "outputs" and "states" be interpreted?
Answer

If you want to accept sequences of arbitrary length, I recommend using dynamic_rnn. You can refer here to understand the difference between them. For example:
import numpy as np
import tensorflow as tf

num_units = 4
lstm = tf.nn.rnn_cell.LSTMCell(num_units=num_units)
num_input = 5
# None for both batch size and sequence length: dynamic_rnn handles variable-length input
X = tf.placeholder("float", [None, None, num_input])
outputs, states = tf.nn.dynamic_rnn(lstm, X, dtype=tf.float32)
sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)
# dynamic_rnn accepts batches with different sequence lengths (18, then 16)
x_val = np.random.normal(size=(12, 18, 5))
res = sess.run(outputs, feed_dict={X: x_val})
x_val = np.random.normal(size=(12, 16, 5))
res = sess.run(outputs, feed_dict={X: x_val})
sess.close()
dynamic_rnn requires the same length within one batch, but when you need arbitrary lengths inside a single batch, you can pad the batch data and then specify each sequence's true length with the sequence_length parameter.
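As a sketch of that padding step (pure NumPy, independent of TensorFlow; the helper name pad_batch is mine):

```python
import numpy as np

def pad_batch(sequences, num_features):
    """Pad variable-length sequences with zeros to the longest one and
    return the padded batch plus the per-sequence length vector."""
    max_len = max(len(s) for s in sequences)
    batch = np.zeros((len(sequences), max_len, num_features), dtype=np.float32)
    lengths = np.zeros(len(sequences), dtype=np.int32)
    for i, s in enumerate(sequences):
        batch[i, :len(s), :] = s
        lengths[i] = len(s)
    return batch, lengths

# Three sequences of lengths 3, 5, and 2, each with 5 features
seqs = [np.random.normal(size=(n, 5)) for n in (3, 5, 2)]
batch, lengths = pad_batch(seqs, num_features=5)
# batch now has shape (3, 5, 5) and lengths is [3, 5, 2]; you would feed
# `lengths` to the sequence_length argument of tf.nn.dynamic_rnn.
```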
static_rnn needs the data split with unstack; this comes down to their different input requirements. The input of static_rnn is a list of timesteps 2D tensors of shape [batch_size, features] (i.e. [timesteps, batch_size, features] overall). But the input of dynamic_rnn is a single 3D tensor of shape either [timesteps, batch_size, features] or [batch_size, timesteps, features], depending on whether time_major is True or False.
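A small NumPy illustration of what this splitting does to the [batch_size, timesteps, features] tensor (this mirrors the effect of tf.unstack(X, timesteps, 1), it is not TensorFlow itself):

```python
import numpy as np

batch_size, timesteps, features = 12, 18, 5
X = np.random.normal(size=(batch_size, timesteps, features))

# Equivalent of tf.unstack(X, timesteps, 1): a list of `timesteps`
# arrays, one [batch_size, features] slice per time step
x_list = [X[:, t, :] for t in range(timesteps)]
```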
In LSTMCell, the shape of states is [2, batch_size, num_units]: one [batch_size, num_units] tensor represents the cell state C and the other represents the hidden state h. You can see the pictures below. In GRUCell, the shape of states is just [batch_size, num_units]. outputs holds the output of every time step, so by default (time_major=False) its shape is [batch_size, timesteps, num_units]. From this you can easily conclude that states[1, :, :] == outputs[:, -1, :], i.e. the final hidden state equals the last output.
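To make that relation between states and outputs concrete, here is a minimal pure-NumPy LSTM unroll (a sketch of the standard LSTM equations with random weights, not TensorFlow's LSTMCell implementation):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_unroll(x, num_units, rng):
    """x: [batch, timesteps, features]. Returns outputs of shape
    [batch, timesteps, num_units] and the final (c, h) state,
    each of shape [batch, num_units]."""
    batch, timesteps, features = x.shape
    # One weight matrix per gate: input, forget, cell candidate, output
    W = rng.normal(scale=0.1, size=(4, features + num_units, num_units))
    b = np.zeros((4, num_units))
    c = np.zeros((batch, num_units))
    h = np.zeros((batch, num_units))
    outputs = []
    for t in range(timesteps):
        z = np.concatenate([x[:, t, :], h], axis=1)
        i = sigmoid(z @ W[0] + b[0])   # input gate
        f = sigmoid(z @ W[1] + b[1])   # forget gate
        g = np.tanh(z @ W[2] + b[2])   # cell candidate
        o = sigmoid(z @ W[3] + b[3])   # output gate
        c = f * c + i * g
        h = o * np.tanh(c)
        outputs.append(h)
    return np.stack(outputs, axis=1), (c, h)

rng = np.random.default_rng(0)
x = rng.normal(size=(12, 18, 5))
outputs, (c, h) = lstm_unroll(x, num_units=4, rng=rng)
# The final hidden state h is exactly the last slice of outputs
assert np.allclose(h, outputs[:, -1, :])
```

The same identity is what the answer's states[1, :, :] == outputs[:, -1, :] expresses for TensorFlow's LSTMCell, since states stacks (C, h) and outputs collects h at every step.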