Understanding Keras LSTMs


Problem Description

I am trying to reconcile my understanding of LSTMs, as laid out in this post by Christopher Olah, with their implementation in Keras. I am following the blog written by Jason Brownlee for the Keras tutorial. What I am mainly confused about is:

  1. The reshaping of the data series into [samples, time steps, features]
  2. Stateful LSTMs

Let's concentrate on the above two questions with reference to the code pasted below:

import numpy
from keras.models import Sequential
from keras.layers import Dense, LSTM

# reshape into X=t and Y=t+1
look_back = 3
trainX, trainY = create_dataset(train, look_back)
testX, testY = create_dataset(test, look_back)

# reshape input to be [samples, time steps, features]
trainX = numpy.reshape(trainX, (trainX.shape[0], look_back, 1))
testX = numpy.reshape(testX, (testX.shape[0], look_back, 1))

########################
# The IMPORTANT BIT
########################
# create and fit the LSTM network
batch_size = 1
model = Sequential()
model.add(LSTM(4, batch_input_shape=(batch_size, look_back, 1), stateful=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
for i in range(100):
    # epochs replaces the deprecated nb_epoch argument
    model.fit(trainX, trainY, epochs=1, batch_size=batch_size, verbose=2, shuffle=False)
    model.reset_states()

Note: create_dataset takes a sequence of length N and returns an array of N - look_back elements, each of which is a sequence of length look_back.
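
For reference, a minimal sketch of such a helper, consistent with that description (the tutorial's actual create_dataset may differ in details such as off-by-one indexing):

import numpy

def create_dataset(dataset, look_back=1):
    """Split a 1-D series into (window, next value) pairs."""
    dataX, dataY = [], []
    for i in range(len(dataset) - look_back):
        dataX.append(dataset[i:i + look_back])   # a look_back-long window
        dataY.append(dataset[i + look_back])     # the value that follows it
    return numpy.array(dataX), numpy.array(dataY)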

As can be seen, trainX is a 3-D array, with Time_steps and Feature being the last two dimensions respectively (3 and 1 in this particular code). With respect to the image below, does this mean that we are considering the many to one case, where the number of pink boxes is 3? Or does it literally mean the chain length is 3 (i.e. only 3 green boxes are considered)?

Does the features argument become relevant when we consider multivariate series, e.g. modelling two financial stocks simultaneously?
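
For illustration, a minimal sketch (with made-up prices) of how two stocks would be packed into the features dimension:

import numpy

look_back = 3
# hypothetical closing prices of two stocks, aligned in time
stock_a = numpy.array([1.0, 1.1, 1.2, 1.3, 1.4, 1.5])
stock_b = numpy.array([2.0, 2.1, 2.2, 2.3, 2.4, 2.5])
series = numpy.stack([stock_a, stock_b], axis=-1)           # (6, 2)
X = numpy.array([series[i:i + look_back]
                 for i in range(len(series) - look_back)])  # (3, 3, 2)
print(X.shape)  # [samples, time steps, features] with features == 2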

Do stateful LSTMs mean that we save the cell memory values between runs of batches? If that is the case, batch_size is one and the memory is reset between the training runs, so what was the point of saying it was stateful? I'm guessing this is related to the fact that the training data is not shuffled, but I'm not sure how.

Any thoughts? Image reference: http://karpathy.github.io/2015/05/21/rnn-effectiveness/

A bit confused about @van's comment about the red and green boxes being equal. So just to confirm, do the following API calls correspond to the unrolled diagrams? Especially noting the second diagram (batch_size was arbitrarily chosen):

For people who have done Udacity's deep learning course and are still confused about the time_step argument, look at the following discussion: https://discussions.udacity.com/t/rnn-lstm-use-implementation/163169

It turns out that model.add(TimeDistributed(Dense(vocab_len))) was what I was looking for. Here is an example: https://github.com/sachinruk/ShakespeareBot
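
A minimal sketch of that pattern (vocab_len and the layer sizes here are assumed, not taken from the repository): a Dense layer wrapped in TimeDistributed is applied at every time step of a return_sequences=True LSTM:

from keras.models import Sequential
from keras.layers import LSTM, Dense, TimeDistributed

vocab_len = 50  # assumed vocabulary size
model = Sequential()
model.add(LSTM(128, input_shape=(None, vocab_len), return_sequences=True))
model.add(TimeDistributed(Dense(vocab_len, activation='softmax')))
# output: (batch_size, time_steps, vocab_len) -> one prediction per step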

I have summarised most of my understanding of LSTMs here: https://www.youtube.com/watch?v=ywinX5wgdEU

Answer

First of all, you chose great tutorials (1, 2) to start.

What time-step means: Time-steps==3 in X.shape (describing the data shape) means there are three pink boxes. Since in Keras each step requires an input, the number of green boxes should usually equal the number of red boxes, unless you hack the structure.

Many to many vs. many to one: In Keras, there is a return_sequences parameter when you initialize LSTM, GRU or SimpleRNN. When return_sequences is False (the default), it is many to one as shown in the picture; its return shape is (batch_size, hidden_unit_length), which represents the last state. When return_sequences is True, it is many to many; its return shape is (batch_size, time_step, hidden_unit_length).
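
A minimal sketch (layer sizes assumed) contrasting the two return shapes:

import numpy
from keras.models import Sequential
from keras.layers import LSTM

x = numpy.random.random((2, 3, 1))  # (batch_size, time_steps, features)

many_to_one = Sequential([LSTM(4, input_shape=(3, 1))])  # return_sequences=False
many_to_many = Sequential([LSTM(4, input_shape=(3, 1), return_sequences=True)])

print(many_to_one.predict(x).shape)   # (2, 4)    -> (batch_size, hidden_unit_length)
print(many_to_many.predict(x).shape)  # (2, 3, 4) -> (batch_size, time_step, hidden_unit_length)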

Does the features argument become relevant: the features argument means "how big is your red box", i.e. what the input dimension is at each step. If you want to predict from, say, 8 kinds of market information, then you can generate your data with feature==8.
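
A sketch of what that looks like on the model side (the window length, layer sizes, and random data are assumptions for illustration):

import numpy
from keras.models import Sequential
from keras.layers import LSTM, Dense

X = numpy.random.random((32, 5, 8))  # 32 samples, 5 time steps, 8 market features
y = numpy.random.random((32, 1))

model = Sequential()
model.add(LSTM(16, input_shape=(5, 8)))  # each red box is 8-dimensional
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X, y, epochs=1, verbose=0)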

Stateful: You can look up the source code. When the state is initialized, if stateful==True, the state from the last training batch is used as the initial state; otherwise a new state is generated. I haven't turned stateful on yet. However, I disagree that batch_size can only be 1 when stateful==True.
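
A minimal sketch of a stateful setup (shapes assumed): the final state of one batch seeds the next, until reset_states() clears it:

from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
# stateful layers require a fixed batch size declared up front
model.add(LSTM(4, batch_input_shape=(2, 3, 1), stateful=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
# the cell state now persists across consecutive fit/predict calls,
# until it is explicitly cleared:
model.reset_states()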

Currently, you generate your data from data that was collected in advance. Imagine your stock information arriving as a stream: rather than waiting a day to collect a full sequence, you would like to generate input data online while training/predicting with the network. If you have 400 stocks sharing the same network, then you can set batch_size==400.
