Understanding the structure of my LSTM model


Question

I'm trying to solve the following problem:

I have time series data from a number of devices. Each device recording is of length 3000, and every data point captured has 4 measurements, so my data is shaped (number of device recordings, 3000, 4).

I'm trying to produce a vector of length 3000 where each data point is one of 3 labels (y1, y2, y3), so my desired output dim is (number of device recordings, 3000, 1). I have labeled data for training.

I'm trying to use an LSTM model for this, as 'classification as I move along time series data' seems like an RNN type of problem.

My network is set up as follows:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM

model = Sequential()
model.add(LSTM(3, input_shape=(3000, 4), return_sequences=True))
model.add(LSTM(3, activation='softmax', return_sequences=True))

model.summary()

The summary is as follows:

Model: "sequential_23"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm_29 (LSTM)               (None, 3000, 3)           96        
_________________________________________________________________
lstm_30 (LSTM)               (None, 3000, 3)           84        
=================================================================
Total params: 180
Trainable params: 180
Non-trainable params: 0
_________________________________________________________________

Everything looks fine in the output space, as I can use the result from each unit to determine which of my three categories a particular time step belongs to (I think).

But I only have 180 trainable parameters, so I'm guessing that I am doing something horribly wrong.

Can someone help me understand why I have so few trainable parameters? Am I misinterpreting how to set up this LSTM? Am I just worrying over nothing?

Does 3 units mean I only have 3 LSTM 'blocks', and that it can only look back 3 observations?

Recommended answer

From a simplistic viewpoint, you can consider an LSTM layer as an augmented Dense layer with a memory (hence enabling efficient processing of sequences). So the concept of "units" is the same for both: the number of neurons or feature units in the layer, or in other words, the number of distinctive features the layer can extract from the input.

Therefore, when you set the number of units to 3 for the LSTM layer, it more or less means that this layer can only extract 3 distinctive features from the input timesteps. Note that the number of units has nothing to do with the length of the input sequence: the LSTM layer processes the entire input sequence regardless of the number of units or the sequence length, so 3 units does not mean it can only look back 3 observations.
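To see where the 180 in the summary comes from, the parameter count of a standard LSTM layer (with bias terms, as in Keras's default configuration) can be worked out by hand. The helper name `lstm_param_count` below is hypothetical and only illustrates the arithmetic; it is a sketch of the counting rule, not part of any library.

```python
def lstm_param_count(input_dim: int, units: int) -> int:
    """Trainable parameters of a standard LSTM layer with bias terms.

    Each of the 4 gates (input, forget, cell candidate, output) has:
      - an input kernel of shape (input_dim, units)
      - a recurrent kernel of shape (units, units)
      - a bias vector of length units
    The sequence length does not appear anywhere in the formula.
    """
    per_gate = input_dim * units + units * units + units
    return 4 * per_gate


# First layer: 4 measurements per time step, 3 units.
first = lstm_param_count(input_dim=4, units=3)
# Second layer: its input is the 3 features emitted by the first layer.
second = lstm_param_count(input_dim=3, units=3)

print(first, second, first + second)  # 96 84 180
```

The count depends only on the number of input features and the number of units, which is why a sequence of length 3000 still yields just 180 parameters: the same weights are reused at every time step.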

Usually, this might be sub-optimal, though it really depends on the difficulty of the specific problem and dataset you are working on; 3 units might be enough for your problem/dataset, and you should experiment to find out. Therefore, a higher number of units is often chosen (common choices: 32, 64, 128, 256), and the classification task is delegated to a dedicated Dense layer (sometimes called a "softmax layer") at the top of the model.

For example, considering the description of your problem, a model with 3 stacked LSTM layers and a Dense classification layer at the top might look like this:

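from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

model = Sequential()
# Stacked LSTMs; return_sequences=True keeps one output per time step
model.add(LSTM(64, return_sequences=True, input_shape=(3000, 4)))
model.add(LSTM(64, return_sequences=True))
model.add(LSTM(32, return_sequences=True))
# Dense is applied at every time step, giving 3 class probabilities each
model.add(Dense(3, activation='softmax'))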
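The model above emits 3 class probabilities per time step, i.e. an output of shape (None, 3000, 3), while the question asks for one label per time step. A minimal sketch of that last decoding step, in plain Python, is shown below. The function name `decode_labels` and the sample probabilities are made up for illustration; with real Keras predictions one would more likely take an argmax over the last axis with NumPy. If the training labels are stored as integers 0, 1, 2, the `sparse_categorical_crossentropy` loss should accept them directly.

```python
def decode_labels(probabilities, label_names=("y1", "y2", "y3")):
    """Map per-time-step class probabilities to label names via argmax.

    `probabilities` is a list with one entry per time step; each entry holds
    one probability per class, as a softmax output would.
    """
    decoded = []
    for step_probs in probabilities:
        # Index of the largest probability at this time step.
        best = max(range(len(step_probs)), key=lambda i: step_probs[i])
        decoded.append(label_names[best])
    return decoded


# Hypothetical softmax output for one recording with 4 time steps
# (a real recording in the question would have 3000).
example_output = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
]

print(decode_labels(example_output))  # ['y1', 'y2', 'y3', 'y1']
```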
