LSTM with multiple input sequences and corresponding multiple output sequences
Problem description
I have an issue related to reshaping input/output data for an LSTM. While there are a lot of posts on this topic, I couldn't find a proper solution. My apologies if the mistake is quite obvious - I am rather new to the field of Deep Learning.
My issue is as follows: I ran a simulation that produced several sequences of time-dependent data, which I'd like to feed into an LSTM network. The data (very much simplified) looks as follows:
X = [[[8, 0, 18, 10],
      [9, 0, 20, 7],
      [7, 0, 17, 12]],
     [[7, 0, 31, 8],
      [5, 0, 22, 9],
      [7, 0, 17, 12]]]
That is, I have two sequences with three time steps each and 4 features per time step. Hence, the shape of X is (2, 3, 4). Correspondingly, what I would like to predict looks as follows:
y = [[[10],
      [7],
      [12]],
     [[8],
      [9],
      [12]]]
and has shape (2, 3, 1). That is, the data point [8, 0, 18, 10] is supposed to predict [10], followed by the point [9, 0, 20, 7], which should predict 7, and so on. My model then looks as follows:
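As a sanity check, the shapes above can be verified with NumPy (a minimal sketch using the exact values from the question):

```python
import numpy as np

# Input: 2 sequences, 3 time steps each, 4 features per step
X = np.array([[[8, 0, 18, 10],
               [9, 0, 20, 7],
               [7, 0, 17, 12]],
              [[7, 0, 31, 8],
               [5, 0, 22, 9],
               [7, 0, 17, 12]]])

# Target: one value per time step
y = np.array([[[10], [7], [12]],
              [[8], [9], [12]]])

print(X.shape)  # (2, 3, 4)
print(y.shape)  # (2, 3, 1)
```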
model = Sequential()
model.add(LSTM(50, input_shape=(X.shape[1], X.shape[2]), return_sequences=True))  # e.g. 50 units
model.add(Dense(50, activation='tanh'))
model.add(Dense(1, activation='tanh'))
While this seems to run without errors, my results are quite bad. Most likely, I think this is related to reshaping the output vector correctly. Also, I am not quite sure whether return_sequences has to be True or not. If I set it to False, I get the error message 'Expected dense_2 to have 2 dimensions, but got array with shape (2, 3, 1)'. I was also looking into Seq2Seq modelling, since I am trying to predict a sequence based on a sequence, but I couldn't find a workaround. Can anybody help?
Answer
You're probably trying to get big numbers out of a 'tanh' activation, which only outputs values between -1 and 1.
You can't reach 10 with tanh, for instance.
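This is easy to check numerically (a quick sketch with NumPy's tanh, which computes the same function):

```python
import numpy as np

# tanh saturates: even for large inputs the output stays strictly inside (-1, 1),
# so a target like 10 can never be produced by a tanh output layer
print(np.tanh(5.0))   # ≈ 0.99991
print(np.tanh(-5.0))  # ≈ -0.99991
```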
Either replace the final activation with 'linear' (which can output anything), or normalize your output data to lie within -1 and 1.
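Taking the first option, a sketch of the corrected model might look like this (assuming the input shape (3, 4) from the question; the 50-unit sizes are illustrative, not prescribed by the original post):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

model = Sequential()
# return_sequences=True keeps one output per time step,
# matching the target shape (batch, 3, 1)
model.add(LSTM(50, input_shape=(3, 4), return_sequences=True))
model.add(Dense(50, activation='tanh'))
model.add(Dense(1, activation='linear'))  # unbounded output, can reach 10
model.compile(loss='mse', optimizer='adam')

print(model.output_shape)  # (None, 3, 1)
```

With return_sequences=True the final Dense layer is applied to every time step, which is exactly why the (2, 3, 1) target fits; setting it to False would collapse the time dimension and trigger the dimension error mentioned in the question.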
If your data is always positive, you can try using 'softplus' instead of 'linear'; and if you opt for normalizing the data, scale it to between 0 and 1 and use 'sigmoid'.
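If you go the normalization route, a simple min-max scaling of y into [0, 1] could look like this (a sketch; in practice you would compute the min/max from the training data only and reuse them at prediction time):

```python
import numpy as np

y = np.array([[[10], [7], [12]],
              [[8], [9], [12]]], dtype=float)

# Min-max scale into [0, 1] so a 'sigmoid' output layer can match the targets
y_min, y_max = y.min(), y.max()
y_scaled = (y - y_min) / (y_max - y_min)

# Invert the scaling to map predictions back to the original range
def unscale(pred):
    return pred * (y_max - y_min) + y_min

print(y_scaled.min(), y_scaled.max())  # 0.0 1.0
```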