keras (lstm) - necessary shape when using return_sequences=True
Question
I am trying to fit an LSTM network to a sine function. As far as I understand Keras, my code currently predicts only the next value. According to this link: Many to one and many to many LSTM examples in Keras, it is a many-to-one model. However, my goal is to implement a many-to-many model. Basically, I want to be able to predict, say, 10 values for a given time. When I try to use return_sequences=True (see the line model.add(..)), which is supposed to be the solution, the following error occurs:
ValueError: Error when checking target: expected lstm_8 to have 3 dimensions, but got array with shape (689, 1)
Unfortunately, I have no idea why this happens. Is there a general rule for what the input shape needs to be when using return_sequences=True? Furthermore, what exactly would I need to change? Thanks for any help.
import pandas
import numpy as np
import matplotlib.pylab as plt
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import sklearn
from keras.models import Sequential
from keras.layers import Activation, LSTM
from keras import optimizers
from IPython.display import SVG
from keras.utils.vis_utils import model_to_dot
#generate sin function with noise
x = np.arange(0, 100, 0.1)
noise = np.random.uniform(-0.1, 0.1, size=(1000,))
Y = np.sin(x) + noise
# Perform feature scaling
scaler = MinMaxScaler()
Y = scaler.fit_transform(Y.reshape(-1, 1))
# split in train and test
train_size = int(len(Y) * 0.7)
test_size = len(Y) - train_size
train, test = Y[0:train_size,:], Y[train_size:len(Y),:]
def create_dataset(dataset, look_back=1):
    dataX, dataY = [], []
    for i in range(len(dataset)-look_back-1):
        a = dataset[i:(i+look_back), 0]
        dataX.append(a)
        dataY.append(dataset[i + look_back, 0])
    return np.array(dataX), np.array(dataY)
# reshape into X=t and Y=t+1
look_back = 10
X_train, y_train = create_dataset(train, look_back)
X_test, y_test = create_dataset(test, look_back)
# LSTM network expects the input data in form of [samples, time steps, features]
X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))
X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))
np.set_printoptions(threshold=np.inf)
# compile model
model = Sequential()
model.add(LSTM(1, input_shape=(look_back, 1)))#, return_sequences=True)) <== uncomment this
model.compile(loss='mean_squared_error', optimizer='adam')
SVG(model_to_dot(model).create(prog='dot', format='svg'))
model.fit(X_train, y_train, validation_data=(X_test, y_test),
          batch_size=10, epochs=10, verbose=2)
prediction = model.predict(X_test, batch_size=1, verbose=0)
prediction.reshape(-1)
#Transform back to original representation
Y = scaler.inverse_transform(Y)
prediction = scaler.inverse_transform(prediction)
plt.plot(np.arange(0,Y.shape[0]), Y)
plt.plot(np.arange(Y.shape[0] - X_test.shape[0] , Y.shape[0]), prediction, 'red')
plt.show()
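# compare in the original scale: y_test is still scaled, prediction is not
error = mean_squared_error(scaler.inverse_transform(y_test.reshape(-1, 1)), prediction)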
print(error)
Recommended answer
The problem is not the input but the output. The error says "Error when checking target", and the targets are y_train and y_test.
Because your LSTM returns a sequence (return_sequences=True), the output shape will be (n_batch, look_back, 1).
You can verify this with model.summary():
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
lstm_1 (LSTM)                (None, 10, 1)             12
=================================================================
Total params: 12
Trainable params: 12
Non-trainable params: 0
_________________________________________________________________
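To see the mismatch on the target side, the following is a minimal NumPy-only sketch that reruns the question's windowing on a noise-free sine series (the noise and scaling are left out here as a simplification, since they do not affect shapes) and compares the target shape with the layer's output shape from the summary.

```python
import numpy as np


def create_dataset(dataset, look_back=1):
    # Windowing from the question: one scalar target per input window
    data_x, data_y = [], []
    for i in range(len(dataset) - look_back - 1):
        data_x.append(dataset[i:(i + look_back), 0])
        data_y.append(dataset[i + look_back, 0])
    return np.array(data_x), np.array(data_y)


# Noise-free stand-in for the question's data (assumption: shapes only matter here)
series = np.sin(np.arange(0, 100, 0.1)).reshape(-1, 1)
train = series[:int(len(series) * 0.7)]

look_back = 10
X_train, y_train = create_dataset(train, look_back)
X_train = X_train.reshape(X_train.shape[0], look_back, 1)

print(X_train.shape)  # (689, 10, 1)
print(y_train.shape)  # (689,)

# Per the summary, the layer outputs (None, 10, 1), so the targets
# would need this shape instead of one scalar per sample:
required_target_shape = (X_train.shape[0], look_back, 1)
print(required_target_shape)  # (689, 10, 1)
```

The input already has three dimensions; only y_train, with one value per sample, disagrees with what the layer produces.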
You will need to change your create_dataset function so that each ground-truth target has shape (look_back, 1).
Something you might want to do:
For each sequence x in the training set, its y is the same window shifted one step ahead.
For example, let's say we want to learn something easier: a sequence in which each number is the previous number plus 1 --> 1,2,3,4,5,6,7,8,9,10.
For look_back=4:
X_train[0] = 1,2,3,4
y_train[0] will be: 2,3,4,5
X_train[1] = 2,3,4,5
y_train[1] will be: 3,4,5,6
and so on...
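One possible way to build such targets is sketched below. The function name create_sequence_dataset is hypothetical (it is not from the question), and the one-step shift is just the scheme from the example above, not the only valid many-to-many setup.

```python
import numpy as np


def create_sequence_dataset(dataset, look_back=1):
    """Build (X, y) pairs where y is the input window shifted one step ahead.

    dataset: array of shape (n_points, 1), as in the question.
    Returns X and y, both of shape (n_samples, look_back, 1).
    """
    data_x, data_y = [], []
    for i in range(len(dataset) - look_back):
        data_x.append(dataset[i:i + look_back, 0])
        data_y.append(dataset[i + 1:i + look_back + 1, 0])
    x = np.array(data_x).reshape(-1, look_back, 1)
    y = np.array(data_y).reshape(-1, look_back, 1)
    return x, y


# The toy series from the example: 1, 2, ..., 10
toy = np.arange(1, 11, dtype=float).reshape(-1, 1)
X_toy, y_toy = create_sequence_dataset(toy, look_back=4)

print(X_toy.shape, y_toy.shape)  # (6, 4, 1) (6, 4, 1)
print(X_toy[0].ravel())          # [1. 2. 3. 4.]
print(y_toy[0].ravel())          # [2. 3. 4. 5.]
```

Targets built this way are three-dimensional, matching the (None, look_back, 1) output of the LSTM layer when return_sequences=True, so they can be passed to model.fit in place of the original y_train and y_test.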