How to set up an LSTM network for multi-sequence prediction?

Problem description

I am learning how to set up an RNN-LSTM network for prediction. I have created a dataset with one input variable.

x  y
1  2.5
2  6
3  8.6
4  11.2
5  13.8
6  16.4
...

With the following Python code, I created windowed data such as [x(t-2), x(t-1), x(t)] to predict [y(t)]:

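import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import LSTM, Dense

df = pd.read_excel('dataset.xlsx')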

# split a univariate dataset into train/test sets
def split_dataset(data):
    train, test = data[:-328], data[-328:-6]
    return train, test

train, test  = split_dataset(df.values)

# scale train and test data
def scale(train, test):
    # fit scaler
    scaler = MinMaxScaler(feature_range=(0,1))
    scaler = scaler.fit(train)
    # transform train
    #train = train.reshape(train.shape[0], train.shape[1])
    train_scaled = scaler.transform(train)
    # transform test
    #test = test.reshape(test.shape[0], test.shape[1])
    test_scaled = scaler.transform(test)
    return scaler, train_scaled, test_scaled

scaler, train_scaled, test_scaled = scale(train, test)

def to_supervised(train, n_input, n_out=7):
    # flatten data
    data = train
    X, y = list(), list()
    in_start = 0
    # step over the entire history one time step at a time
    for _ in range(len(data)):
        # define the end of the input sequence
        in_end = in_start + n_input
        out_end = in_end + n_out
        # ensure we have enough data for this instance
        if out_end <= len(data):
            x_input = data[in_start:in_end, 0]
            x_input = x_input.reshape((len(x_input), 1))
            X.append(x_input)
            y.append(data[in_end:out_end, 0])
        # move along one time step
        in_start += 1
    return np.array(X), np.array(y)
train_x, train_y = to_supervised(train_scaled, n_input = 3, n_out = 1)
test_x, test_y =  to_supervised(test_scaled, n_input = 3, n_out = 1)

verbose, epochs, batch_size = 0, 20, 16
n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]


model = Sequential()
model.add(LSTM(200, return_sequences= False, input_shape = (train_x.shape[1],train_x.shape[2])))
model.add(Dense(1))
model.compile(loss = 'mse', optimizer = 'adam')
history = model.fit(train_x, train_y, epochs=epochs, verbose=verbose, validation_data = (test_x, test_y))
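
To make the windowing step concrete, here is a minimal NumPy-only sketch of what to_supervised appears to produce. The helper name make_windows and the toy series are assumptions for illustration and are not part of the question's code. Note that, as written, to_supervised takes both the inputs and the targets from column 0 of the scaled data, so each sample pairs n_input past values of that column with its next n_out values.

```python
import numpy as np

def make_windows(series, n_input, n_out):
    """Pair each run of n_input values with the n_out values that follow it.

    Hypothetical helper mirroring the sliding-window logic of to_supervised.
    """
    X, y = [], []
    for start in range(len(series) - n_input - n_out + 1):
        in_end = start + n_input
        # Inputs get a trailing feature axis: (n_input, 1)
        X.append(series[start:in_end].reshape(n_input, 1))
        # Targets are the next n_out values of the same series
        y.append(series[in_end:in_end + n_out])
    return np.array(X), np.array(y)

series = np.arange(10, dtype=float)  # toy stand-in for one scaled column
X, y = make_windows(series, n_input=3, n_out=1)

print(X.shape, y.shape)     # (7, 3, 1) (7, 1)
print(X[0].ravel(), y[0])   # [0. 1. 2.] [3.]
```

The resulting X has the (samples, timesteps, features) layout that the LSTM layer's input_shape refers to, and the second dimension of y is the number of values the final layer has to produce.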

However, I have some further questions:

Q1: What does units mean in an LSTM layer? [model.add(LSTM(units, ...))]

(I tried different values of units; the model became more accurate as units increased.)

Q2: How many layers should I use?

Q3: How can I predict multiple steps, e.g. use (x(t), x(t-1)) to predict y(t) and y(t+1)? I tried setting n_out = 2 in the to_supervised function, but applying the same method raised an error:

train_x, train_y = to_supervised(train_scaled, n_input = 3, n_out = 2)
test_x, test_y =  to_supervised(test_scaled, n_input = 3, n_out = 2)

verbose, epochs, batch_size = 0, 20, 16
n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]

model = Sequential()
model.add(LSTM(200, return_sequences= False, input_shape = (train_x.shape[1],train_x.shape[2])))
model.add(Dense(1))
model.compile(loss = 'mse', optimizer = 'adam')
history = model.fit(train_x, train_y, epochs=epochs, verbose=verbose, validation_data = (test_x, test_y))

ValueError: Error when checking target: expected dense_27 to have shape (1,) but got array with shape (2,)

Q3 (cont.): What should I add or change in the model setup?
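
The ValueError above reads as a shape mismatch: with n_out = 2 each target row has two values, while the final Dense(1) layer emits one. The NumPy sketch below only illustrates that mismatch; the dense helper and the random stand-in arrays are assumptions and do not reproduce Keras itself. In the question's model, the analogous change would presumably be to replace Dense(1) with Dense(n_outputs), using the n_outputs variable that is already computed from train_y.shape[1]; this has not been checked against the asker's data.

```python
import numpy as np

rng = np.random.default_rng(0)

n_samples, n_units = 5, 200
n_outputs = 2  # what train_y.shape[1] becomes when n_out = 2

# Stand-in for the LSTM output with return_sequences=False: one vector per sample.
lstm_out = rng.normal(size=(n_samples, n_units))
# Stand-in for the two-step targets.
train_y = rng.normal(size=(n_samples, n_outputs))

def dense(h, n_out, rng):
    """Hypothetical plain linear layer: (samples, units) -> (samples, n_out)."""
    W = rng.normal(size=(h.shape[1], n_out))
    b = np.zeros(n_out)
    return h @ W + b

pred_one = dense(lstm_out, 1, rng)          # mirrors Dense(1)
pred_two = dense(lstm_out, n_outputs, rng)  # mirrors Dense(n_outputs)

print(pred_one.shape == train_y.shape)  # False
print(pred_two.shape == train_y.shape)  # True
```

Sizing the last layer to n_outputs makes the model emit all forecast steps at once from the final LSTM state, which is one common way to do multi-step prediction; other designs exist.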

Q3 (cont.): What is return_sequences, and when should I set it to True?

Recommended answer

Re Q1: units is the number of LSTM cells (= LSTM units) in the layer. Each cell consists of several neurons internally but, in the standard case shown here, produces only one output value. The number of units therefore corresponds directly to the dimensionality of the layer's output.
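
As a rough illustration of that point, the sketch below runs a randomly initialised LSTM forward pass in plain NumPy and prints the output shapes. It is a simplified stand-in rather than the Keras implementation: the function lstm_forward, its weight layout, and the seed are assumptions made for this example. It also shows the usual meaning of return_sequences: False returns only the last hidden state, True returns the hidden state at every time step.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_forward(x, units, return_sequences=False, seed=0):
    """Run a randomly initialised LSTM over x of shape (samples, timesteps, features).

    Hypothetical helper for illustration; weights are random, not trained.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_timesteps, n_features = x.shape
    # One weight block per gate (input, forget, candidate, output), stacked side by side.
    W = rng.normal(scale=0.1, size=(n_features, 4 * units))
    U = rng.normal(scale=0.1, size=(units, 4 * units))
    b = np.zeros(4 * units)
    h = np.zeros((n_samples, units))  # hidden state: one value per unit
    c = np.zeros((n_samples, units))  # cell state: one value per unit
    outputs = []
    for t in range(n_timesteps):
        z = x[:, t, :] @ W + h @ U + b
        i, f, g, o = np.split(z, 4, axis=1)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        outputs.append(h)
    if return_sequences:
        return np.stack(outputs, axis=1)  # (samples, timesteps, units)
    return h                              # (samples, units)

x = np.zeros((7, 3, 1))  # 7 windows, 3 time steps, 1 feature

print(lstm_forward(x, units=200).shape)                         # (7, 200)
print(lstm_forward(x, units=200, return_sequences=True).shape)  # (7, 3, 200)
```

With units=200 the layer hands a 200-dimensional vector per sample to whatever follows it, which is why a final Dense layer is still needed to map that vector to the one or two values being forecast. The per-time-step output is typically what a further stacked LSTM layer would consume.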
