Shaping data for LSTM, and feeding output of dense layers to LSTM


Problem Description

I'm trying to figure out the proper syntax for the model I'm trying to fit. It's a time-series prediction problem, and I want to use a few dense layers to improve the representation of the time series before I feed it to the LSTM.

Here's a dummy series that I'm working with:

import pandas as pd
import matplotlib.pyplot as plt
plt.style.use('seaborn-whitegrid')
import numpy as np
import keras as K
import tensorflow as tf

# Toy series: y is a nonlinear function of x and its first two lags
d = pd.DataFrame(data = {"x": np.linspace(0, 100, 1000)})
d['l1_x'] = d.x.shift(1)   # lag 1
d['l2_x'] = d.x.shift(2)   # lag 2
d.fillna(0, inplace = True)
d["y"] = np.sin(.1*d.x*np.sin(d.l1_x))*np.sin(d.l2_x)
plt.plot(d.x, d.y)

First, I'll fit an LSTM with no dense layers preceding it. This requires that I reshape the data:

# 1000 samples, 3 "time steps" (the current value plus two lags), 1 feature per step
X = d[["x", "l1_x", "l2_x"]].values.reshape(len(d), 3, 1)
y = d.y.values

Is this correct?

The tutorials make it seem like a single time series should have 1 in the first dimension, followed by the number of time steps (1000), followed by the number of covariates (3). But when I do that, the model doesn't compile.
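For concreteness, here is a small sketch contrasting the two layouts (using the d frame built above); note that with the (1, 1000, 3) layout there is only one sample but y still has 1000 targets, which is one reason a fit set up that way fails:

X_flat = d[["x", "l1_x", "l2_x"]].values   # (1000, 3)

# Layout A: 1000 samples, each a 3-step "sequence" with 1 feature -- matches y of length 1000
X_a = X_flat.reshape(len(d), 3, 1)         # (1000, 3, 1)

# Layout B: one sample, 1000 time steps, 3 covariates -- what the tutorials describe
X_b = X_flat.reshape(1, len(d), 3)         # (1, 1000, 3)

print(X_a.shape, X_b.shape)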

Here I compile and train the model:

model = K.Sequential()
# stateful=True with batch_size=1: cell state is carried over between consecutive batches
model.add(K.layers.LSTM(10, input_shape=(X.shape[1], X.shape[2]), batch_size = 1, stateful=True))
model.add(K.layers.Dense(1))
callbacks = [K.callbacks.EarlyStopping(monitor='loss', min_delta=0, patience=5, verbose=1, mode='auto', baseline=None, restore_best_weights=True)]
model.compile(loss='mean_squared_error', optimizer='rmsprop')

# shuffle=False preserves time ordering, which a stateful LSTM relies on
model.fit(X, y, epochs=50, batch_size=1, verbose=1, shuffle=False, callbacks = callbacks)
model.reset_states()

yhat = model.predict(X, 1)
plt.clf()
plt.plot(d.x, d.y)
plt.plot(d.x, yhat)

How come I can't get the model to overfit?? Is it because I've reshaped my data wrong? It doesn't really get any more over-fit when I use more nodes in the LSTM.

(I'm also not clear on what it means to be "stateful". Neural networks are just nonlinear models. Which parameters are the "states" referring to and why would one want to reset them?)

How do I interpose dense layers between the input and the LSTM? Finally, I'd like to add a bunch of dense layers, to basically do a basis expansion on x before it gets to the LSTM. But an LSTM wants a 3D array and a dense layer spits out a matrix. What do I do here? This doesn't work:

model = K.Sequential()
model.add(K.layers.Dense(10, activation = "relu", input_dim = 3))
model.add(K.layers.LSTM(3, input_shape=(10, X.shape[2]), batch_size = 1, stateful=True))
model.add(K.layers.Dense(1))

ValueError: Input 0 is incompatible with layer lstm_2: expected ndim=3, found ndim=2

Recommended Answer

For the first question, I am doing the same thing and I didn't get any error, so please share your error message.

Note: I will give you an example using the functional API, which gives a little more freedom (personal opinion).

from keras.layers import Dense, Flatten, LSTM, Activation
from keras.layers import Dropout, RepeatVector, TimeDistributed
from keras import Input, Model

seq_length = 15
input_dims = 10
output_dims = 8
n_hidden = 10
model1_inputs = Input(shape=(seq_length,input_dims,))

net1 = LSTM(n_hidden, return_sequences=True)(model1_inputs)
net1 = LSTM(n_hidden, return_sequences=False)(net1)
net1 = Dense(output_dims, activation='relu')(net1)
model1_outputs = net1

model1 = Model(inputs=model1_inputs, outputs = model1_outputs, name='model1')

## Fit the model
model1.summary()


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_11 (InputLayer)        (None, 15, 10)            0         
_________________________________________________________________
lstm_8 (LSTM)                (None, 15, 10)            840       
_________________________________________________________________
lstm_9 (LSTM)                (None, 10)                840       
_________________________________________________________________
dense_9 (Dense)              (None, 8)                 88        
_________________________________________________________________
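The snippet above stops at model1.summary(). For completeness, here is a hypothetical way to compile and fit it, using random dummy data whose shapes follow the constants above (X_dummy and y_dummy are illustrative names, not from the original answer):

import numpy as np

X_dummy = np.random.rand(32, seq_length, input_dims)   # 32 fake samples
y_dummy = np.random.rand(32, output_dims)

model1.compile(loss='mean_squared_error', optimizer='rmsprop')
model1.fit(X_dummy, y_dummy, epochs=2, batch_size=8)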

For your second problem, there are two methods:

  1. If you are sending data without a sequence dimension, i.e. with dims (batch, input_dims), then you can use RepeatVector, which repeats the same vector n_steps times; those repeats serve as the rolling time steps of the LSTM.

seq_length = 15
input_dims = 16
output_dims = 8
n_hidden = 20
lstm_dims = 10
model1_inputs = Input(shape=(input_dims,))

net1 = Dense(n_hidden)(model1_inputs)
net1 = Dense(n_hidden)(net1)

net1 = RepeatVector(3)(net1)
net1 = LSTM(lstm_dims, return_sequences=True)(net1)
net1 = LSTM(lstm_dims, return_sequences=False)(net1)
net1 = Dense(output_dims, activation='relu')(net1)
model1_outputs = net1

model1 = Model(inputs=model1_inputs, outputs = model1_outputs, name='model1')

## Fit the model
model1.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_13 (InputLayer)        (None, 16)                0         
_________________________________________________________________
dense_13 (Dense)             (None, 20)                340       
_________________________________________________________________
dense_14 (Dense)             (None, 20)                420       
_________________________________________________________________
repeat_vector_2 (RepeatVecto (None, 3, 20)             0         
_________________________________________________________________
lstm_14 (LSTM)               (None, 3, 10)             1240      
_________________________________________________________________
lstm_15 (LSTM)               (None, 10)                840       
_________________________________________________________________
dense_15 (Dense)             (None, 8)                 88        
=================================================================
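What RepeatVector does can also be seen in isolation with a toy model (a minimal sketch, separate from the model above): it simply tiles its input vector along a new time axis.

import numpy as np
from keras import Input, Model
from keras.layers import RepeatVector

toy_in = Input(shape=(2,))
toy_out = RepeatVector(3)(toy_in)    # output shape: (None, 3, 2)
toy = Model(toy_in, toy_out)

print(toy.predict(np.array([[1., 2.]])))
# [[[1. 2.]
#   [1. 2.]
#   [1. 2.]]]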

  2. If you are sending a sequence of dims (seq_len, input_dims), then you can use TimeDistributed, which applies a dense layer with the same weights at every step of the sequence.

seq_length = 15
input_dims = 10
output_dims = 8
n_hidden = 10
lstm_dims = 6
model1_inputs = Input(shape=(seq_length,input_dims,))

net1 = TimeDistributed(Dense(n_hidden))(model1_inputs)
net1 = LSTM(output_dims, return_sequences=True)(net1)
net1 = LSTM(output_dims, return_sequences=False)(net1)
net1 = Dense(output_dims, activation='relu')(net1)
model1_outputs = net1

model1 = Model(inputs=model1_inputs, outputs = model1_outputs, name='model1')

## Fit the model
model1.summary()


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_17 (InputLayer)        (None, 15, 10)            0         
_________________________________________________________________
time_distributed_3 (TimeDist (None, 15, 10)            110       
_________________________________________________________________
lstm_18 (LSTM)               (None, 15, 8)             608       
_________________________________________________________________
lstm_19 (LSTM)               (None, 8)                 544       
_________________________________________________________________
dense_19 (Dense)             (None, 8)                 72        
=================================================================

Note: I stacked two LSTM layers. In the first layer I used return_sequences=True, which returns the output at each time step, and that sequence is consumed by the second layer, which returns output only at the last time step.
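A minimal sketch of that difference, with toy shapes assumed here for illustration:

from keras import Input, Model
from keras.layers import LSTM

x_in = Input(shape=(15, 10))
seq = LSTM(4, return_sequences=True)(x_in)     # one output per time step
last = LSTM(4, return_sequences=False)(x_in)   # output at the last time step only

print(Model(x_in, seq).output_shape)    # (None, 15, 4)
print(Model(x_in, last).output_shape)   # (None, 4)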
