Python-使用带有Keras的LSTM递归神经网络进行模式预测 [英] Python - Pattern prediction using LSTM Recurrent Neural Networks with Keras

查看:86
本文介绍了Python-使用带有Keras的LSTM递归神经网络进行模式预测的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在处理来自格式为

I am dealing with pattern prediction from a formatted CSV dataset with three columns (time_stamp, X and Y - where Y is the actual value). I wanted to predict the value of X from Y based on time index from past values and here is how I approached the problem with LSTM Recurrent Neural Networks in Python with Keras.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import LSTM, Dense
from keras.preprocessing.sequence import TimeseriesGenerator
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split

np.random.seed(7)

df = pd.read_csv('test32_C_data.csv')
n_features=100

values = df.values

for i in range(0,n_features):
    df['X_t'+str(i)] = df['X'].shift(i) 
    df['X_tp'+str(i)] = (df['X'].shift(i) - df['X'].shift(i+1))/(df['X'].shift(i))

print(df)
pd.set_option('use_inf_as_null', True)

#df.replace([np.inf, -np.inf], np.nan).dropna(axis=1)
df.dropna(inplace=True)

X = df.drop('Y', axis=1)
y = df['Y']


X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.40)

X_train = X_train.drop('time', axis=1)
X_train = X_train.drop('X_t1', axis=1)
X_train = X_train.drop('X_t2', axis=1)
X_test = X_test.drop('time', axis=1)
X_test = X_test.drop('X_t1', axis=1)
X_test = X_test.drop('X_t2', axis=1)


sc = MinMaxScaler()

X_train = np.array(df['X'])
X_train = X_train.reshape(-1, 1)
X_train = sc.fit_transform(X_train)

y_train = np.array(df['Y'])
y_train=y_train.reshape(-1, 1)
y_train = sc.fit_transform(y_train)


model_data = TimeseriesGenerator(X_train, y_train, 100, batch_size = 10) 

# Initialising the RNN
model = Sequential() 

# Adding the input layerand the LSTM layer
model.add(LSTM(4, input_shape=(None, 1)))

# Adding the output layer
model.add(Dense(1)) 

# Compiling the RNN
model.compile(loss='mse', optimizer='rmsprop') 

# Fitting the RNN to the Training set
model.fit_generator(model_data)

# evaluate the model
#scores = model.evaluate(X_train, y_train)
#print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))


# Getting the predicted values
predicted = X_test
predicted = sc.transform(predicted)
predicted = predicted.reshape((-1, 1, 1))
y_pred = model.predict(predicted)
y_pred = sc.inverse_transform(y_pred)

当我这样绘制预测时

plt.figure
plt.plot(y_test, color = 'red', label = 'Actual')
plt.plot(y_pred, color = 'blue', label = 'Predicted')
plt.title('Prediction')
plt.xlabel('Time [INdex]')
plt.ylabel('Values')
plt.legend()
plt.show()

以下是我得到的情节.

The following plot is what I get.

但是,如果我们分别绘制每列,

However, if we plot each column separately,

groups = [1, 2]
i = 1
# plot each column
plt.figure()
for group in groups:
    plt.subplot(len(groups), 1, i)
    plt.plot(values[:, group])
    plt.title(df.columns[group], y=0.5, loc='right')
    i += 1
plt.show()

以下是我们得到的图.

我们如何提高预测准确性?

How can we improve the prediction accuracy?

推荐答案

我会让你从这里拿走它,但这至少应该可以帮助您.

I'll let you take it from here, but this should at least get you going.

注意:我看到您所预测的变量有些混乱.为此,我正在预测通常是标准的"Y".如果那是不正确的,只需在放入create_sequences函数之前交换订单即可.该代码仍然可以正常工作,无论如何,这只是一个起点,您需要多花一点时间才能获得一个性能良好的网络.

Note: I see there is some confusion around what variable you are predicting. For this I was predicting the 'Y' which is usually standard. If that's incorrect, just swap the order prior to putting into the create_sequences function. The code should still work, and this is just a starting point for you anyways, you'll need to play around with it quite a bit more to get a good performing network.

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense
from sklearn.preprocessing import MinMaxScaler
import pandas as pd

np.random.seed(7)

df = pd.read_csv('test32_C_data.csv')
n_features = 100


def create_sequences(data, window=14, step=1, prediction_distance=14):
    x = []
    y = []

    for i in range(0, len(data) - window - prediction_distance, step):
        x.append(data[i:i + window])
        y.append(data[i + window + prediction_distance][0])

    x, y = np.asarray(x), np.asarray(y)

    return x, y


# Scaling prior to splitting
scaler = MinMaxScaler(feature_range=(0.01, 0.99))
scaled_data = scaler.fit_transform(df.loc[:, ["Y", "X"]].values)

# Build sequences
x_sequence, y_sequence = create_sequences(scaled_data)

# Create test/train split
test_len = int(len(x_sequence) * 0.15)
valid_len = int(len(x_sequence) * 0.15)
train_end = len(x_sequence) - (test_len + valid_len)
x_train, y_train = x_sequence[:train_end], y_sequence[:train_end]
x_valid, y_valid = x_sequence[train_end:train_end + valid_len], y_sequence[train_end:train_end + valid_len]
x_test, y_test = x_sequence[train_end + valid_len:], y_sequence[train_end + valid_len:]

# Initialising the RNN
model = Sequential()

# Adding the input layerand the LSTM layer
model.add(LSTM(4, input_shape=(14, 2)))

# Adding the output layer
model.add(Dense(1))

# Compiling the RNN
model.compile(loss='mse', optimizer='rmsprop')

# Fitting the RNN to the Training set
model.fit(x_train, y_train, epochs=5)

# Getting the predicted values
y_pred = model.predict(x_test)

# Plot results
pd.DataFrame({"y_test": y_test, "y_pred": np.squeeze(y_pred)}).plot()

差异:

  • 使用自定义序列生成器,14步窗口,14步前瞻"进行预测

  • Using custom sequence generator, 14 step window, 14 step "look-forward" for prediction

自定义训练/测试/有效分组,您可以在训练中使用验证集来提前停止

Custom train/test/valid split, you can use the validation set when training for early stopping

将输入形状更改为包括2个要素,其窗口为14 input_shape =(14,2)

Changed the input shape to include 2 features with a window of 14 input_shape=(14,2)

5个纪元

这篇关于Python-使用带有Keras的LSTM递归神经网络进行模式预测的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆