Training a multi-variate multi-series regression problem with stateful LSTMs in Keras

Question

I have time series from P processes, each of varying length but all having 5 variables (dimensions). I am trying to predict the estimated lifetime of a test process. I am approaching this problem with a stateful LSTM in Keras, but I am not sure my training process is correct.

I divide each sequence into batches of length 30. Each sequence is then of shape (s_i, 30, 5), where s_i is different for each of the P sequences (s_i = len(P_i) // 30). I append all sequences into my training data, which has the shape (N, 30, 5), where N = s_1 + s_2 + ... + s_P.
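
For reference, a minimal sketch of how such windowing could look, assuming each process is a NumPy array of shape (len_i, 5) held in a hypothetical list called processes:

import numpy as np

WINDOW = 30  # timesteps per window

def make_windows(seq, window=WINDOW):
    # slice one (len_i, 5) process into (s_i, window, 5) non-overlapping windows
    s_i = len(seq) // window                      # s_i = len(P_i) // 30
    return seq[:s_i * window].reshape(s_i, window, seq.shape[1])

# processes: hypothetical list of P arrays, each of shape (len_i, 5)
per_seq = [make_windows(p) for p in processes]   # one (s_i, 30, 5) array per process
train_X = np.concatenate(per_seq, axis=0)        # (N, 30, 5), N = s_1 + ... + s_P

Note that the code below indexes train_X[seq][batch], i.e. per sequence, which corresponds to the per_seq list here rather than to the flat (N, 30, 5) array.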

# design network
from keras.models import Sequential
from keras.layers import LSTM, Dense
from keras.optimizers import Adam

model = Sequential()
# batch_input_shape=(1, 30, 5): one window of 30 timesteps with 5 features,
# fed one at a time so the stateful LSTM can carry state across batches
model.add(LSTM(32, batch_input_shape=(1, train_X[0].shape[1], train_X[0].shape[2]), stateful=True, return_sequences=True))
model.add(LSTM(16, return_sequences=False))
model.add(Dense(1, activation="linear"))
model.compile(loss='mse', optimizer=Adam(lr=0.0005), metrics=['mse'])

model.summary() looks like:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm_1 (LSTM)                (1, 30, 32)               4864      
_________________________________________________________________
lstm_2 (LSTM)                (1, 16)                   3136      
_________________________________________________________________
dense_1 (Dense)              (1, 1)                    17        
=================================================================

Training loop:

for epoch in range(epochs):
    mean_tr_acc = []
    mean_tr_loss = []

    # iterate over the P sequences (24 in this case)
    for seq in range(train_X.shape[0]):

        # train on the whole sequence, batch by batch (s_i = 68 windows here)
        for batch in range(train_X[seq].shape[0]):
            # input is one (1, 30, 5) window; the target is the last value of the
            # corresponding label window (b_acc is the 'mse' metric from compile)
            b_loss, b_acc = model.train_on_batch(np.expand_dims(train_X[seq][batch], axis=0), train_Y[seq][batch][-1])

            mean_tr_acc.append(b_acc)
            mean_tr_loss.append(b_loss)

        # reset LSTM internal states after training on each complete sequence
        model.reset_states()

The problem with the loss graph was that I was dividing the values in my custom loss, making them too small. If I remove the division and plot the loss on a logarithmic scale, it looks alright.
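
For what it's worth, a quick way to plot the collected per-batch losses on a logarithmic scale, assuming mean_tr_loss is the list filled in the training loop above:

import matplotlib.pyplot as plt

plt.plot(mean_tr_loss)
plt.yscale('log')          # log y-axis makes very small loss values visible
plt.xlabel('batch')
plt.ylabel('loss (log scale)')
plt.show()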

Once the training is done, I try to predict. I show the model 30 time-samples of a new process, so the input shape is the same as the batch_input_shape during training, i.e. (1, 30, 5). The predictions I get for different batches of the same sequence are all the same.
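
For completeness, a sketch of how prediction could mirror the training setup, feeding the test process batch by batch and resetting states between sequences; test_X here is a hypothetical (s_test, 30, 5) array of windows from the new process:

import numpy as np

model.reset_states()                      # start from a clean state

preds = []
for batch in range(test_X.shape[0]):
    # same (1, 30, 5) shape as batch_input_shape during training
    p = model.predict_on_batch(np.expand_dims(test_X[batch], axis=0))
    preds.append(float(p))

model.reset_states()                      # reset before the next test sequence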

I am almost sure I am doing something wrong in the training process. If anyone could help me out, I would be grateful. Thanks.

So the model predicts exactly the same results only if it has been trained for more than 20 epochs. Otherwise the predicted values are very close, but still a bit different. I guess this is due to some kind of overfitting. Help!!!

The loss for 25 epochs looks like this:

Answer

Usually when results are the same it's because your data isn't normalized. I suggest you center your data with mean = 0 and std = 1 with a simple normal transform (i.e. (data - mean) / std). Try transforming it like this before training and testing. Differences in how data is normalized between training and testing sets can also cause problems, and may be the cause of your discrepancy in train vs. test loss. Always use the same normalization technique for all your data.
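
A sketch of that transform, assuming the flat (N, 30, 5) train_X from the question; the key point is that the mean and std are computed on the training data only and then reused, unchanged, for the test data:

import numpy as np

# per-variable statistics, computed on the training data only
mean = train_X.mean(axis=(0, 1), keepdims=True)   # shape (1, 1, 5)
std = train_X.std(axis=(0, 1), keepdims=True)

train_X = (train_X - mean) / std
test_X = (test_X - mean) / std                    # same transform, same statistics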
