Multivariate LSTM with missing values

Question

I am working on a Time Series Forecasting problem using LSTM. The input contains several features, so I am using a Multivariate LSTM. The problem is that there are some missing values, for example:

     Feature 1   Feature 2   ...   Feature n
1    2           4                 nan
2    5           8                 10
3    8           8                 5
4    nan         7                 7
5    6           nan               12

Interpolating the missing values can introduce bias in the results, because a single feature sometimes has many consecutive timestamps with missing values. Instead, I would like to know whether there is a way to let the LSTM learn with the missing values, for example by using a masking layer or something similar. Can someone explain what the best approach to this problem would be? I am using TensorFlow and Keras.

Recommended answer

As suggested by François Chollet (creator of Keras) in his book, one way to handle missing values is to replace them with zero:

In general, with neural networks, it’s safe to input missing values as 0, with the condition that 0 isn’t already a meaningful value. The network will learn from exposure to the data that the value 0 means missing data and will start ignoring the value. Note that if you’re expecting missing values in the test data, but the network was trained on data without any missing values, the network won’t have learned to ignore missing values! In this situation, you should artificially generate training samples with missing entries: copy some training samples several times, and drop some of the features that you expect are likely to be missing in the test data.
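The augmentation step described in the quote could look roughly like the following. This is a minimal NumPy sketch, not code from the book: the function name `add_synthetic_missing`, the `drop_prob`, `copies` and `seed` parameters, and the choice to drop entries uniformly at random are all assumptions made for illustration. In practice you would drop the features you actually expect to be missing at test time.

```python
import numpy as np

def add_synthetic_missing(x, y, drop_prob=0.1, fill_value=0.0, copies=2, seed=0):
    """Append `copies` corrupted copies of x in which random entries are
    replaced by fill_value; the targets y are repeated unchanged."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    x_parts, y_parts = [x], [y]
    for _ in range(copies):
        corrupted = x.copy()
        drop = rng.random(x.shape) < drop_prob  # True where a value is "lost"
        corrupted[drop] = fill_value
        x_parts.append(corrupted)
        y_parts.append(y)
    return np.concatenate(x_parts, axis=0), np.concatenate(y_parts, axis=0)

# Toy data (hypothetical): 4 samples, 5 timesteps, 3 features, scaled to [1, 2]
# so that 0 is not already a meaningful value.
x_train = np.random.default_rng(1).uniform(1.0, 2.0, size=(4, 5, 3))
y_train = np.arange(4, dtype=float)

x_aug, y_aug = add_synthetic_missing(x_train, y_train, drop_prob=0.2)
```

The original samples are kept alongside the corrupted copies, so the network sees both complete and incomplete versions of the same sequence.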

So you can assign zero to NaN elements, provided that zero is not otherwise used in your data. For example, you can normalize the data to a range such as [1,2] and then assign zero to the NaN elements; alternatively, you can normalize all values to the range [0,1] and then use -1 instead of zero to replace the NaN elements.
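As an illustration of the second variant (scale to [0,1], then fill with -1), here is a minimal NumPy sketch applied to the table from the question. The helper name `scale_and_fill` is hypothetical, and the code assumes a 2D array of shape (timesteps, features); for a 3D batch the minimum and maximum would need to be taken over both the sample and timestep axes, ideally on the training split only.

```python
import numpy as np

def scale_and_fill(x, fill_value=-1.0):
    """Min-max scale each feature column to [0, 1] while ignoring NaNs,
    then replace the NaNs with fill_value (which lies outside [0, 1])."""
    x = np.asarray(x, dtype=float)
    col_min = np.nanmin(x, axis=0)
    col_max = np.nanmax(x, axis=0)
    # Guard against constant columns to avoid dividing by zero.
    span = np.where(col_max > col_min, col_max - col_min, 1.0)
    scaled = (x - col_min) / span  # NaNs stay NaN here
    return np.where(np.isnan(scaled), fill_value, scaled)

# Rows are timesteps, columns are features, as in the question.
data = np.array([
    [2.0,    4.0,    np.nan],
    [5.0,    8.0,    10.0],
    [8.0,    8.0,    5.0],
    [np.nan, 7.0,    7.0],
    [6.0,    np.nan, 12.0],
])

filled = scale_and_fill(data)
```

Because every observed value ends up in [0,1], the fill value -1 can only ever mean "missing".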

Another alternative is to use a Masking layer in Keras. You give it a mask value, say 0, and it skips any timestep (i.e. row) in which all features are equal to the mask value. However, all the following layers must support masking, and you also need to pre-process your data so that the mask value is assigned to every feature of any timestep that contains one or more NaN features. Example from the Keras documentation:

Consider a Numpy data array x of shape (samples, timesteps,features), to be fed to an LSTM layer. You want to mask timestep #3 and #5 because you lack data for these timesteps. You can:

  • Set x[:, 3, :] = 0. and x[:, 5, :] = 0.

  • Insert a Masking layer with mask_value=0. before the LSTM layer:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Masking

model = Sequential()
model.add(Masking(mask_value=0., input_shape=(timesteps, features)))
model.add(LSTM(32))
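The pre-processing step mentioned above (assigning the mask value to every feature of a timestep that contains at least one NaN) is not part of the Keras example. A minimal NumPy sketch of it follows; the helper name `mask_incomplete_timesteps` is hypothetical, and the last line only mimics in NumPy the rule by which a Masking layer decides which timesteps to keep.

```python
import numpy as np

MASK_VALUE = 0.0

def mask_incomplete_timesteps(x, mask_value=MASK_VALUE):
    """Set all features of a timestep to mask_value if any feature is NaN.
    Expects x of shape (samples, timesteps, features); returns a copy."""
    x = np.array(x, dtype=float)            # np.array copies by default
    incomplete = np.isnan(x).any(axis=-1)   # shape (samples, timesteps)
    x[incomplete] = mask_value              # overwrite the whole feature row
    return x

# One sample built from the table in the question: 5 timesteps, 3 features.
x = np.array([[
    [2.0,    4.0,    np.nan],
    [5.0,    8.0,    10.0],
    [8.0,    8.0,    5.0],
    [np.nan, 7.0,    7.0],
    [6.0,    np.nan, 12.0],
]])

x_masked = mask_incomplete_timesteps(x)

# A timestep is kept when at least one feature differs from the mask value.
kept = np.any(x_masked != MASK_VALUE, axis=-1)
```

In this example only timesteps 2 and 3 survive, which shows the cost of this approach: a single missing feature discards the whole timestep, including the values that were observed.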


Update (May 2021): According to an updated suggestion from François Chollet, it might be better to use a more meaningful or informative value (instead of zero) for masking missing values. This value could be computed (e.g. mean, median, etc.) or predicted from the data itself.
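One way to compute such a value is a per-feature median. The sketch below is an assumption-laden illustration rather than the method from the update: the helper names `fit_feature_medians` and `impute` are hypothetical, and the medians are meant to be fitted on the training split only and then reused for validation and test data.

```python
import numpy as np

def fit_feature_medians(x_train):
    """Per-feature median over all samples and timesteps, ignoring NaNs.
    Expects x_train of shape (samples, timesteps, features)."""
    flat = x_train.reshape(-1, x_train.shape[-1])
    return np.nanmedian(flat, axis=0)

def impute(x, fill_values):
    """Replace NaNs with the per-feature fill value (broadcast over the
    sample and timestep axes)."""
    return np.where(np.isnan(x), fill_values, x)

# One training sample built from the table in the question.
x_train = np.array([[
    [2.0,    4.0,    np.nan],
    [5.0,    8.0,    10.0],
    [8.0,    8.0,    5.0],
    [np.nan, 7.0,    7.0],
    [6.0,    np.nan, 12.0],
]])

medians = fit_feature_medians(x_train)
x_imputed = impute(x_train, medians)
```

A mean can be substituted by swapping `np.nanmedian` for `np.nanmean`; the median is less sensitive to outliers.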
