Missing values for the data to be used in a Neural Network model for prediction

Problem description


I currently have a lot of data that will be used to train a prediction neural network (gigabytes of weather data for major airports around the US). I have data for almost every day, but some airports have missing values in their data. For example, an airport might not have existed before 1995, so I have no data before then for that specific location. Also, some are missing whole years (one might span from 1990 to 2011, missing 2003).

What can I do to train with these missing values without misguiding my neural network? I thought about filling the empty data with 0s or -1s, but I feel like this would cause the network to predict these values for some outputs.

Solution

I use a lot of NNs for forecasting, and I can tell you that you can simply leave those "holes" in your data. NNs learn relationships within the observed data, so if a specific period is missing, it doesn't matter. If you set the empty data to a constant value, you will give your training algorithm misleading information. NNs don't need "continuous" data; in fact, it's good practice to shuffle the data set before training so that the backpropagation phase runs on non-contiguous samples.
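To make that concrete, here is a minimal sketch of the idea in plain Python. The record layout, airport codes, and helper names (`drop_incomplete`, `shuffled`) are made up for illustration and are not from the question; a real pipeline would probably do the same thing with whatever data library is already in use.

```python
import random

# Hypothetical records: (airport, year, temperature, pressure).
# None marks a value that was never recorded.
records = [
    ("AAA", 2002, 11.5, 1012.0),
    ("AAA", 2003, None, None),    # whole year missing
    ("AAA", 2004, 12.1, 1009.5),
    ("BBB", 1994, None, None),    # airport did not exist yet
    ("BBB", 1995, 18.3, 1015.2),
    ("BBB", 1996, 17.9, None),    # a single field missing
]

def drop_incomplete(rows):
    """Keep only rows in which every field is present (no 0 / -1 fill-in)."""
    return [row for row in rows if all(value is not None for value in row)]

def shuffled(rows, seed=0):
    """Return a shuffled copy so consecutive training samples are not contiguous in time."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    return rows

# Only complete rows reach training, and they arrive in shuffled order.
train_rows = shuffled(drop_incomplete(records))
```

The incomplete rows are simply left out instead of being replaced by a constant, so the network never sees a made-up value, and the shuffle means each training step sees samples from different airports and years.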
