Missing values in the data to be used in a neural network model for prediction


Question

I currently have a lot of data that will be used to train a prediction neural network (gigabytes of weather data for major airports around the US). I have data for almost every day, but some airports have missing values in their data. For example, an airport might not have existed before 1995, so I have no data before then for that specific location. Also, some are missing whole years (one might span from 1990 to 2011, missing 2003).

How can I train with these missing values without misguiding my neural network? I thought about filling the empty data with 0s or -1s, but I feel like this would cause the network to predict these values for some outputs.

Recommended answer

I use NNs a lot for forecasting, and I can tell you that you can simply leave those "holes" in your data. NNs learn relationships within the observed data, so it doesn't matter if a specific period is missing. If you set the empty data to a constant value, you will give your training algorithm misleading information. NNs don't need "continuous" data; in fact, it's good practice to shuffle the data set before training so that the backpropagation phase runs on non-contiguous samples.
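As a rough illustration of that advice, the sketch below drops rows with gaps instead of filling them, then shuffles what remains. The airport codes, years, and temperatures are made-up placeholders, and `drop_missing` and `mean_temperature` are hypothetical helper names, not part of any library. The mean is computed only to show how a constant fill value could distort what the network sees.

```python
import random

# Hypothetical records: (airport, year, temperature); None marks a missing reading.
records = [
    ("AAA", 2001, 12.5),
    ("AAA", 2002, 14.0),
    ("AAA", 2003, None),   # a gap, like the missing year in the question
    ("AAA", 2004, 13.5),
    ("BBB", 2001, None),   # airport with no data yet
    ("BBB", 2002, 20.0),
]

def drop_missing(rows):
    """Keep only rows in which every field was actually observed."""
    return [row for row in rows if all(value is not None for value in row)]

def mean_temperature(rows, fill=None):
    """Average the temperature column, optionally replacing gaps with `fill`."""
    temps = [fill if t is None else t for _, _, t in rows]
    temps = [t for t in temps if t is not None]
    return sum(temps) / len(temps)

observed = drop_missing(records)

# Shuffle so consecutive training samples are not contiguous in time.
rng = random.Random(0)  # fixed seed only to make the sketch reproducible
shuffled = observed[:]
rng.shuffle(shuffled)

print(len(observed))                      # 4 rows survive
print(mean_temperature(observed))         # 15.0
print(mean_temperature(records, fill=0))  # 10.0, pulled down by the fake zeros
```

With these placeholder numbers, filling the two gaps with 0 lowers the average from 15.0 to 10.0, which is the kind of misleading signal the answer warns about. Dropping the rows leaves the observed values untouched.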
