Results inaccurate when data is combined - accurate when separated


Question

I am training a regression model to predict sales. I have training data covering 3 years and 5 states.

For a specific product, actual sales are always high in 4 states and always very low in 1 state. This is consistently represented in my training data.

I test the model with both historical and future data. The predictions (score labels) for the one state where I expect sales to be very low are very inaccurate (much too high).

However, when I create a new model using only the data for that one state, and test it with data from that same state, the results are very accurate.

Why would the model trained on the combined data give such inaccurate predictions for the one state? Why do I get accurate results when I create a new model that uses only that state's data? Note: both models use exactly the same training data, except that model 1 contains data for all states and model 2 contains data for only 1 state.
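One way to make the symptom described above concrete is to break the combined model's error down by state. The following is only a minimal sketch: the function name `per_group_bias` and the numbers are hypothetical, not taken from the original experiment.

```python
import numpy as np

def per_group_bias(y_true, y_pred, groups):
    """Return {group: mean(prediction - actual)}.

    A large positive value means the model over-predicts for that group.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    groups = np.asarray(groups)
    return {
        str(g): float(np.mean(y_pred[groups == g] - y_true[groups == g]))
        for g in np.unique(groups)
    }

# Hypothetical values for illustration only: state "E" is the low-sales state.
actual = [100, 110, 5, 6]
predicted = [102, 108, 60, 58]
state = ["A", "A", "E", "E"]

bias = per_group_bias(actual, predicted, state)
print(bias)
```

A per-state breakdown like this helps confirm whether the error is concentrated in the one low-sales state or spread across all of them.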

 

Recommended answer

Hi,

Based on the description of your issue, I think the problem may lie with the algorithm used for your experiment. We recommend checking the algorithm cheat sheet and the recommendations on choosing algorithms, then trying again. A Neural Network regression model might be a better fit for your training dataset.

-Rohit
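As a rough illustration of why the algorithm and its inputs could matter here, the sketch below uses synthetic data and plain least-squares regression with numpy. It does not use the original tool or a neural network. Everything in it is an assumption made for illustration: the five states, the single `promo` feature, the sales levels, and the helper `fit_predict`. It is not known whether the original combined model included state as a feature. The sketch only shows that a combined model which cannot represent a state-level difference will over-predict for the low-sales state, while a model that can represent it (or a single-state model) will not.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (hypothetical): states 0-3 sell around 100 units, state 4 around 5.
states = np.repeat(np.arange(5), 200)
promo = rng.uniform(0, 1, size=states.size)
base = np.array([100.0, 100.0, 100.0, 100.0, 5.0])[states]
sales = base + 10.0 * promo + rng.normal(0, 1, size=states.size)

def fit_predict(X_train, y_train, X_test):
    """Ordinary least squares with an intercept, solved with numpy's lstsq."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ coef

low = states == 4  # rows belonging to the low-sales state

# Model 1: combined data, but the model has no way to tell the states apart.
pred_pooled = fit_predict(promo, sales, promo[low])

# Model 2: combined data, with state encoded as dummy (one-hot) columns.
dummies = (states[:, None] == np.arange(1, 5)).astype(float)
X_state = np.column_stack([promo, dummies])
pred_with_state = fit_predict(X_state, sales, X_state[low])

# Model 3: trained and evaluated on the low-sales state only.
pred_single = fit_predict(promo[low], sales[low], promo[low])

def mae(pred):
    """Mean absolute error against the low-sales state's actual sales."""
    return float(np.mean(np.abs(pred - sales[low])))

mae_pooled, mae_with_state, mae_single = mae(pred_pooled), mae(pred_with_state), mae(pred_single)
print(mae_pooled, mae_with_state, mae_single)
```

For simplicity the errors are measured on the training rows themselves; a real comparison would hold out test data. In this toy setup the first model's error for the low-sales state is far larger than the other two, which matches the pattern described in the question.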

