Evaluating recommenders - unable to recommend in x cases

Question

I'm exploring some of the code examples in Mahout in Action in more detail. I have built a small test that computes the RMS of various algorithms applied to my data.

Of course, multiple parameters impact the RMS, but I don't understand the "unable to recommend in ... cases" message that is generated while running an evaluation.

Looking at StatsCallable.java, this is generated when an evaluator encounters a NaN response; perhaps there is not enough data in the training set or in the user's prefs to provide a recommendation.

It seems like the RMS score isn't impacted by a very large set of "unable to recommend" cases. Is that assumption correct? Should I be evaluating my algorithm not only on RMS but also the ratio of "unable to recommend" cases versus my overall training set?

Thanks for your feedback.

Recommended answer

Yes, this essentially means there was no data at all on which to base an estimate. That's generally a symptom of data sparseness. It should be rare, and should happen only for users whose data is very small or disconnected from others'.

I personally think it's not such a big deal unless it's a really significant percentage (20%+?). I'd worry more if you couldn't generate any recs at all for many users.
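To make the two numbers concrete, here is a minimal sketch in plain Java. It is not Mahout's own code: the class `EvaluationSummary` and its methods are hypothetical names invented for illustration. It assumes the evaluator hands you parallel arrays of actual and estimated preferences, with NaN marking an "unable to recommend" case, and that those cases are skipped when the RMS is computed, as the question describes. The ratio is taken over the evaluated cases rather than the whole training set, which is also an assumption.

```java
public class EvaluationSummary {

    /** RMS error over the cases that produced an estimate, plus a count of those that did not. */
    public static final class Result {
        public final double rms;
        public final int noEstimateCount;
        public final int total;

        Result(double rms, int noEstimateCount, int total) {
            this.rms = rms;
            this.noEstimateCount = noEstimateCount;
            this.total = total;
        }

        /** Fraction of evaluated cases for which no estimate was available. */
        public double noEstimateRatio() {
            return total == 0 ? 0.0 : (double) noEstimateCount / total;
        }
    }

    /**
     * Hypothetical helper: estimated[i] is NaN when no estimate could be made.
     * NaN cases are counted but contribute nothing to the RMS.
     */
    public static Result summarize(double[] actual, double[] estimated) {
        if (actual.length != estimated.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        double sumSquaredError = 0.0;
        int counted = 0;
        int noEstimate = 0;
        for (int i = 0; i < actual.length; i++) {
            if (Double.isNaN(estimated[i])) {
                noEstimate++; // skipped: the RMS never sees this case
                continue;
            }
            double diff = actual[i] - estimated[i];
            sumSquaredError += diff * diff;
            counted++;
        }
        // With no usable estimates at all, the RMS is undefined.
        double rms = counted == 0 ? Double.NaN : Math.sqrt(sumSquaredError / counted);
        return new Result(rms, noEstimate, actual.length);
    }

    public static void main(String[] args) {
        double[] actual = {4.0, 3.0, 5.0, 2.0, 1.0};
        double[] estimated = {3.0, Double.NaN, 5.0, Double.NaN, 2.0};
        Result r = summarize(actual, estimated);
        System.out.printf("RMS = %.4f, no estimate in %d of %d cases (%.0f%%)%n",
                r.rms, r.noEstimateCount, r.total, 100.0 * r.noEstimateRatio());
    }
}
```

Under these assumptions, adding more NaN cases leaves the RMS unchanged and only raises the ratio, so reporting both gives a fuller picture than the RMS alone. The 20% figure above would be a threshold on `noEstimateRatio()`.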
