Evaluating the LightFM Recommendation Model


Question

I've been playing around with lightfm for quite some time and found it really useful to generate recommendations. However, there are two main questions that I would like to know.

  1. To evaluate the LightFM model in cases where the rank of the recommendations matters, should I rely more on precision@k or on the other provided evaluation metrics, such as the AUC score? In what cases should I focus on improving precision@k rather than the other metrics? Or are they highly correlated, meaning that if I manage to improve my precision@k score, the other metrics will follow?

  2. How would you interpret a precision@5 score of 0.089 for a model trained with the WARP loss function? As far as I know, precision at 5 tells me what proportion of the top 5 results are positive/relevant. That means I would get a precision@5 of 0 if none of my predictions make it into the top 5, or 0.2 if only one prediction in the top 5 is correct. But I cannot interpret what a score of 0.0xx means for precision@n.

Thanks

Solution

Precision@K and AUC measure different things, and give you different perspectives on the quality of your model. In general, they should be correlated, but understanding how they differ may help you choose the one that is more important for your application.

  • Precision@K measures the proportion of positive items among the K highest-ranked items. As such, it's very focused on the ranking quality at the top of the list: it doesn't matter how good or bad the rest of your ranking is as long as the first K items are mostly positive. This would be an appropriate metric if you are only ever going to be showing your users the very top of the list.
  • AUC measures the quality of the overall ranking. In the binary case, it can be interpreted as the probability that a randomly chosen positive item is ranked higher than a randomly chosen negative item. Consequently, an AUC close to 1.0 suggests that, by and large, your ordering is correct: this can be true even if none of the first K items are positives. This metric may be more appropriate if you do not exert full control over which results will be presented to the user; it may be that the first K recommended items are no longer available (say, they are out of stock), and you need to move further down the ranking. A high AUC score will then give you confidence that your ranking is of high quality throughout.
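The divergence between the two metrics can be seen on a toy ranking. The sketch below (plain NumPy; the helper functions are illustrative, not LightFM's implementations) builds a ranking of 100 items where all five positives sit at ranks 6-10: precision@5 is 0, yet the AUC is high because almost every positive outranks almost every negative.

```python
import numpy as np

def precision_at_k(ranked_labels, k):
    # Fraction of the top-k ranked items that are positive.
    return float(np.mean(ranked_labels[:k]))

def auc(ranked_labels):
    # Probability that a randomly chosen positive outranks a randomly
    # chosen negative (lower index = better rank).
    pos = np.where(ranked_labels == 1)[0]
    neg = np.where(ranked_labels == 0)[0]
    wins = sum((p < neg).sum() for p in pos)
    return wins / (len(pos) * len(neg))

# 100 ranked items; the 5 positives sit at ranks 6-10, just below the top 5.
labels = np.zeros(100, dtype=int)
labels[5:10] = 1

print(precision_at_k(labels, 5))  # 0.0   -- the top of the list looks bad
print(auc(labels))                # ~0.947 -- the overall ordering is still good
```

Each positive beats the 90 negatives ranked below it, giving 450 of 475 possible positive-negative pairs correctly ordered, hence an AUC of about 0.947 despite a precision@5 of zero.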

Note also that while the maximum value of the AUC metric is 1.0, the maximum achievable precision@K is dependent on your data. For example, if you measure precision@5 but there is only one positive item, the maximum score you can achieve is 0.2.
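A quick numeric check of this cap: with a single positive item, even a perfect ranking (the positive placed first) yields a precision@5 of only 0.2.

```python
import numpy as np

# Best case: the single positive item is ranked first among 10 items.
labels = np.array([1] + [0] * 9)

# Precision@5 is the mean of the top-5 labels: 1 positive out of 5 slots.
print(np.mean(labels[:5]))  # 0.2
```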

In LightFM, the AUC and precision@K routines return arrays of metric scores: one for every user in your test data. Most likely, you average these to get a mean AUC or mean precision@K score: if some of your users have a score of 0 on the precision@5 metric, it is possible that your average precision@5 will be between 0 and 0.2.
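This averaging is how a fractional score like 0.089 arises even though each individual user's precision@5 is a multiple of 0.2 (or is capped lower, as above). A small sketch with hypothetical per-user scores, shaped like the array LightFM's `precision_at_k` returns:

```python
import numpy as np

# Hypothetical per-user precision@5 scores (one entry per test user),
# shaped like the array returned by lightfm.evaluation.precision_at_k.
# Most users have no positives in their top 5, a few have one or two.
per_user = np.array([0.2, 0.0, 0.0, 0.4, 0.0, 0.2, 0.0, 0.0, 0.0, 0.2])

# The reported metric is the mean over users; zeros pull it below 0.2.
print(per_user.mean())  # 0.1
```

A mean of 0.089 simply says that, averaged over all test users, roughly 0.45 of each user's top-5 slots (0.089 × 5) contain a relevant item.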

Hope this helps!

