How to compare ranked lists

Question

I have two lists of ranked items. Each item has a rank and an associated score, and the score determines the rank. The two lists can contain different items (and usually do), that is, their intersection can be empty. I need measures to compare such rankings. Are there well-known algorithms (in the literature or in real-world systems) to do so? The distance measure should take into account the scores as well as the ranks of the items.

Recommended answer

This question has never been answered before, but I still think it's important to a lot of people out there:

Your two requirements, i.e. non-conjointness of the lists (they may contain different items) and the importance of ranks, are not met by common correlation tests. In addition, most of them (Kendall's tau, for example) do not take into account where in the list a disagreement occurs:

>>> from scipy.stats import kendalltau
>>> kendalltau([1,2,3,4,5], [2,1,3,4,5])
KendalltauResult(correlation=0.79999999999999982, pvalue=0.050043527347496564)
>>> kendalltau([1,2,3,4,5], [1,2,3,5,4])
KendalltauResult(correlation=0.79999999999999982, pvalue=0.050043527347496564)

The 1st comparison should yield a significantly smaller value than the 2nd one, because the head of the list is more important than the tail (2nd requirement).
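Why the two results coincide can be seen from a minimal pure-Python sketch of Kendall's tau for sequences without ties. The function name `kendall_tau_a` is made up for this illustration and is not part of SciPy. The statistic only counts concordant and discordant pairs, so one swapped pair costs the same wherever it sits in the list:

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a for two equal-length sequences without ties."""
    if len(x) != len(y):
        raise ValueError("both sequences must have the same length")
    if len(x) < 2:
        raise ValueError("at least two observations are required")
    concordant = discordant = 0
    # Every pair of positions contributes equally, regardless of depth.
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# One swap at the head and one swap at the tail: 9 concordant pairs and
# 1 discordant pair in both cases.
head_swap = kendall_tau_a([1, 2, 3, 4, 5], [2, 1, 3, 4, 5])
tail_swap = kendall_tau_a([1, 2, 3, 4, 5], [1, 2, 3, 5, 4])
print(head_swap, tail_swap)
```

Both calls give (9 - 1) / 10, which matches the correlation reported by `kendalltau` above.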

In addition to that, one can see that both lists need to be the same size and contain the same kind of elements, which conflicts with the 1st requirement.

Possible solution:

A measure that satisfies these requirements is called Rank Biased Overlap (RBO). It is a generalization of the so-called average-based overlap, which is wonderfully illustrated in this blog. The same author also published an implementation of RBO.
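As a rough illustration of the idea (not the linked implementation), here is a minimal sketch of a truncated RBO. It assumes items are hashable and unique within each list, and it only sums the agreement over the observed prefixes, without the extrapolation step of the full measure. The function name `rbo_truncated` and the default `p=0.9` are choices made for this example. Note that it uses ranks only, not the scores.

```python
def rbo_truncated(s, t, p=0.9, depth=None):
    """Truncated Rank Biased Overlap of two ranked lists.

    Computes (1 - p) * sum over d of p**(d - 1) * A_d, where A_d is the
    fraction of items shared by the top-d prefixes of both lists.
    Smaller p puts more weight on the head of the lists.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if depth is None:
        depth = max(len(s), len(t))
    score = 0.0
    for d in range(1, depth + 1):
        # Agreement at depth d: size of the prefix intersection over d.
        overlap = len(set(s[:d]) & set(t[:d]))
        score += p ** (d - 1) * overlap / d
    return (1 - p) * score


reference = list("abcde")
head_swapped = list("bacde")   # disagreement at the top
tail_swapped = list("abced")   # disagreement at the bottom
disjoint = list("vwxyz")       # no items in common

print(rbo_truncated(reference, head_swapped))
print(rbo_truncated(reference, tail_swapped))
print(rbo_truncated(reference, disjoint))
```

With these inputs the head swap scores lower than the tail swap, and the disjoint pair scores 0, so both requirements are reflected. Because the sum is truncated, two identical lists of length k score 1 - p**k rather than 1; the full RBO handles the unseen tail by extrapolation, which this sketch omits.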

Update, January 2018:

  • Another implementation of RBO for Python 3.5.2
