Sorted list difference
Problem description
I have the following problem.
I have a set of elements that I can sort with a certain algorithm A. The resulting order is good, but the sort is very expensive.
There is also an algorithm B that approximates the result of A. It is much faster, but the ordering will not be exactly the same.
Taking the output of A as a "gold standard", I need a meaningful estimate of the error that results from using B on the same data.
Could anyone please suggest any resource I could look at to solve my problem? Thanks in advance!
Edit:
As requested, here is an example to illustrate the case. If the data are the first 10 letters of the alphabet:
A outputs: a,b,c,d,e,f,g,h,i,j
B outputs: a,b,d,c,e,g,h,f,j,i
What are the possible measures of the resulting error that would allow me to tune the internal parameters of algorithm B so that its result gets closer to the output of A?
Recommended answer
Spearman's rho
I think what you want is Spearman's rank correlation coefficient. Using the index (rank) vectors of the two sortings (perfect A and approximate B), you calculate the rank correlation rho, which ranges from -1 (completely reversed order) to 1 (exactly the same order):
rho = 1 - 6 * sum(d(i)^2) / (n * (n^2 - 1))
where d(i) is the difference between the ranks of element i in A and in B, and n is the number of elements. This closed form assumes there are no ties in either ranking.
You can define your measure of error as a distance D := (1 - rho) / 2, which is 0 when B reproduces the order of A exactly and 1 when B reverses it.
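As a rough illustration, the measure can be computed for the example from the question with a few lines of Python. This is only a sketch under some assumptions: both lists contain the same distinct elements (no ties), ranks are simply list positions, and the function names `spearman_rho` and `rank_distance` are made up for this example rather than taken from any library.

```python
def spearman_rho(reference, approx):
    """Spearman's rho between two orderings of the same distinct items.

    Assumes no ties: every item appears exactly once in each list.
    """
    n = len(reference)
    if n < 2:
        raise ValueError("need at least two elements")
    if len(approx) != n or set(approx) != set(reference) or len(set(reference)) != n:
        raise ValueError("both lists must hold the same distinct elements")

    # Rank of each item in the approximate ordering (its position in the list).
    rank_in_approx = {item: pos for pos, item in enumerate(approx)}

    # Sum of squared rank differences d(i)^2 over all items.
    sum_d2 = sum((pos - rank_in_approx[item]) ** 2
                 for pos, item in enumerate(reference))

    return 1 - 6 * sum_d2 / (n * (n * n - 1))


def rank_distance(reference, approx):
    """Error measure D = (1 - rho) / 2, between 0 (same) and 1 (reversed)."""
    return (1 - spearman_rho(reference, approx)) / 2


a_out = list("abcdefghij")   # output of the expensive algorithm A
b_out = list("abdceghfji")   # output of the fast approximation B

rho = spearman_rho(a_out, b_out)
dist = rank_distance(a_out, b_out)
print(round(rho, 4))   # 0.9394
print(round(dist, 4))  # 0.0303
```

For the example data the squared rank differences sum to 10, so rho = 1 - 60/990 and D = 1/33. When tuning B, one would presumably rerun this over representative inputs and prefer the parameter settings that give the smallest D.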