Calculating the similarity of two lists

Question

I have two lists:

e.g. a = [1,8,3,9,4,9,3,8,1,2,3] and b = [1,8,1,3,9,4,9,3,8,1,2,3]

Both contain ints. There is no meaning behind the ints (e.g. 1 is not 'closer' to 3 than it is to 8).

I'm trying to devise an algorithm to calculate the similarity between two ORDERED lists. "Ordered" is the key word here (so I can't just take the set of both lists and calculate their set-difference percentage). Numbers sometimes repeat (for example 3, 8, and 9 above), and I cannot ignore the repeats.

In the example above, the function I would call would tell me that a and b are, say, ~90% similar. How can I do that? Edit distance came to mind. I know how to use it with strings, but I'm not sure how to use it with a list of ints. Thanks!

Recommended answer

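You can use the difflib module. Its SequenceMatcher class works on any sequence of hashable items, not just strings, and provides: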

ratio()
Return a measure of the sequences’ similarity as a float in the range [0, 1].
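
As a rough illustration of what that number means: the difflib documentation defines the ratio as 2.0*M / T, where T is the total number of elements in both sequences and M is the number of matching elements. The sketch below recomputes it from get_matching_blocks(). The helper name ratio_by_hand is made up for this example and is not part of difflib.

```python
import difflib

def ratio_by_hand(a, b):
    """Recompute SequenceMatcher.ratio() as 2*M/T (hypothetical helper)."""
    sm = difflib.SequenceMatcher(None, a, b)
    # Each matching block is a named tuple (a, b, size); the last block
    # is a zero-length sentinel, so it adds nothing to the sum.
    matches = sum(block.size for block in sm.get_matching_blocks())
    total = len(a) + len(b)
    # Two empty sequences are treated as identical, as difflib does.
    return 2.0 * matches / total if total else 1.0
```

For the lists in the question, all 11 elements of the shorter list find a match in the longer one, so the ratio is 2*11 / (11+12) = 22/23. Because order matters to the matcher, shuffling one list would generally lower the score even though the multiset of values stays the same.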

Applied to the two lists from the question, this gives:

 >>> import difflib
 >>> s1=[1,8,3,9,4,9,3,8,1,2,3]
 >>> s2=[1,8,1,3,9,4,9,3,8,1,2,3]
 >>> sm=difflib.SequenceMatcher(None,s1,s2)
 >>> sm.ratio()
 0.9565217391304348
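
The question also mentions edit distance. As an alternative sketch (not part of the original answer), the usual Levenshtein dynamic program carries over to lists unchanged, because it only ever compares elements for equality. The function names and the normalization by the longer list's length are assumptions chosen for this example.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of comparable items."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into an empty prefix takes i deletions
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]

def edit_similarity(a, b):
    """Map the distance to [0, 1]; 1.0 means identical (assumed normalization)."""
    longest = max(len(a), len(b))
    return 1.0 - edit_distance(a, b) / longest if longest else 1.0
```

With the lists from the question, edit_distance(s1, s2) is 1 (a single inserted 1), so edit_similarity(s1, s2) comes out at 11/12, roughly 0.92. That is a slightly different number from ratio() because the two measures normalize differently.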
