Sorting a python dictionary after running an itertools function

Views: 163 Published: 2016/12/22 0:02:13 Tags: python sorting dictionary comparison itertools


Question

This question is the culmination of two pieces of code, each guided by an answer here on SO. My first question was how to compare the similarity between two strings, and I got a good answer, as seen here, with the following code:

code 1

def get_bigrams(string):
    '''
    Takes a string and returns a set of bigrams
    '''
    s = string.lower()
    return { s[i:i+2] for i in range(len(s) - 1) }

def string_similarity(str1, str2):
    '''
    Perform bigram comparison between two strings
    and return a percentage match in decimal form
    '''
    pairs1 = get_bigrams(str1)
    pairs2 = get_bigrams(str2)
    intersection = set(pairs1) & set(pairs2)
    return (2.0 * len(intersection)) / (len(pairs1) + len(pairs2))
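
To make the measure concrete, here is a small worked example of the bigram comparison. It is only a sketch: the functions are repeated so the snippet runs on its own, and the guard for strings shorter than two characters is an assumption added here (the code above would divide by zero in that case), not part of the original answer.

```python
def get_bigrams(string):
    """Return the set of two-character substrings of a lowercased string."""
    s = string.lower()
    return {s[i:i + 2] for i in range(len(s) - 1)}


def string_similarity(str1, str2):
    """Return 2 * |shared bigrams| / (|bigrams1| + |bigrams2|)."""
    pairs1 = get_bigrams(str1)
    pairs2 = get_bigrams(str2)
    total = len(pairs1) + len(pairs2)
    if total == 0:
        # Assumption: strings with no bigrams at all score 0.0
        return 0.0
    return 2.0 * len(pairs1 & pairs2) / total


# "night" -> {ni, ig, gh, ht}, "nacht" -> {na, ac, ch, ht}; only "ht" is shared
print(sorted(get_bigrams("night")))         # ['gh', 'ht', 'ig', 'ni']
print(string_similarity("night", "nacht"))  # 2 * 1 / (4 + 4) = 0.25
```

Because the bigrams are stored in a set, repeated bigrams inside one string are counted only once.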

After that, I needed a way to sort the list of names so that I could run them through the code above. I got that code here, shown below:

code 2


import itertools

persons = ["Peter parker", "Richard Parker", "Parker Richard", "Aunt May"]
similarity = []
for p1, p2 in itertools.combinations(persons, 2):
    similarity.append(string_similarity(p1,p2))
    print("%s - %s: " %(p1, p2) + " " + str(string_similarity(p1, p2)))

similarity = sorted(similarity, key=float)
print(similarity)

Now, the final hurdle is that my data is not in a list; it is actually fetched from a database with primary keys, and those keys are what I ultimately want to track. That is, when I compare multiple names, I need to mark that, for example, ID 1 and ID 2 are the most variant. To determine which two IDs are the most variant, I need to sort the results above, which look like this:

Peter parker - Richard Parker: 0.5454545454545454
Peter parker - Parker Richard: 0.5454545454545454
Peter parker - Aunt May: 0.0
Richard Parker - Parker Richard: 0.8333333333333334
Richard Parker - Aunt May: 0.0
Parker Richard - Aunt May: 0.0
[0.0, 0.0, 0.0, 0.5454545454545454, 0.5454545454545454, 0.8333333333333334]

In my head, instead of those names I need the primary IDs with which the names were fetched, so I am thinking of using a dictionary. Is there a way to run a dictionary such as {PID1: Name1, PID2: Name2, PID3: Name3} through code 2, get the similarity value using code 1, sort the result, and then know that the names with the highest similarity are PID1 and PID3? Or is there a more elegant and less hair-pulling way than the one I am currently thinking of?

Recommended answer

Yes, you need to associate each ID with its name as a pair (ID, name). For this you can use a dict, a tuple or even a class. For example, using tuples, your code 2 would change to:

import itertools

persons = [('id1', "Peter parker"), ('id2', "Richard Parker"), ('id3', "Parker Richard"), ('id4', "Aunt May")]

# Keep both (id, name) tuples next to their similarity score
similarity = [[p1, p2, string_similarity(p1[1], p2[1])] for p1, p2 in itertools.combinations(persons, 2)]
# Highest similarity first
similarity = sorted(similarity, key=lambda x: x[2], reverse=True)

for p1, p2, sim in similarity:
    print("{} - {}: {}".format(p1, p2, sim))  # p1[0], p2[0] to show ids only

You will get:

('id2', 'Richard Parker') - ('id3', 'Parker Richard'): 0.833333333333
('id1', 'Peter parker') - ('id2', 'Richard Parker'): 0.545454545455
('id1', 'Peter parker') - ('id3', 'Parker Richard'): 0.545454545455
('id1', 'Peter parker') - ('id4', 'Aunt May'): 0.0
('id2', 'Richard Parker') - ('id4', 'Aunt May'): 0.0
('id3', 'Parker Richard') - ('id4', 'Aunt May'): 0.0
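
Since the question asks about a dictionary specifically, the same idea can also be sketched with a dict keyed by primary ID. This is one possible variant rather than part of the answer above: the integer IDs are made-up sample data, the names `scored` and `most_similar` are hypothetical, and the similarity functions are repeated only so the snippet runs on its own.

```python
import itertools


def get_bigrams(string):
    s = string.lower()
    return {s[i:i + 2] for i in range(len(s) - 1)}


def string_similarity(str1, str2):
    pairs1 = get_bigrams(str1)
    pairs2 = get_bigrams(str2)
    return (2.0 * len(pairs1 & pairs2)) / (len(pairs1) + len(pairs2))


# Hypothetical sample data: primary key -> name, as if fetched from a database
persons = {1: "Peter parker", 2: "Richard Parker", 3: "Parker Richard", 4: "Aunt May"}

# combinations() over items() yields every unordered pair of (id, name) entries
scored = [
    (id1, id2, string_similarity(name1, name2))
    for (id1, name1), (id2, name2) in itertools.combinations(persons.items(), 2)
]

# Sort by the score (third element), highest similarity first
scored.sort(key=lambda row: row[2], reverse=True)

most_similar = scored[0]
print(most_similar[:2])  # (2, 3)
```

The least similar pairs sit at the end of `scored`. With this sample data three pairs tie at 0.0, so which of them comes last depends only on the order in which the pairs were generated.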
