Intersecting two dictionaries

Question

I am working on a search program over an inverted index. The index itself is a dictionary whose keys are terms and whose values are themselves dictionaries of short documents, with ID numbers as keys and their text content as values.
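For illustration, a tiny index of this shape might look like the sketch below. The terms, document IDs, and texts are invented for the example and are not from the real program.

```python
# Hypothetical inverted index: term -> {doc_id: document text}.
# All terms, IDs, and texts here are made up for illustration.
index = {
    "python": {
        1: "python dictionaries are hash maps",
        3: "python set operations",
    },
    "set": {
        2: "a set has no duplicates",
        3: "python set operations",
    },
}

# The postings for one term are themselves a dictionary keyed by document ID.
postings = index["python"]
print(sorted(postings))  # [1, 3]
```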

To perform an 'AND' search for two terms, I thus need to intersect their postings lists (dictionaries). What is a clear (not necessarily overly clever) way to do this in Python? I started out by trying it the long way with iter:

p1 = index[term1]  
p2 = index[term2]
i1 = iter(p1)
i2 = iter(p2)
while ...  # not sure of the 'iter != end' syntax in this case
...

Recommended answer

In general, to construct the intersection of dictionaries in Python, you can first use the & operator to calculate the intersection of sets of the dictionary keys (dictionary keys are set-like objects in Python 3):

dict_a = {"a": 1, "b": 2}
dict_b = {"a": 2, "c": 3} 

intersection = dict_a.keys() & dict_b.keys()  # {'a'}

On Python 2 you have to convert the dictionary keys to sets yourself:

keys_a = set(dict_a.keys())
keys_b = set(dict_b.keys())
intersection = keys_a & keys_b

Given the intersection of the keys, you can then build the intersection of the values however you like. You have to make a choice here, since the concept of set intersection doesn't tell you what to do if the associated values differ. (This is presumably why the & intersection operator is not defined directly for dictionaries in Python.)
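To make that choice concrete, here is a minimal sketch of three possible policies for a key whose values differ. The policy names are only labels for this example; none of them is "the" correct intersection.

```python
dict_a = {"a": 1, "b": 2}
dict_b = {"a": 2, "c": 3}

shared = dict_a.keys() & dict_b.keys()  # {'a'}

# Policy 1: keep the value from dict_a.
keep_left = {k: dict_a[k] for k in shared}

# Policy 2: keep both values side by side as a tuple.
keep_both = {k: (dict_a[k], dict_b[k]) for k in shared}

# Policy 3: keep only keys whose values agree in both dictionaries.
only_equal = {k: dict_a[k] for k in shared if dict_a[k] == dict_b[k]}

print(keep_left)   # {'a': 1}
print(keep_both)   # {'a': (1, 2)}
print(only_equal)  # {}
```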

In this case it sounds like your values for the same key would be equal, so you can just choose the value from one of the dictionaries:

dict_of_dicts_a = {"a": {"x":1}, "b": {"y":3}}
dict_of_dicts_b = {"a": {"x":1}, "c": {"z":4}} 

shared_keys = dict_of_dicts_a.keys() & dict_of_dicts_b.keys()

# values equal so choose values from a:
dict_intersection = {k: dict_of_dicts_a[k] for k in shared_keys }  # {"a":{"x":1}}
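Applied to the inverted index from the question, the same idea might look like the sketch below. The `and_search` helper and the sample index are hypothetical. The sketch assumes Python 3, that the index maps each term to a `{doc_id: text}` dictionary, and that a given document ID always maps to the same text.

```python
def and_search(index, term1, term2):
    """Return {doc_id: text} for documents listed under both terms.

    Hypothetical helper, not part of any library.
    """
    # A term missing from the index is treated as having empty postings.
    p1 = index.get(term1, {})
    p2 = index.get(term2, {})
    # Key views are set-like in Python 3, so & gives the shared document IDs.
    shared_ids = p1.keys() & p2.keys()
    # The text is assumed identical in both postings, so take it from p1.
    return {doc_id: p1[doc_id] for doc_id in shared_ids}


# Made-up sample data for illustration.
index = {
    "python": {1: "python dictionaries", 3: "python set operations"},
    "set": {2: "a set has no duplicates", 3: "python set operations"},
}

print(and_search(index, "python", "set"))  # {3: 'python set operations'}
```

Using `index.get(term, {})` means a search for an unknown term returns an empty result instead of raising `KeyError`; whether that is the right behavior depends on the program.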

Other reasonable methods of combining values would depend on the types of the values in your dictionaries, and what they represent. For example, for dictionaries of dictionaries you might instead want the union of the values for each shared key. In Python 3.9 and later you can get the union of two dictionaries using the | operator; if both operands contain the same key, the value from the right-hand operand is kept:

# union of values for each key in the intersection:
dict_intersection_2 = { k: dict_of_dicts_a[k] | dict_of_dicts_b[k] for k in shared_keys }

In this case, since the value for key "a" is identical in both dictionaries, the result is the same as before.
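On interpreters older than Python 3.9, where dictionaries do not support `|`, unpacking with `**` (available since Python 3.5) gives the same per-key merge. The sketch below uses made-up values that differ between the two dictionaries, to show which side wins when an inner key appears in both.

```python
# Made-up data: the inner dictionaries for "a" disagree on the value of "x".
dict_of_dicts_a = {"a": {"x": 1, "w": 0}, "b": {"y": 3}}
dict_of_dicts_b = {"a": {"x": 9}, "c": {"z": 4}}

shared_keys = dict_of_dicts_a.keys() & dict_of_dicts_b.keys()

# Later unpacked entries override earlier ones, so values from b win.
merged = {
    k: {**dict_of_dicts_a[k], **dict_of_dicts_b[k]}
    for k in shared_keys
}

print(merged)  # {'a': {'x': 9, 'w': 0}}
```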
