Finding matching keys in two large dictionaries and doing it fast

Question

I am trying to find corresponding keys in two different dictionaries. Each has about 600k entries.

For example:

    myRDP = { 'Actinobacter': 'GATCGA...TCA', 'subtilus sp.': 'ATCGATT...ACT' }
    myNames = { 'Actinobacter': '8924342' }

I want to print out the value for Actinobacter (8924342), since its key also appears in myRDP.

The following code works, but is very slow:

    for key in myRDP:
        for jey in myNames:
            if key == jey:
                print key, myNames[key]

I've tried the following, but it always results in a KeyError:

    for key in myRDP:
        print myNames[key]

Is there perhaps a function implemented in C for doing this? I've googled around, but nothing seems to work.

Thanks.

Recommended answer

Use sets, because they have a built-in intersection method which ought to be quick:

    myRDP = { 'Actinobacter': 'GATCGA...TCA', 'subtilus sp.': 'ATCGATT...ACT' }
    myNames = { 'Actinobacter': '8924342' }

    rdpSet = set(myRDP)
    namesSet = set(myNames)

    for name in rdpSet.intersection(namesSet):
        print name, myNames[name]

    # Prints: Actinobacter 8924342
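
The code in this thread uses the Python 2 print statement. As a minimal sketch of the same idea in Python 3, the block below shows two equivalent routes: a membership test against the dictionary (which also avoids the KeyError from the second attempt in the question, raised because myRDP contains keys such as 'subtilus sp.' that are missing from myNames), and an intersection of the dictionaries' key views. The helper names `matching_items` and `matching_keys` are hypothetical, introduced only for illustration, and the sample data is the elided data from the question.

```python
# Sample data copied from the question (the sequences are elided there too).
myRDP = {'Actinobacter': 'GATCGA...TCA', 'subtilus sp.': 'ATCGATT...ACT'}
myNames = {'Actinobacter': '8924342'}


def matching_items(rdp, names):
    """Return (key, names[key]) pairs for keys present in both dicts.

    Hypothetical helper: one pass over rdp with a hash lookup per key,
    instead of the nested loop that compares every pair of keys.
    """
    return [(key, names[key]) for key in rdp if key in names]


def matching_keys(rdp, names):
    """Return the set of shared keys.

    Hypothetical helper: in Python 3, dict.keys() returns a set-like
    view, so the & operator computes the intersection directly.
    """
    return rdp.keys() & names.keys()


for key, value in matching_items(myRDP, myNames):
    print(key, value)
# Prints: Actinobacter 8924342
```

Both routes rely on hash lookups, so the work grows roughly with the number of keys rather than with the product of the two dictionary sizes, which is what makes the nested loop in the question slow at 600k entries each.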
