为什么我的dict查询没有比Python中的列表查询快? [英] Why isn't my dict lookup faster than my list lookup in Python?

查看:151
本文介绍了为什么我的dict查询没有比Python中的列表查询快?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在将文件的每一行都读入列表和字典,

I'm reading each line of a file into both a list and a dict,

with open("../data/title/pruned2_titleonly.txt", 'rb') as f_titles:
    titles_lst = f_titles.read().split('\n')
assert titles_lst[-1] == ''
titles_lst.pop() # remove the last element, an empty string

titles_dict = {}
with open("../data/title/pruned2_titleonly.txt", 'rb') as f_titles:
    for i,line in enumerate(f_titles):
        titles_dict[i] = line

我正在通过随机访问列表/字典中的每个项目来测试性能:

and I'm testing the performance by accessing each item in the list/dict in random order:

n = len(titles_lst)
a = np.random.permutation(n)

%%time
for i in xrange(10):
    t = []
    for b in a:
        t.append(titles_lst[b])
    del t

>>> CPU times: user 18.2 s, sys: 60 ms, total: 18.2 s
>>> Wall time: 18.1 s

%%time
for i in xrange(10):
    t = []
    for b in a:
        t.append(titles_dict[b])
    del t

>>> CPU times: user 41 s, sys: 208 ms, total: 41.2 s
>>> Wall time: 40.9 s

以上结果似乎暗示字典不如查找表的列表有效,即使列表查找为O(n)而字典查找为O(1).我已经测试了以下内容,以查看O(n)/O(1)的性能是否正确……事实并非如此……

The above result seems to imply that dictionaries are not as efficient as lists for lookup tables, even though list lookups are O(n) while dict lookups are O(1). I've tested the following to see if the O(n)/O(1) performance was true... turns out it isn't...

%timeit titles_lst[n/2]
>>> 10000000 loops, best of 3: 81 ns per loop

%timeit titles_dict[n/2]
>>> 10000000 loops, best of 3: 120 ns per loop

这是怎么回事?值得注意的是,我在Ubuntu 12.04下使用Python 2.7.6 Anaconda发行版,并在Intel MKL下构建了NumPy.

What is the deal? If it's important to note, I am using Python 2.7.6 Anaconda distribution under Ubuntu 12.04, and I built NumPy under Intel MKL.

推荐答案

以上结果似乎暗示字典效率不高 作为查找表的列表,即使列表查找为O(n)时 字典查找为O(1).我已经测试了以下内容,看看是否 O(n)/O(1)的性能是真实的...事实并非如此……

The above result seems to imply that dictionaries are not as efficient as lists for lookup tables, even though list lookups are O(n) while dict lookups are O(1). I've tested the following to see if the O(n)/O(1) performance was true... turns out it isn't...

从获取项目"的意义上来说,字典查找是O(N)是不正确的,这就是您的代码似乎要测试的意义.确定元素存在的位置(如果有的话)可能是O(N),例如somelist.index(someval_not_in_the_list)someval_not_in_the_list in somelist都必须扫描每个元素.尝试将x in somelistx in somedict进行比较,以了解主要差异.

It's not true that dict lookups are O(N), in the sense of "getting an item" which is the sense your code seems to test. Determining where (if at all) an element exists could be O(N), e.g. somelist.index(someval_not_in_the_list) or someval_not_in_the_list in somelist will both have to scan over each element. Try comparing x in somelist with x in somedict to see a major difference.

但是仅访问somelist[index]就是O(1)(请参见时间复杂度页面).而且系数可能会比字典中的系数(也就是O(1))小,因为您不必对键进行哈希处理.

But simply accessing somelist[index] is O(1) (see the Time Complexity page). And the coefficient is probably going to be smaller than in the case of a dictionary, also O(1), because you don't have to hash the key.

这篇关于为什么我的dict查询没有比Python中的列表查询快?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆