Time complexity of accessing a Python dict
Problem description
I am writing a simple Python program.
My program seems to suffer from linear access to dictionaries;
its run-time grows exponentially even though the algorithm is quadratic.
I use a dictionary to memoize values. That seems to be a bottleneck.
The values I'm hashing are tuples of points.
Each point is: (x,y), 0 <= x,y <= 50
Each key in the dictionary is: A tuple of 2-5 points: ((x1,y1),(x2,y2),(x3,y3),(x4,y4))
The keys are read many times more often than they are written.
Am I correct that Python dicts suffer from linear access times with such inputs?
As far as I know, sets have guaranteed logarithmic access times.
How can I simulate dicts using sets(or something similar) in Python?
Edit: As per request, here's a (simplified) version of the memoization function:
def memoize(fun):
    memoized = {}
    def memo(*args):
        key = args
        if key not in memoized:
            memoized[key] = fun(*args)
        return memoized[key]
    return memo
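For context, here is a minimal sketch of how the decorator above could be used. Note that `perimeter` is a hypothetical example function, not from the original question; its arguments are (x, y) point tuples as described:

```python
# Usage sketch of the memoize decorator from the question.
# `perimeter` is a hypothetical example function.

def memoize(fun):
    memoized = {}
    def memo(*args):
        key = args
        if key not in memoized:
            memoized[key] = fun(*args)
        return memoized[key]
    return memo

calls = []  # records actual (non-cached) invocations

@memoize
def perimeter(*points):
    calls.append(points)
    # Manhattan length of the path through the points
    return sum(abs(x1 - x2) + abs(y1 - y2)
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

p = perimeter((0, 0), (3, 4))
q = perimeter((0, 0), (3, 4))  # served from the cache; fun is not re-run
```

In modern Python, functools.lru_cache provides this same caching behaviour out of the box.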
See Time Complexity. The Python dict is a hash map, so its worst case is O(n) if the hash function is bad and causes many collisions. However, that is a very rare case in which every added item has the same hash and is therefore added to the same chain, which for a major Python implementation would be extremely unlikely. The average time complexity is, of course, O(1).
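The worst case described above can be provoked deliberately. This sketch (not from the original answer) defines a key class whose __hash__ is constant, so every key collides, and compares insertion time against an otherwise identical key with a proper hash:

```python
import time

class BadHash:
    """Key whose hash is constant: every instance collides."""
    def __init__(self, v):
        self.v = v
    def __hash__(self):
        return 0
    def __eq__(self, other):
        return self.v == other.v

class GoodHash(BadHash):
    """Same key, but with a well-distributed hash."""
    def __hash__(self):
        return hash(self.v)

def insert_time(n, cls):
    """Time n dict insertions with keys of the given class."""
    d = {}
    start = time.perf_counter()
    for i in range(n):
        d[cls(i)] = i
    return time.perf_counter() - start

t_bad = insert_time(2000, BadHash)    # roughly quadratic: every key probes the same slots
t_good = insert_time(2000, GoodHash)  # roughly linear
```

With all keys colliding, each insertion must compare against the keys already stored, so total insertion cost grows quadratically rather than linearly.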
The best method would be to check the hashes of the objects you are using. The CPython dict uses int PyObject_Hash(PyObject *o), which is the equivalent of hash(o).
After a quick check, I have not yet managed to find two tuples that hash to the same value, which suggests that the lookup is O(1):
l = []
for x in range(0, 50):
    for y in range(0, 50):
        if hash((x, y)) in l:
            print("Fail:", (x, y))
        l.append(hash((x, y)))
print("Test Finished")
CodePad (Available for 24 hours)
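As a side note (not part of the original answer), the collision check above uses a list, so the `in` membership test is itself O(n) per lookup; a set makes the same scan O(1) on average:

```python
# Same collision scan as above, but with a set for O(1) membership tests.
seen = set()
collisions = 0
for x in range(50):
    for y in range(50):
        h = hash((x, y))
        if h in seen:
            collisions += 1
        seen.add(h)
print("collisions:", collisions)
```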