bisect.insort complexity not as expected


Problem Description

While trying to find the most suitable data structure in Python 3 for a frontier problem I have to develop, I just realised that the complexity of using the bisect module to do a real-time ordered insert is not O(n log n) as it should be, and grows exponentially instead. I don't know the reason for it, so I felt like asking you guys in case you know something about it, since I find it really interesting.

I think I am using the module correctly, so it shouldn't be a problem on my end. Anyway, here is the code used to insert node objects, with the insertion position determined by a random f value each node has:

bisect.insort(self._frontier, (node._f, node))
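A minimal, self-contained sketch of this pattern (the Node class and its _f attribute are assumptions reconstructed from the snippet; the original class is not shown in the question):

```python
import bisect
import random

class Node:
    """Hypothetical stand-in for the asker's node objects."""
    def __init__(self, f):
        self._f = f

    def __lt__(self, other):
        # Tie-breaker so (f, node) tuples stay comparable when f values collide.
        return self._f < other._f

frontier = []
for _ in range(1000):
    node = Node(random.random())
    # Insert keeping the frontier ordered by f.
    bisect.insort(frontier, (node._f, node))

# The frontier stays sorted by f after every insert.
assert all(frontier[i][0] <= frontier[i + 1][0] for i in range(len(frontier) - 1))
```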

I was getting lots of objects inserted in the first few seconds, but then not that many as time went on. Bakuriu suggested I ask this question, since he also found it interesting after running some tests and ending up with the same results as me. The code he used to test it was the following:

python3 -m timeit -s 'import bisect as B; import random as R;seq=[]' 'for _ in range(100000):B.insort(seq, R.randint(0, 1000000))'
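The same experiment can be written as a small self-contained script that times each batch of inserts separately, making the superlinear growth visible directly (batch size and count are arbitrary choices for illustration):

```python
import bisect
import random
import time

# Time repeated insort calls into one growing list. Each batch is the same
# number of inserts, but later batches operate on a longer list, so the
# per-batch time keeps rising and the total cost is quadratic.
seq = []
batch = 10_000
for i in range(4):
    start = time.perf_counter()
    for _ in range(batch):
        bisect.insort(seq, random.randint(0, 1_000_000))
    elapsed = time.perf_counter() - start
    print(f"inserts {i * batch:>6}-{(i + 1) * batch:>6}: {elapsed:.3f}s")
```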

These were his conclusions:


10k insertions is all fine (80 ms, and up to that point it basically scales linearly; keep in mind that it is O(n log n), so it's a little bit worse than linear), but with 100k it takes forever instead of roughly 10 times more. A list of 100k elements isn't really that big, and log(100k) is about 16, so it's not that big either.

Any help would be appreciated.

Recommended Answer

You probably missed that the time complexity for insort is O(n), and this is documented clearly, for bisect.insort_left():


Keep in mind that the O(log n) search is dominated by the slow O(n) insertion step.

Finding the insertion point is cheap, but inserting into a Python list is not, as the elements past the insertion point have to be moved up a step.
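The two steps can be pulled apart to see where the cost lives; insort is equivalent to a binary search followed by a plain list insert:

```python
import bisect

seq = [1, 3, 5, 7, 9]

# bisect.insort(seq, 6) is just these two steps rolled into one call:
i = bisect.bisect_left(seq, 6)   # O(log n): binary search for the index
seq.insert(i, 6)                 # O(n): every element after index i shifts right

print(seq)  # [1, 3, 5, 6, 7, 9]
```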

Also see the TimeComplexity page on the Python Wiki, where list insertion is documented:


Insert    O(n)

You can find the insertion point in O(log n) time, but the insertion step that follows is O(n), making this a rather expensive way to sort.

If you are using this to sort m elements, you have an O(m^2) (quadratic) solution for what should only take O(m log m) time with TimSort (the sorting algorithm used by the sorted() function).

