最大和最小; Python的堆 [英] nlargest and nsmallest ; heapq python
问题描述
这是出于对python中heapq.py模块的最小和最大方法的好奇.
This is out of curiosity about the nsmallest and nlargest methods of heapq.py module in python.
我在文档中的此处阅读.
文档没有说明如何迭代(nsmalles/nlargest).
The documentation doesn't say how it does so (nsmalles/nlargest) on any iterable.
这可能是一个愚蠢的问题,但是我可以假设这些方法在内部创建可迭代数据结构的堆(可能使用"heapify"方法),然后返回n个最小/最大的元素吗?
This might be a stupid question, but can I assume that these methods internally create a heap of the iterable data structure (may be using 'heapify' method) and then return the n smallest/largest elements?
只想确认我的结论.谢谢!
Just want to confirm my conclusion. thanks!
推荐答案
从具有N
项的可迭代项中查找n
最小或最大项的算法有些棘手.您会发现,您没有创建大小为N
的最小堆来查找最小的项目.
The algorithm for finding the n
smallest or largest items from an iterable with N
items is a bit tricky. You see, you don't create a size-N
min-heap to find the smallest items.
相反,您使用前一个n
项目制作了一个较小的size- n
max-heap,然后对序列中的其余项目进行了pushpop
重复操作.完成后,您可以从堆中弹出项目,并以相反的顺序返回它们.
Instead, you make a smaller, size-n
max-heap with the first n
items, then do repeated pushpop
operations on it with the remaining items from the sequence. Once you're done, you pop the items from the heap and return them in reversed order.
此过程花费O(N log(n))
时间(注意较小的n
),当然只需要O(n)
空间.如果n
比N
小得多,那么它比排序和切片要有效得多.
This process take O(N log(n))
time (note the small n
) and of course only O(n)
space. If n
is much less than N
, it's much more efficient than sorting and slicing.
heapq
模块包含纯Python该算法的实现,尽管在导入时,您可能会获得用C编写的代码的更快版本(您可以阅读
The heapq
module contains a pure-Python implementation of this algorithm, though when you import it, you may get a faster version of the code written in C instead (you can read the source for that too, but it's not quite as friendly unless you know the Python C API).
这篇关于最大和最小; Python的堆的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!