What is Python's heapq module?


Question

I tried "heapq" and came to the conclusion that my expectations differ from what I see on the screen. I need somebody to explain how it works and where it can be useful.

Excerpt from Python Module of the Week, under section 2.2 Sorting:

If you need to maintain a sorted list as you add and remove values, check out heapq. By using the functions in heapq to add or remove items from a list, you can maintain the sort order of the list with low overhead.

Here is what I do and what I get:

import heapq
heap = []

for i in range(10):
    heap.append(i)

heap
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

heapq.heapify(heap)    
heapq.heappush(heap, 10)    
heap
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

heapq.heappop(heap)
0    
heap
[1, 3, 2, 7, 4, 5, 6, 10, 8, 9] <<< Why the list does not remain sorted?

heapq.heappushpop(heap, 11)
1
heap
[2, 3, 5, 7, 4, 11, 6, 10, 8, 9] <<< Why is 11 put between 4 and 6?

So, as you can see, the "heap" list is not sorted at all; in fact, the more you add and remove items, the more cluttered it becomes. Pushed values take unexplainable positions. What is going on?

Recommended answer

The heapq module maintains the heap invariant, which is not the same thing as keeping the actual list object in sorted order.

From the heapq documentation:

Heaps are binary trees for which every parent node has a value less than or equal to any of its children. This implementation uses arrays for which heap[k] <= heap[2*k+1] and heap[k] <= heap[2*k+2] for all k, counting elements from zero. For the sake of comparison, non-existing elements are considered to be infinite. The interesting property of a heap is that its smallest element is always the root, heap[0].
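The invariant quoted above can be checked directly. The following is a minimal sketch; `is_min_heap` is a hypothetical helper written for illustration, not part of heapq. It shows that the list from the question satisfies the heap invariant even though it is not sorted.

```python
def is_min_heap(items):
    """Check items[k] <= items[2*k+1] and items[k] <= items[2*k+2]
    for every k whose children exist (missing children count as infinite)."""
    n = len(items)
    for k in range(n):
        for child in (2 * k + 1, 2 * k + 2):
            if child < n and items[k] > items[child]:
                return False
    return True

# The list from the question after heappushpop(heap, 11).
heap = [2, 3, 5, 7, 4, 11, 6, 10, 8, 9]
print(is_min_heap(heap))      # True: every parent <= its children
print(heap == sorted(heap))   # False: the list itself is not sorted
print(heap[0] == min(heap))   # True: the smallest item is at the root
```

So 11 at index 5 is a legal position: its parent is index 2 (value 5), and that is the only comparison the invariant requires.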

This means that finding the smallest element is very efficient (just take heap[0]), which is great for a priority queue. After that, the next 2 values will be larger than (or equal to) the 1st, the next 4 after that will be larger than (or equal to) their 'parent' nodes, then the same holds for the next 8, and so on.
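To make the 1, 2, 4, 8 grouping visible, here is a small sketch that slices a heap list into tree levels. `heap_levels` is a hypothetical helper introduced only for this illustration.

```python
def heap_levels(items):
    """Split a heap list into tree levels: 1 item, then 2, then 4, then 8, ..."""
    levels = []
    start, width = 0, 1
    while start < len(items):
        levels.append(items[start:start + width])
        start += width
        width *= 2
    return levels

# The list from the question after heappop(heap).
heap = [1, 3, 2, 7, 4, 5, 6, 10, 8, 9]
for level in heap_levels(heap):
    print(level)
# [1]
# [3, 2]
# [7, 4, 5, 6]
# [10, 8, 9]
```

Each value is only guaranteed to be no smaller than its parent in the level above; siblings and cousins are in no particular order, which is why the flat list looks shuffled.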

You can read more about the theory behind the data structure in the Theory section of the documentation. You can also watch this lecture from the MIT OpenCourseWare Introduction to Algorithms course, which explains the algorithm in general terms.

A heap can be turned back into a sorted list very efficiently:

def heapsort(heap):
    # Pops the items in ascending order; note that this empties `heap`.
    return [heapq.heappop(heap) for _ in range(len(heap))]

by just popping the next element from the heap. Using sorted(heap) should be faster still, however, as the TimSort algorithm used by Python's sort will take advantage of the partial ordering already present in a heap.
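As a rough illustration of the two approaches, the sketch below sorts the same data both ways. `heapsort_copy` is a hypothetical variant that works on a copy, since popping every item would otherwise empty the caller's list. It makes no timing claim; it only shows that both give the same result.

```python
import heapq

def heapsort_copy(items):
    """Heap-sort a copy of `items`, leaving the original list untouched."""
    heap = list(items)
    heapq.heapify(heap)  # establish the heap invariant on the copy
    return [heapq.heappop(heap) for _ in range(len(heap))]

data = [2, 3, 5, 7, 4, 11, 6, 10, 8, 9]
print(heapsort_copy(data))                  # [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
print(heapsort_copy(data) == sorted(data))  # True
print(data)                                 # [2, 3, 5, 7, 4, 11, 6, 10, 8, 9]
```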

You'd use a heap if you are only interested in the smallest value, or the first n smallest values, especially if you are interested in those values on an ongoing basis; adding new items and removing the smallest is very efficient indeed, more so than re-sorting the list each time you add a value.
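A minimal sketch of that ongoing use, with made-up integer priorities standing in for real work items: values arrive over time, and at every step the smallest one is available without sorting the whole list.

```python
import heapq

tasks = []  # a plain list; heapq keeps it in heap order, not sorted order
for priority in [5, 1, 8, 3]:
    heapq.heappush(tasks, priority)

print(tasks[0])                   # 1: peek at the smallest without removing it
print(heapq.heappop(tasks))       # 1: remove and return the smallest
heapq.heappush(tasks, 2)          # new items can arrive at any time
print(heapq.heappop(tasks))       # 2
print(heapq.nsmallest(2, tasks))  # [3, 5]: the two smallest remaining values
```

heappush and heappop each take O(log n) time, whereas re-sorting after every insertion touches the whole list.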
