How can I partially sort a Python list?

Question

I wrote a compiler cache for MSVC (much like ccache for gcc). One of the things I have to do is to remove the oldest object files in my cache directory to trim the cache to a user-defined size.

Right now, I basically have a list of tuples, each of which is the last access time and the file size:

# First tuple element is the access time, second tuple element is file size
items = [ (1, 42341),
          (3, 22),
          (0, 3234),
          (2, 42342),
          (4, 123) ]

Now I'd like to do a partial sort on this list so that only the first N elements are sorted, where N is the smallest number of oldest elements whose sizes sum to more than 45000. The result should look basically like this:

# Partially sorted list; only first two elements are sorted because the sum of
# their second field is larger than 45000.
items = [ (0, 3234),
          (1, 42341),
          (3, 22),
          (2, 42342),
          (4, 123) ]

I don't really care about the order of the unsorted entries, I just need the N oldest items in the list whose cumulative size exceeds a certain value.

Recommended answer

You could use the heapq module. Call heapify() on the list, followed by heappop() until your condition is met. heapify() is linear and heappop() logarithmic, so it's likely as fast as you can get.

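import heapq

# Turn the list into a min-heap in place; tuples compare by access time first.
heapq.heapify(items)
size = 0
# Pop the oldest entry until the cumulative size reaches the limit.
while items and size < 45000:
    item = heapq.heappop(items)
    size += item[1]
    print(item)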

Output:

(0, 3234)
(1, 42341)
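
If the original list needs to stay intact (popping from the heap empties it as it goes), the same idea could be wrapped in a small helper that works on a copy. This is only a sketch: the function name `oldest_until` and its `limit` parameter are made up for illustration and are not part of `heapq`.

```python
import heapq

def oldest_until(items, limit):
    """Return the oldest (access_time, size) tuples, in order, popping
    entries until their cumulative size reaches `limit` (hypothetical helper)."""
    heap = list(items)       # copy so the caller's list is left untouched
    heapq.heapify(heap)      # linear time; tuples order by access time first
    selected, total = [], 0
    while heap and total < limit:
        entry = heapq.heappop(heap)   # logarithmic per pop, oldest first
        selected.append(entry)
        total += entry[1]
    return selected

items = [(1, 42341), (3, 22), (0, 3234), (2, 42342), (4, 123)]
print(oldest_until(items, 45000))   # [(0, 3234), (1, 42341)]
```

Only the entries that are actually popped get ordered, so the cost is roughly linear in the list length plus a logarithmic term for each selected item.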
