Python：使用dict来加快对元组列表的排序 [英] Python: using a dict to speed sorting of a list of tuples

查看：127 发布时间：2017/5/21 21:58:03 python sorting dictionary

本文介绍了Python：使用dict来加快对元组列表的排序的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

由于某种原因，我一直在如何排序这个元组列表问题。（我以前的一个问题：按任意键排列元组的列表）

这是一些任意的原始输入：

  number_of = 3＃或任何
 tuple_list = [（n，'a'，'b'，'c'）for n in xrange（number_of）]＃[（0，'a'，'b'，'c' ）...] 
 ordering_list = random.sample（range（number_of），number_of）＃eg [1，0，2]

排序 tuple_list 由 ordering_list 使用排序：

  ordered = sorted（tuple_list，key = lambda t：ordering_list.index（t [0]））
＃ordered = [（1，'a'，'b'，'c'），（0，'a'，'b' c'），（2，'a'，'b'，'c'）]

有一个稍微尴尬的做法似乎要快得多，特别是当 tuple_list 中的元素数量增加时。我创建一个字典，将元组分解成（tuple [0]，tuple [1：]） list_dict 。我使用 ordering_list 作为键检索字典项，然后重新组合（tuple [0]，tuple [1：]）到一个元组列表中，使用一个成语我仍然试图完全包围我的头： zip（* [iter（_list）] * x）其中 x 是由 _list 中的项组成的每个元组的长度。所以我的问题是：是否有一个版本的这种方法是管理拆卸 - 更好地重新组装部分代码？

  def gen_key_then_values（key_list，list_dict）：
 key_list中的键：
 values = list_dict [key] 
 yield key 
 
在值中的n：
 yield ñ
 
 list_dict = {T [0]：T [1：用于在tuple_list吨} 
有序=拉链（* [gen_key_then_values（ordering_list，list_dict）] * 4）

注意更好的代码，使用来自Steve Jessop的一个明显的评论：

'pre>

 list_dict = {T [0]：吨用于tuple_list吨} 
有序= [list_dict [K]对于k在ordering_list] 
 （k，['a'，'b']组合一个元组，c $ c>

$ b $ ...]）从 list_dict 中检索的项目，但是我没有理由将这部分代码包含在这里。

解决案

在字典打破 tuple_list 的元素分开并没有真正获得你什么，需要创造一堆的价值观更多元组。所有你正在做的是根据他们的第一个元素查找列表中的元素，所以它可能不值得实际拆分：

  list_dict = {T [0]：吨用于tuple_list吨}

请注意，这仅适用于如果第一个元素是唯一的，但是如果第一个元素是唯一的，那么 ordering_list 只有这样才是有意义的，所以这很可能。

zip（* [iter（_list）] * 4）只是将 _list 分组为四，所以给它一个合适的名字，你会不会担心它：

 高清fixed_size_groups（N，迭代器）： 
 return zip（* [iter（iterable）] * n）

但是，你实际上并不需要它无论如何：

 有序=名单（list_dict [VAL]为VAL在ordering_list）

您的第一个代码缓慢的原因是 ordering_list.index ordering_list 为 t [0] ，它为每个 T 。所以总是会对列表元素执行（number_of ** 2）/ 2 。

For some reason, I keep having 'how do I sort this list of tuples' questions. (A prior question of mine: sorting list of tuples by arbitrary key).

Here is some arbitrary raw input:

number_of = 3  # or whatever
tuple_list = [(n, 'a', 'b', 'c') for n in xrange(number_of)]  # [(0, 'a', 'b', 'c')...]
ordering_list = random.sample(range(number_of), number_of)  # e.g. [1, 0, 2]

Sorting tuple_list by ordering_list using sorted:

ordered = sorted(tuple_list, key=lambda t: ordering_list.index(t[0]))
# ordered = [(1, 'a', 'b', 'c'), (0, 'a', 'b', 'c'), (2, 'a', 'b', 'c')]

I have a slightly awkward approach which seems to be much faster, especially as the number of elements in the tuple_list grows. I create a dictionary, breaking the tuple into (tuple[0], tuple[1:]) items inside dictionary list_dict. I retrieve the dictionary item using ordering_list as keys, and then re-assemble the sequence of (tuple[0], tuple[1:]) into a list of tuples, using an idiom I'm still trying to wrap my head around completely: zip(*[iter(_list)] * x) where x is the length of each tuple composed of items from _list. So my question is: is there a version of this approach which is manages the disassemble - reassemble part of the code better?

def gen_key_then_values(key_list, list_dict):
    for key in key_list:
        values = list_dict[key]
        yield key

        for n in values:
            yield n

list_dict = {t[0]: t[1:] for t in tuple_list}
ordered = zip(*[gen_key_then_values(ordering_list, list_dict)] * 4)

NOTE BETTER CODE, using an obvious comment from Steve Jessop below:

list_dict = {t[0]: t for t in tuple_list}
ordered = [list_dict[k] for k in ordering_list]

My actual project code still requires assembling a tuple for each (k, ['a', 'b' ...]) item retrieved from the list_dict but there was no reason for me to include that part of the code here.

解决方案

Breaking the elements of tuple_list apart in the dictionary doesn't really gain you anything and requires creating a bunch more tuples for the values. All you're doing is looking up elements in the list according to their first element, so it's probably not worth actually splitting them:

list_dict = { t[0] : t for t in tuple_list }

Note that this only works if the first element is unique, but then the ordering_list only makes sense if the first element is unique, so that's probably OK.

zip(*[iter(_list)] * 4) is just a way of grouping _list into fours, so give it a suitable name and you won't have to worry about it:

def fixed_size_groups(n, iterable):
    return zip(*[iter(iterable)] * n)

But all things considered you don't actually need it anyway:

ordered = list(list_dict[val] for val in ordering_list)

The reason your first code is slow, is that ordering_list.index is slow -- it searches through the ordering_list for t[0], and it does this once for each t. So in total it does (number_of ** 2) / 2 inspections of a list element.

这篇关于Python：使用dict来加快对元组列表的排序的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

Python：使用dict来加快对元组列表的排序 [英] Python: using a dict to speed sorting of a list of tuples

问题描述

相关文章

Python最新文章

热门教程

热门工具

登录关闭

Python：使用dict来加快对元组列表的排序 [英] Python: using a dict to speed sorting of a list of tuples

问题描述

相关文章

Python最新文章

热门教程

热门工具

登录 关闭

登录关闭