What's the most Pythonic way to identify consecutive duplicates in a list?
Problem description
I've got a list of integers and I want to be able to identify contiguous blocks of duplicates: that is, I want to produce an order-preserving list of 2-tuples, where each tuple contains (int_in_question, number of occurrences).
For example, if I have a list like:
[0, 0, 0, 3, 3, 2, 5, 2, 6, 6]
I want the result to be:
[(0, 3), (3, 2), (2, 1), (5, 1), (2, 1), (6, 2)]
I have a fairly simple way of doing this with a for-loop, a temp, and a counter:
result_list = []
current = source_list[0]
count = 0
for value in source_list:
    if value == current:
        count += 1
    else:
        result_list.append((current, count))
        current = value
        count = 1
result_list.append((current, count))
But I really like Python's functional programming idioms, and I'd like to be able to do this with a simple generator expression. However, I find it difficult to keep sub-counts when working with generators. I have a feeling a two-step process might get me there, but for now I'm stumped.
Is there a particularly elegant/Pythonic way to do this, especially with generators?
Recommended answer
>>> from itertools import groupby
>>> L = [0, 0, 0, 3, 3, 2, 5, 2, 6, 6]
>>> grouped_L = [(k, sum(1 for i in g)) for k,g in groupby(L)]
>>> # Or (k, len(list(g))), but that creates an intermediate list
>>> grouped_L
[(0, 3), (3, 2), (2, 1), (5, 1), (2, 1), (6, 2)]
Batteries included, as they say.
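Since the question asks about generators specifically, the same idea can be wrapped in a small generator function that yields the pairs lazily instead of building the whole list up front. This is only a minimal sketch: the name run_lengths is made up for illustration and is not part of itertools.

```python
from itertools import groupby


def run_lengths(iterable):
    """Yield (value, count) for each run of consecutive equal items."""
    for key, group in groupby(iterable):
        # `group` is a one-shot iterator over the current run;
        # count its items without building an intermediate list.
        yield key, sum(1 for _ in group)


L = [0, 0, 0, 3, 3, 2, 5, 2, 6, 6]
print(list(run_lengths(L)))
# prints [(0, 3), (3, 2), (2, 1), (5, 1), (2, 1), (6, 2)]
```

Because groupby only groups adjacent equal items, the two separate runs of 2 stay separate, which is what the question wants. An empty input simply yields nothing, whereas the loop in the question would raise an IndexError on source_list[0].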
The suggestion to use sum with a generator expression came from JBernardo; see comment.
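To make the trade-off between the two counting forms concrete, here is a short side-by-side sketch. The variable names data, with_sum, and with_len are illustrative only. Both forms produce the same pairs; the difference is that len(list(g)) materializes each run as a temporary list before counting it.

```python
from itertools import groupby

data = [0, 0, 0, 3, 3, 2, 5, 2, 6, 6]

# Count each run by consuming the group iterator directly.
with_sum = [(k, sum(1 for _ in g)) for k, g in groupby(data)]

# Count each run by first copying it into a temporary list.
with_len = [(k, len(list(g))) for k, g in groupby(data)]

print(with_sum == with_len)
# prints True
```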