Split a generator/iterable every n items in Python (splitEvery)


Problem description

I'm trying to write the Haskell function 'splitEvery' in Python. Here is its definition:

splitEvery :: Int -> [e] -> [[e]]
    @'splitEvery' n@ splits a list into length-n pieces.  The last
    piece will be shorter if @n@ does not evenly divide the length of
    the list.

The basic version of this works fine, but I want a version that works with generator expressions, lists, and iterators. And, if there is a generator as an input it should return a generator as an output!

Tests

import itertools

# should not enter infinite loop with generators or lists
splitEvery(10, itertools.count())
splitEvery(10, range(1000))

# last piece must be shorter if n does not evenly divide
assert splitEvery(5, range(9)) == [[0, 1, 2, 3, 4], [5, 6, 7, 8]]

# should give same correct results with generators
tmp = itertools.islice(itertools.count(), 9)
assert list(splitEvery(5, tmp)) == [[0, 1, 2, 3, 4], [5, 6, 7, 8]]

Current Implementation

Here is the code I currently have but it doesn't work with a simple list.

def splitEvery_1(n, iterable):
    res = list(itertools.islice(iterable, n))
    while len(res) != 0:
        yield res
        res = list(itertools.islice(iterable, n))
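
The failure with a list most likely comes from the fact that a list is an iterable but not an iterator: every `islice` call on it starts again at the first item, so `res` is never empty and the loop never ends. A minimal sketch of the difference (the names `data` and `it` are only for illustration):

```python
from itertools import islice

data = [0, 1, 2, 3, 4]

# islice() on a list builds a fresh iterator each time, so it always restarts.
first = list(islice(data, 2))
second = list(islice(data, 2))

# Calling iter() once gives a single iterator that remembers its position.
it = iter(data)
chunk_a = list(islice(it, 2))
chunk_b = list(islice(it, 2))
chunk_c = list(islice(it, 2))
chunk_d = list(islice(it, 2))  # iterator is exhausted here

print(first, second)                       # [0, 1] [0, 1]
print(chunk_a, chunk_b, chunk_c, chunk_d)  # [0, 1] [2, 3] [4] []
```

Generators and other iterators already behave like `it`, which is why the same function works for them.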

This one doesn't work with a generator expression (thanks to jellybean for fixing it):

def splitEvery_2(n, iterable): 
    return [iterable[i:i+n] for i in range(0, len(iterable), n)]

There has to be a simple piece of code that does the splitting. I know I could just have different functions, but it seems like it should be an easy thing to do. I'm probably getting stuck on an unimportant problem, but it's really bugging me.


It is similar to grouper from http://docs.python.org/library/itertools.html#itertools.groupby, but I don't want the last group padded with fill values.

from itertools import izip_longest  # Python 2 name; it is zip_longest in Python 3

def grouper(n, iterable, fillvalue=None):
    "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
    args = [iter(iterable)] * n
    return izip_longest(fillvalue=fillvalue, *args)

The documentation also mentions an idiom that truncates the last, incomplete group. This isn't what I want either.

The left-to-right evaluation order of the iterables is guaranteed. This makes possible an idiom for clustering a data series into n-length groups using izip(*[iter(s)]*n).

list(izip(*[iter(range(9))]*5)) == [(0, 1, 2, 3, 4)]
# what I want instead: [[0, 1, 2, 3, 4], [5, 6, 7, 8]]
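
For comparison, one way to keep the grouper idiom without visible padding is to pad with a private sentinel object and strip it afterwards. This is only a sketch under assumptions: it is written for Python 3 (so it uses `zip_longest` rather than `izip_longest`), and `grouper_no_fill` is a hypothetical name, not something from the itertools documentation.

```python
from itertools import zip_longest  # Python 3 name for izip_longest

def grouper_no_fill(n, iterable):
    """Hypothetical helper: like grouper, but the last group is left short."""
    sentinel = object()  # unique marker that cannot collide with real data
    args = [iter(iterable)] * n
    for group in zip_longest(*args, fillvalue=sentinel):
        # Drop the padding markers; only the final group can contain them.
        yield [item for item in group if item is not sentinel]

print(list(grouper_no_fill(5, range(9))))  # [[0, 1, 2, 3, 4], [5, 6, 7, 8]]
```

It stays lazy, but it does extra work filtering every group, so a plain `islice` loop is probably simpler.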

Solution

from itertools import islice

def split_every(n, iterable):
    i = iter(iterable)
    piece = list(islice(i, n))
    while piece:
        yield piece
        piece = list(islice(i, n))

Some tests:

>>> list(split_every(5, range(9)))
[[0, 1, 2, 3, 4], [5, 6, 7, 8]]

>>> list(split_every(3, (x**2 for x in range(20))))
[[0, 1, 4], [9, 16, 25], [36, 49, 64], [81, 100, 121], [144, 169, 196], [225, 256, 289], [324, 361]]

>>> [''.join(s) for s in split_every(6, 'Hello world')]
['Hello ', 'world']

>>> list(split_every(100, []))
[]
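
Because `split_every` is a generator, it should also cope with the infinite input from the question's tests, as long as the caller only consumes a finite number of chunks. A small sketch of that check, with the function repeated so the example runs on its own (`first_three` is just an illustrative name):

```python
from itertools import count, islice

def split_every(n, iterable):
    # Same generator as in the answer, repeated to keep this example self-contained.
    i = iter(iterable)
    piece = list(islice(i, n))
    while piece:
        yield piece
        piece = list(islice(i, n))

# Take only the first three chunks of an infinite stream; nothing else is computed.
first_three = list(islice(split_every(10, count()), 3))

print(first_three[0])      # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(first_three[2][-1])  # 29
```

Note that it always returns a generator, even for list input, so wrap the call in `list()` when a list is needed.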
