How not to miss the next element after itertools.takewhile()

Question

Say we wish to process an iterator and want to handle it by chunks.
The logic per chunk depends on previously-calculated chunks, so groupby() does not help.

Our friend in this case is itertools.takewhile():

import itertools

# getNewChunkLogic(), myIterator and process() are placeholders.
while True:
    chunk = itertools.takewhile(getNewChunkLogic(), myIterator)
    process(chunk)

The problem is that takewhile() needs to go past the last element that meets the new chunk logic, thus 'eating' the first element for the next chunk.

There are various solutions to that, including wrapping the iterator or an approach à la C's ungetc(), etc.
My question is: is there an elegant solution?

Recommended answer

takewhile() indeed needs to look at the next element to determine when to toggle behaviour.
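A quick way to see the lost element, as a minimal sketch (the names first and rest are just for illustration):

```python
from itertools import takewhile

it = iter(range(10))

# takewhile() pulls 5 from the iterator, sees that the predicate fails,
# and stops. The 5 has already been consumed and is not handed back.
first = list(takewhile(lambda i: i < 5, it))
rest = list(it)

print(first)  # [0, 1, 2, 3, 4]
print(rest)   # [6, 7, 8, 9]
```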

You could use a wrapper that tracks the last seen element, and that can be 'reset' to back up one element:

_sentinel = object()

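class OneStepBuffered(object):
    """Iterator wrapper that can back up by exactly one element."""
    def __init__(self, it):
        self._it = iter(it)
        self._last = _sentinel  # most recently returned element
        self._next = _sentinel  # element queued for replay by step_back()
    def __iter__(self):
        return self
    def __next__(self):
        if self._next is not _sentinel:
            # Replay the backed-up element; record it as last seen so that
            # step_back() also works if it is rejected again right away.
            self._last, self._next = self._next, _sentinel
            return self._last
        try:
            self._last = next(self._it)
            return self._last
        except StopIteration:
            # Exhausted: there is nothing left to back up to.
            self._last = self._next = _sentinel
            raise
    next = __next__  # Python 2 compatibility
    def step_back(self):
        if self._last is _sentinel:
            raise ValueError("Can't back up a step")
        self._next, self._last = self._last, _sentinel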

Wrap your iterator in this one before using it with takewhile():

myIterator = OneStepBuffered(myIterator)
while True:
    chunk = itertools.takewhile(getNewChunkLogic(), myIterator)
    process(chunk)
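    # Restore the element takewhile() consumed but rejected.
    # Raises ValueError once myIterator is exhausted.
    myIterator.step_back()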

Demo:

>>> from itertools import takewhile
>>> test_list = range(10)
>>> iterator = OneStepBuffered(test_list)
>>> list(takewhile(lambda i: i < 5, iterator))
[0, 1, 2, 3, 4]
>>> iterator.step_back()
>>> list(iterator)
[5, 6, 7, 8, 9]
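To show how the chunking loop can end cleanly, here is a self-contained sketch. The driver split_by_limits() and its limits argument are hypothetical stand-ins for getNewChunkLogic() and process(); they are not part of the original answer. The copy of the wrapper is condensed for Python 3 and also records a replayed element in _last, so that step_back() still works after an empty chunk.

```python
from itertools import takewhile

_sentinel = object()

class OneStepBuffered:
    """Condensed Python 3 version of the one-step-back wrapper."""
    def __init__(self, it):
        self._it = iter(it)
        self._last = _sentinel   # most recently returned element
        self._next = _sentinel   # element queued for replay by step_back()
    def __iter__(self):
        return self
    def __next__(self):
        if self._next is not _sentinel:
            # Replay the backed-up element and remember it as last seen.
            self._last, self._next = self._next, _sentinel
            return self._last
        try:
            self._last = next(self._it)
            return self._last
        except StopIteration:
            self._last = self._next = _sentinel
            raise
    def step_back(self):
        if self._last is _sentinel:
            raise ValueError("Can't back up a step")
        self._next, self._last = self._last, _sentinel

def split_by_limits(iterable, limits):
    """Hypothetical driver: one chunk per limit, holding values < limit."""
    buffered = OneStepBuffered(iterable)
    chunks = []
    for limit in limits:
        # list() consumes the chunk fully before step_back() is called.
        chunks.append(list(takewhile(lambda x: x < limit, buffered)))
        try:
            buffered.step_back()   # restore the element takewhile() rejected
        except ValueError:
            break                  # source exhausted, nothing to restore
    return chunks

print(split_by_limits(range(10), [3, 7, 100]))
# [[0, 1, 2], [3, 4, 5, 6], [7, 8, 9]]
```

Note that each chunk has to be consumed completely before step_back() is called, since takewhile() is lazy and only reads the rejected element when the chunk is iterated to its end.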
