Consuming two iterators in parallel

Question

Suppose I have two iterators, and I want to compute

fancyoperation1(iter1), fancyoperation2(iter2)

Normally, I would simply use fancyoperation1(iter1), fancyoperation2(iter2). However, if these iterators are linked to a single source, perhaps teed from a single iterator, I can't do this without keeping a lot of temporary data in memory. In that case, I know of several options:

  • I could rewrite fancyoperation1 and fancyoperation2 into a single function that does both at the same time, but that may be a lot of code duplication, and I may not understand or have the source code for either function. Also, this would need to be done anew for every pair of operations.
  • I could use threading. The synchronization can probably be written once in a helper function, and the overhead probably wouldn't be too bad as long as I don't need to switch threads too often.
  • I could keep a lot of temporary data in memory.

I don't really like the drawbacks of those options, though. Is there a way to do what I want in one thread, without rewriting things or using large amounts of memory? I tried to do it with coroutines, but Python's yield doesn't seem to be powerful enough.

(I do not currently have this problem, but I'm wondering what to do if it ever comes up.)

Recommended answer

You absolutely can use coroutines for this; it is just slightly less convenient. On the bright side, you can keep the two operations separate and leave most of the code unaltered. Change the fancy operations to take no parameters and repeatedly use yield (as an expression) to receive data, instead of accepting an iterator as a parameter and looping over it. In other words, change this:

def fancyoperation1(it):
    for x in it:
        ...
    cleanup()

# into something like this

def fancyoperation1():
    while True:
        try:
            x = yield
        except GeneratorExit:
            break
        ...
    cleanup()

Of course, it is easier if there is no post-iteration cleanup to do. You can use these like this (assuming iter1, iter2 = tee(underlying_iterator); the coroutines are fed from underlying_iterator directly, so the tee itself is no longer needed):

f1, f2 = fancyoperation1(), fancyoperation2()
f1.send(None) # start coroutines
f2.send(None)

for x in underlying_iterator:
    f1.send(x)
    f2.send(x)
f1.close()
f2.close()
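As a concrete illustration of the pattern above, here is a minimal runnable sketch. The names running_sum, running_max, and consume_in_parallel are hypothetical and not part of the original answer. Because close() does not hand back a result on most Python versions, each coroutine here is assumed to report its result by appending to a list it is given.

```python
def running_sum(result):
    """Coroutine: sums everything sent to it, reports the total on close()."""
    total = 0
    while True:
        try:
            x = yield
        except GeneratorExit:
            break
        total += x
    result.append(total)  # the "cleanup" step: publish the final value


def running_max(result):
    """Coroutine: tracks the largest value sent to it."""
    best = None
    while True:
        try:
            x = yield
        except GeneratorExit:
            break
        if best is None or x > best:
            best = x
    result.append(best)


def consume_in_parallel(source, *factories):
    """Drive several coroutines from one iterator in a single pass."""
    outputs = [[] for _ in factories]
    coroutines = [factory(out) for factory, out in zip(factories, outputs)]
    for co in coroutines:
        next(co)  # advance to the first yield, same as send(None)
    for x in source:
        for co in coroutines:
            co.send(x)
    for co in coroutines:
        co.close()  # raises GeneratorExit inside the coroutine
    return tuple(out[0] for out in outputs)


total, largest = consume_in_parallel(iter(range(10)), running_sum, running_max)
print(total, largest)  # prints "45 9"
```

Each item is handed to both coroutines as soon as it is read from the source, so nothing is buffered beyond the current item and everything runs in one thread.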
