Python loop in a coroutine


Question

I've read all the documentation on the subject, but it seems I can't grasp the whole concept of Python coroutines well enough to implement what I want to do.

I have a background task (which generates some random files, but that doesn't matter much), and it does this in an infinite loop (this is a watcher).

I would like to implement this background task in the most efficient way possible, and I thought that microthreads (aka coroutines) were a good way to achieve that, but I can't get it to work at all (either the background task runs or the rest of the program runs, but not both at the same time!).

Could someone give me a simple example of a background task implemented using coroutines? Or am I being mistaken in thinking that coroutines could be used for that purpose?

I am using Python 2.7 native coroutines.

I am well versed in concurrency, particularly with DBMSes and Ada, so I know a lot about the underlying principles, but I'm not used to the generator-as-coroutine concept, which is very new to me.

Here is a sample of my code, which, I must emphasize again, is not working:

@coroutine
def someroutine():
    with open('test.txt', 'a') as f:
        f.write('A')
    while True:
        pass
    yield 0

@coroutine
def spawnCoroutine():
    result = yield someroutine()

    yield result

routine = spawnCoroutine()
print 'I am working in parallel!'

# Saves 'A' in the file test.txt, but never prints 'I am working in parallel!'

Note: @coroutine is a decorator from coroutine.py provided by David Beazley
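For context, that decorator simply creates the generator and advances it to its first yield so that it is ready to receive send() calls. A minimal version (a sketch of the idea, not a verbatim copy of coroutine.py) looks like this:

def coroutine(func):
    # Decorator: instantiate the generator and advance it to its first yield.
    def start(*args, **kwargs):
        cr = func(*args, **kwargs)
        cr.next()   # prime the coroutine (use next(cr) on Python 3)
        return cr
    return start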

Final edit and solution recap

OK, my question was closed because it was seemingly ambiguous, which, as a matter of fact, was the very purpose of my question: to clarify the usage of coroutines over threading and multiprocessing.

Luckily, a nice answer was submitted before the dreadful sanction occurred!

To emphasize the answer to the above question: no, neither Python's coroutines nor bluelet/greenlet can be used to run an independent, potentially infinite, CPU-bound task, because there is no parallelism with coroutines.

This is what confused me the most. Indeed, parallelism is a subset of concurrency, and thus it is rather confusing that the current implementation of coroutines in Python allows for concurrent tasks but not for parallel tasks! This behaviour is to be clearly differentiated from the Tasks concept of concurrent programming languages such as Ada.

Also, Python's threads are similar to coroutines in that they generally switch context when waiting for I/O, and thus are also not a good candidate for independent CPU-bound tasks (see David Beazley's Understanding the GIL).

The solution I'm currently using is to spawn subprocesses with the multiprocessing module. Spawning background processes is heavyweight, but it's better than running nothing at all. It also has the advantage of allowing the computation to be distributed.
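To illustrate, here is a minimal sketch of that approach (the watcher body and the file name are placeholders for my real file-generating loop):

import multiprocessing
import time

def watcher():
    # placeholder for the real "generate random files" loop
    while True:
        with open('test.txt', 'a') as f:
            f.write('A')
        time.sleep(1)

if __name__ == '__main__':
    worker = multiprocessing.Process(target=watcher)
    worker.daemon = True   # the watcher dies when the main program exits
    worker.start()
    print 'I am working in parallel!'   # and this really does run in parallel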

As an alternative, on Google App Engine there are the deferred module and the background_thread module, which can offer interesting alternatives to multiprocessing (for example by using one of the libraries that implement the Google App Engine API, such as typhoonae, although I'm not sure they have implemented these modules yet).

Answer

If you look at the (trivial) coroutine.py library you're using, it includes an example that shows how grep works "in the background". There are two differences between your code and the example:

  1. grep repeatedly yields while doing its work—in fact, it yields once per line. You have to do this, or nobody but your coroutine gets a chance to run until it's finished.

  2. The main code repeatedly calls send on the grep coroutine, again once per line. You have to do this, or your coroutines never get called.

This is about as trivial a case as possible—a single coroutine, and a trivial dispatcher that just unconditionally drives that one coroutine.
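For reference, that grep example looks roughly like this (paraphrased from Beazley's coroutine material, so details may differ slightly):

@coroutine
def grep(pattern):
    print 'Looking for %s' % pattern
    while True:
        line = (yield)          # the driver pushes each line in with send()
        if pattern in line:
            print line,

g = grep('python')
g.send('yeah, but no, but yeah, but no')
g.send('A series of tubes')
g.send('python generators rock!')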

Here's how you could translate your example into something that works:

@coroutine
def someroutine():
    with open('test.txt', 'a') as f:
        yield
        f.write('A')
    while True:
        yield
    yield 0

routine = someroutine()
print 'I am working in parallel!'
routine.send(None)   # send() requires an argument; None just resumes the coroutine
print 'But only cooperatively...'
routine.send(None)

And so on.

But normally you don't want to do this. In the case of the grep example, the coroutine and the main driver are explicitly cooperating as a consumer and producer, so the direct coupling makes perfect sense. You, by contrast, just have some completely independent tasks that you want to schedule independently.

To do that, don't try to build threading yourself. If you want cooperative threading, use an off-the-shelf dispatcher/scheduler, and the only change you have to make to all of your tasks is to put in yield calls often enough to share time effectively.
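As a rough illustration (not part of coroutine.py, just a sketch of the idea), a cooperative round-robin dispatcher can be as simple as a queue of generators that are resumed in turn; each task only has to yield often enough:

from collections import deque

def task(name, steps):
    for i in range(steps):
        print '%s: step %d' % (name, i)
        yield                      # give the other tasks a chance to run

def run(tasks):
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            current.send(None)     # resume the task until its next yield
        except StopIteration:
            continue               # the task finished; drop it
        ready.append(current)      # otherwise reschedule it

run([task('watcher', 3), task('main work', 3)])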

If you don't even care about the threading being cooperative, just use threading or multiprocessing, and you don't even need the yields:

import threading

def someroutine():
    with open('test.txt', 'a') as f:
        f.write('A')
    while True:
        pass
    return 0

routine = threading.Thread(target=someroutine)
routine.start()
print 'I am working in parallel!'

PS, as I said in one of the comments, if you haven't worked through http://www.dabeaz.com/coroutines/index.html or an equivalent, you really should do that, and come back with any questions you find along the way, instead of writing code that you don't understand and asking why it doesn't work. I'm willing to bet that if you make it to part 4 (probably even earlier), you'll see why your initial question was silly.

