Pause Python Generator


Problem Description


I have a Python generator that produces a large amount of data, which uses up a lot of RAM. Is there a way to detect whether the processed data has been "consumed" by the code that is using the generator and, if so, pause until it is consumed?

# Assumes `web`, `grab`, and a greenlet `pool` module (e.g. gevent.pool)
# from the asker's scraping setup.
from functools import partial

def multi_grab(urls, proxy=None, ref=None, xpath=False, compress=True,
               delay=10, pool_size=50, retries=1, http_obj=None):
    if proxy is not None:
        proxy = web.ProxyManager(proxy, delay=delay)
        # Was `pool_size.records` in the original; an int has no `.records`,
        # so this presumably meant the proxy manager's record list.
        pool_size = len(proxy.records)
    work_pool = pool.Pool(pool_size)
    partial_grab = partial(grab, proxy=proxy, post=None, ref=ref, xpath=xpath,
                           compress=compress, include_url=True,
                           retries=retries, http_obj=http_obj)
    for result in work_pool.imap_unordered(partial_grab, urls):
        if result:
            yield result

Run with:

if __name__ == '__main__':
    links = set(link for link in grab('http://www.reddit.com', xpath=True).xpath('//a/@href')
                if link.startswith('http') and 'reddit' not in link)
    print('%s links' % len(links))
    counter = 1
    for url, data in multi_grab(links, pool_size=10):
        print('got', url, counter, len(data))
        counter += 1

Recommended Answer


A generator simply yields values. There's no way for the generator to know what's being done with them.


But the generator also pauses constantly while the caller does whatever it does. It doesn't execute again until the caller asks it for the next value; it doesn't run on a separate thread or anything. It sounds like you have a misconception about how generators work. Can you show some code?
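A minimal standalone sketch (not from the asker's code) that shows this suspend/resume behavior, using an event list to record the exact order in which the producer and consumer run:

```python
# A generator body runs only up to each `yield`, then suspends until the
# caller asks for the next value -- it never races ahead of the consumer.
events = []

def produce():
    for i in range(3):
        events.append(f"produce {i}")
        yield i

for value in produce():
    events.append(f"consume {value}")

print(events)
```

The recorded events strictly alternate, `produce 0`, `consume 0`, `produce 1`, `consume 1`, ...: each item is fully consumed before the generator resumes to produce the next one, so a plain generator already gives the "pause until consumed" behavior the question asks for.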

