How to iterate over a large list without blocking the event loop

Question

I have a Python script with a running asyncio event loop. I want to know how to iterate over a large list without blocking the event loop, so that the loop keeps running.

I tried writing a custom class with __aiter__ and __anext__, which did not work. I also tried writing an async function that yields the results, but it still blocks.

Current code:

for index, item in enumerate(list_with_thousands_of_items):
    pass  # do something

The custom class I tried:

class Aiter:
    def __init__(self, iterable):
        self.iter_ = iter(iterable)

    async def __aiter__(self):
        return self

    async def __anext__(self):
        try:
            object = next(self.iter_)
        except StopIteration:
            raise StopAsyncIteration
        return object

But this always results in:

TypeError: 'async for' received an object from __aiter__ that does not implement __anext__: coroutine
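For reference, this error comes from declaring __aiter__ with async def: calling it returns a coroutine rather than the iterator itself. Below is a minimal sketch of a corrected class, assuming Python 3.7 or later. The names FixedAiter and collect are hypothetical. Note that this version still does not hand control to the event loop, because __anext__ never awaits anything, so it fixes the TypeError but not the blocking.

```python
import asyncio


class FixedAiter:
    """Async-iterator wrapper around a regular iterable (hypothetical name)."""

    def __init__(self, iterable):
        self.iter_ = iter(iterable)

    def __aiter__(self):
        # Plain method: 'async for' expects the iterator itself, not a coroutine.
        return self

    async def __anext__(self):
        try:
            return next(self.iter_)
        except StopIteration:
            # Translate the synchronous end-of-iteration signal.
            raise StopAsyncIteration from None


async def collect(iterable):
    # Drain the wrapper with 'async for' and return the items as a list.
    return [item async for item in FixedAiter(iterable)]


if __name__ == "__main__":
    print(asyncio.run(collect(range(5))))
```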

The async function I wrote, which works but still blocks the event loop:

async def async_enumerate(iterable, start:int=0):
    for idx, i in enumerate(iterable, start):
        yield idx, i

Recommended answer

As @deceze pointed out, you can use await asyncio.sleep(0) to explicitly pass control to the event loop. There are problems with this approach, though.

Presumably the list is large, which is why you need special measures to keep the event loop unblocked. But if the list is that large, forcing every iteration to yield to the event loop will slow the iteration down considerably. You can alleviate that by adding a counter and only awaiting when i % 10 == 0, or when i % 100 == 0, and so on. But then you have to guess how often to give up control. If you yield too often, you slow down your function. If you yield too seldom, you make the event loop unresponsive.
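The counter-based variant described above could be sketched roughly as follows. This is an illustration, not a recommendation: the yield_every parameter and the demo and ticker helpers are hypothetical names, and the value 100 is exactly the kind of guess the paragraph warns about.

```python
import asyncio


async def async_enumerate(iterable, start=0, yield_every=100):
    """Like enumerate(), but hands control to the event loop periodically.

    yield_every is a hypothetical tuning knob, not part of any library.
    """
    for count, (idx, item) in enumerate(enumerate(iterable, start), 1):
        yield idx, item
        if count % yield_every == 0:
            # sleep(0) suspends this coroutine so other tasks can run.
            await asyncio.sleep(0)


async def demo(n=1000, yield_every=100):
    ticks = 0

    async def ticker():
        # Background task that counts how often the loop gets to run it.
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0)

    task = asyncio.create_task(ticker())
    total = 0
    async for idx, item in async_enumerate(range(n), yield_every=yield_every):
        total += item
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return total, ticks


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

A larger yield_every means less overhead but longer stretches in which no other task can run.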

This can be avoided by using run_in_executor, as suggested by RafaëlDera. run_in_executor accepts a blocking function and offloads its execution to a thread pool. It immediately returns a future that can be awaited in asyncio and whose result, once available, will be the return value of the blocking function. (If the blocking function raises, the exception is propagated instead.) Awaiting that future suspends the coroutine until the function returns or raises in its thread, so the event loop remains fully functional in the meantime. Since the blocking function and the event loop run in separate threads, the function doesn't need to do anything to allow the event loop to run: they operate independently. Even the GIL is not a problem here, because the interpreter regularly passes the GIL from one thread to another, so the event loop thread keeps getting its turn.

With run_in_executor, your code could look like this:

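import asyncio

def process_the_list():
    for index, item in enumerate(list_with_thousands_of_items):
        pass  # do something

async def main():
    loop = asyncio.get_running_loop()
    # None selects the default thread pool; the event loop stays free meanwhile.
    await loop.run_in_executor(None, process_the_list)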
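To see the effect end to end, here is a self-contained sketch that runs the blocking loop in the default executor while a small heartbeat task keeps ticking on the event loop. The heartbeat helper, the time.sleep(0.2) call that simulates slow work, and the arithmetic standing in for "do something" are all assumptions made for illustration.

```python
import asyncio
import time


def process_the_list(items):
    """Blocking, synchronous work; runs in a worker thread."""
    total = 0
    for index, item in enumerate(items):
        total += index * item  # stand-in for "do something"
    time.sleep(0.2)  # simulated slow, blocking step
    return total


async def heartbeat(beats):
    # Records a timestamp every 10 ms whenever the loop is free to run it.
    while True:
        beats.append(time.monotonic())
        await asyncio.sleep(0.01)


async def main():
    items = list(range(10_000))
    beats = []
    hb = asyncio.create_task(heartbeat(beats))
    loop = asyncio.get_running_loop()
    # Extra positional arguments are passed through to the blocking function.
    result = await loop.run_in_executor(None, process_the_list, items)
    hb.cancel()
    return result, len(beats)


if __name__ == "__main__":
    result, beat_count = asyncio.run(main())
    print(result, beat_count)
```

If the heartbeat count stays above one while the worker thread is busy, the event loop was not blocked. The return value of process_the_list arrives as the result of the awaited future, as described above.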
