Python async and CPU-bound tasks?

Question

I have recently been working on a pet project in Python using Flask. It is a simple pastebin with server-side syntax highlighting provided by Pygments. Because highlighting is costly, I delegated it to a Celery task queue, and the request handler waits for the task to finish. Needless to say, this does no more than shift the CPU usage to another worker, because waiting for the result still ties up the connection to the web server. Despite my instincts telling me to avoid premature optimization like the plague, I couldn't help looking into async.

Async

If you have been following Python web development lately, you have surely seen that async is everywhere. What async does is bring back cooperative multitasking, meaning each "thread" decides when and where to yield to another. This non-preemptive approach is more efficient than OS threads, but it still has its drawbacks. At the moment there seem to be two major approaches:


  • Event/callback-style multitasking
  • Coroutines

The first provides concurrency through loosely coupled components executed in an event loop. Although this is safer with respect to race conditions and provides more consistency, it is considerably less intuitive and harder to code than preemptive multitasking.

The other is a more traditional solution, closer to the threaded programming style: the programmer only has to switch context manually. Although more prone to race conditions and deadlocks, it provides an easy drop-in solution.
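To make the contrast between the two styles concrete, here is a minimal sketch. It uses the standard library's asyncio as a stand-in rather than gevent, and the function names (`fetch_with_callback`, `fetch_with_coroutine`) are hypothetical, chosen only for illustration.

```python
import asyncio


def fetch_with_callback(loop, on_done):
    # Callback style: schedule the work and hand the result to a callback later.
    loop.call_later(0.01, on_done, "callback result")


async def fetch_with_coroutine():
    # Coroutine style: the code reads sequentially; `await` marks the switch point.
    await asyncio.sleep(0.01)
    return "coroutine result"


async def main():
    loop = asyncio.get_running_loop()
    results = []
    done = loop.create_future()

    def on_done(value):
        # Invoked by the event loop once the scheduled work completes.
        results.append(value)
        done.set_result(None)

    fetch_with_callback(loop, on_done)
    await done
    results.append(await fetch_with_coroutine())
    return results


if __name__ == "__main__":
    print(asyncio.run(main()))
```

In the callback version the control flow is split between the caller and `on_done`, while the coroutine version keeps it in one function at the cost of an explicit switch point.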

Most async work at the moment is done on what are known as IO-bound tasks: tasks that block while waiting for input or output. This is usually accomplished with polling and timeout-based functions; when a call reports that nothing is ready yet, the context can be switched.

Despite the name, this could be applied to CPU-bound tasks too: they can be delegated to another worker (thread, process, etc.) and then waited on without blocking, yielding in the meantime. Ideally, these tasks would be written in an async-friendly manner, but realistically this would mean separating the code into chunks small enough not to block, preferably without scattering context switches after every line of code. This is especially inconvenient for existing synchronous libraries.
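As a rough illustration of the "small enough chunks" idea, the sketch below splits a CPU-bound loop into chunks and yields to the event loop between them. It assumes asyncio instead of gevent (where `gevent.sleep(0)` plays a similar role as the yield point), and `sum_of_squares`, `heartbeat`, and the chunk size are made up for the example.

```python
import asyncio


async def sum_of_squares(n, chunk_size=10_000):
    # CPU-bound work done in chunks, yielding once per chunk
    # instead of after every line.
    total = 0
    for start in range(0, n, chunk_size):
        stop = min(start + chunk_size, n)
        total += sum(i * i for i in range(start, stop))
        await asyncio.sleep(0)  # give other tasks a chance to run
    return total


async def heartbeat(ticks):
    # Stands in for other requests that need the event loop in the meantime.
    while True:
        ticks.append(1)
        await asyncio.sleep(0)


async def main():
    ticks = []
    beat = asyncio.create_task(heartbeat(ticks))
    total = await sum_of_squares(100_000)
    beat.cancel()
    # Wait for the cancellation to be processed before returning.
    await asyncio.gather(beat, return_exceptions=True)
    return total, len(ticks)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The chunk size is the trade-off: larger chunks mean fewer switches but longer stalls for everything else on the loop. This only interleaves the work; it does not make it run in parallel.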

Because of its convenience, I settled on gevent for async work, and I was wondering how CPU-bound tasks should be dealt with in an async environment (using futures, Celery, etc.).

How can async execution models (gevent in this case) be used with traditional web frameworks such as Flask? What are some commonly agreed-upon solutions to these problems in Python (futures, task queues)?

EDIT: To be more specific: how do I use gevent with Flask, and how do I deal with CPU-bound tasks in this context?

EDIT2: Considering that Python has the GIL, which prevents optimal execution of threaded code, this leaves only the multiprocessing option, in my case at least. This means using either concurrent.futures or some external service that handles the processing (which could even open the door to something language-agnostic). What would be some popular or often-used solutions with gevent in this case (e.g. Celery)? What are the best practices?
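For reference, the concurrent.futures option could look roughly like the sketch below. It is not gevent-specific: it assumes asyncio as the event loop, and `count_primes` is a hypothetical stand-in for the real CPU-bound job (such as highlighting a paste).

```python
import asyncio
import concurrent.futures


def count_primes(limit):
    # Deliberately naive, CPU-bound stand-in for the expensive work.
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


async def offload(executor, limits):
    # Submit each job to the executor and wait for the results
    # without blocking the event loop.
    loop = asyncio.get_running_loop()
    futures = [loop.run_in_executor(executor, count_primes, limit) for limit in limits]
    return await asyncio.gather(*futures)


def main():
    # Separate processes sidestep the GIL for CPU-bound work.
    with concurrent.futures.ProcessPoolExecutor(max_workers=2) as executor:
        print(asyncio.run(offload(executor, [10_000, 20_000])))


if __name__ == "__main__":
    main()
```

Because `offload` accepts any executor, the same code can be tried with a `ThreadPoolExecutor`, although threads will not run pure-Python CPU-bound work in parallel because of the GIL.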

Recommended answer

It should be thread-safe to do something like the following to move CPU-intensive tasks into asynchronous threads:

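from threading import Thread

from flask_mail import Message  # assumes the Flask-Mail extension is in use

# `mail` is assumed to be a flask_mail.Mail instance created elsewhere in the app.

def send_async_email(msg):
    # Runs in a background thread so the request handler is not blocked.
    mail.send(msg)

def send_email(subject, sender, recipients, text_body, html_body):
    msg = Message(subject, sender=sender, recipients=recipients)
    msg.body = text_body
    msg.html = html_body
    # Start the send in a separate thread and return immediately.
    thr = Thread(target=send_async_email, args=[msg])
    thr.start()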

If you need something more complicated, then perhaps Flask-Celery or the multiprocessing library with "Pool" might be useful to you.
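A minimal sketch of the "Pool" option, assuming the work can be expressed as a plain function applied to each input. `render_paste` is a hypothetical placeholder for the real highlighting call, not a Pygments API.

```python
import html
from multiprocessing import Pool


def render_paste(source):
    # Placeholder for the expensive step: escape the text and number the lines.
    lines = html.escape(source).splitlines()
    return "\n".join(f"{i}: {line}" for i, line in enumerate(lines, start=1))


def render_many(sources, processes=2):
    # Each input is handled in a worker process, so the GIL of the
    # calling process does not serialize the work.
    with Pool(processes=processes) as pool:
        return pool.map(render_paste, sources)


if __name__ == "__main__":
    print(render_many(["a < b", "x = 1\ny = 2"]))
```

Note that `pool.map` blocks the caller until all results are ready, so in a web handler it would still need to be combined with a non-blocking wait or a task queue.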

I'm not too familiar with gevent, though I can't imagine what more complexity you might need or why.

I mean, if you're aiming for the efficiency of a major worldwide website, then I'd recommend building C++ applications to do your CPU-intensive work, and then using Flask-Celery or Pool to run that process. (This is what YouTube does when mixing C++ and Python.)
