How to make SQLAlchemy in Tornado async?


Question

How can I make SQLAlchemy in Tornado async? I found an example for MongoDB (the async mongo example), but I couldn't find anything like motor for SQLAlchemy. Does anyone know how to make SQLAlchemy queries execute with tornado.gen? I am using MySQL underneath SQLAlchemy. At the moment my handlers read from the database and return the result, and I would like to make this async.

Recommended answer

ORMs are poorly suited to explicit asynchronous programming, that is, where the programmer must produce explicit callbacks any time something that uses network access occurs. A primary reason is that ORMs make extensive use of the lazy loading pattern, which is more or less incompatible with explicit async. Code that looks like this:

user = Session.query(User).first()  # first query: loads one User row
print(user.addresses)               # second query: lazy-loads the addresses collection

will actually emit two separate queries: one when you call first() to load a row, and the next when you access user.addresses, in the case that the .addresses collection isn't already present or has been expired. Essentially, nearly every line of code that deals with ORM constructs might block on IO, so you'd be in extensive callback spaghetti within seconds. To make matters worse, the vast majority of those lines won't actually block on IO, so all the overhead of connecting callbacks together for what would otherwise be simple attribute access will make your program vastly less efficient too.
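The lazy loading behavior described above can be reproduced with a small sketch that uses only the standard library. The schema, the `run` helper, and the toy `User` class are hypothetical and written for illustration; this is not how SQLAlchemy is implemented, only a minimal imitation of the pattern in which a plain attribute access may or may not trigger a query.

```python
import sqlite3

# Hypothetical toy schema and classes, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE addresses (id INTEGER PRIMARY KEY, user_id INTEGER, email TEXT);
    INSERT INTO users VALUES (1, 'ed');
    INSERT INTO addresses VALUES (1, 1, 'ed@example.com');
    INSERT INTO addresses VALUES (2, 1, 'ed@example.org');
""")

queries = []  # every SQL statement issued, in order


def run(sql, params=()):
    """Execute a statement and record it so each round trip is visible."""
    queries.append(sql)
    return conn.execute(sql, params).fetchall()


class User:
    def __init__(self, id, name):
        self.id = id
        self.name = name
        self._addresses = None  # collection not loaded yet

    @property
    def addresses(self):
        # Looks like plain attribute access, but the first use hits the database.
        if self._addresses is None:
            rows = run(
                "SELECT email FROM addresses WHERE user_id = ? ORDER BY id",
                (self.id,),
            )
            self._addresses = [email for (email,) in rows]
        return self._addresses


def first_user():
    rows = run("SELECT id, name FROM users ORDER BY id LIMIT 1")
    return User(*rows[0])


user = first_user()            # query 1
emails = user.addresses        # query 2, triggered by attribute access
emails_again = user.addresses  # no query: the collection is already loaded
```

In an explicit async style, both accesses to `user.addresses` would need a callback, even though only the first one ever waits on IO.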

A major issue with explicit asynchronous models is that they add tremendous Python function call overhead to complex systems, not just on the user-facing side as with lazy loading, but on the internal side as well, in how the system provides abstraction around the Python database API (DBAPI). For SQLAlchemy to have even basic async support would impose a severe performance penalty on the vast majority of programs that don't use async patterns, and even on those async programs that are not highly concurrent. Consider that SQLAlchemy, or any other ORM or abstraction layer, might have code like the following:

def execute(connection, statement):
    cursor = connection.cursor()
    cursor.execute(statement)
    results = cursor.fetchall()
    cursor.close()
    return results

The above code performs what seems to be a simple operation: executing a SQL statement on a connection. But with a fully async DBAPI like psycopg2's async extension, the above code blocks on IO at least three times. So writing it in explicit async style, even when there's no async engine in use and the callbacks aren't actually blocking, means the outer function call becomes at least three function calls instead of one, not counting the overhead imposed by the explicit asynchronous system or the DBAPI calls themselves. A simple application is therefore automatically given a penalty of 3x the function call overhead surrounding a simple abstraction around statement execution. And in Python, function call overhead is everything.
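To make the call-count argument concrete, here is a rough sketch of the same function in explicit callback style. The `defer` helper is a hypothetical stand-in for an async driver call and runs its operation immediately, so there is no event loop and nothing actually waits. The choice of which three steps are treated as potentially blocking (obtaining the cursor, executing, fetching) is also an assumption made for illustration.

```python
import sqlite3

callback_calls = []  # names of the callbacks that ran, in order


def execute_sync(connection, statement):
    """The original shape: a single function call."""
    cursor = connection.cursor()
    cursor.execute(statement)
    results = cursor.fetchall()
    cursor.close()
    return results


def defer(operation, callback):
    """Hypothetical stand-in for an async driver call.

    A real driver would return to the event loop and invoke the callback
    later. Here the operation runs immediately, so only the extra Python
    function calls remain.
    """
    callback(operation())


def execute_callbacks(connection, statement, done):
    """The same work in explicit callback style: one callback per step."""

    def on_cursor(cursor):
        callback_calls.append("on_cursor")
        defer(lambda: cursor.execute(statement), lambda _: on_executed(cursor))

    def on_executed(cursor):
        callback_calls.append("on_executed")
        defer(cursor.fetchall, lambda rows: on_fetched(cursor, rows))

    def on_fetched(cursor, rows):
        callback_calls.append("on_fetched")
        cursor.close()
        done(rows)

    defer(connection.cursor, on_cursor)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.execute("INSERT INTO t VALUES (1)")
conn.execute("INSERT INTO t VALUES (2)")

sync_rows = execute_sync(conn, "SELECT x FROM t ORDER BY x")

async_rows = []
execute_callbacks(conn, "SELECT x FROM t ORDER BY x", async_rows.extend)
```

Both versions return the same rows, but the callback version runs three named callbacks plus the `defer` and lambda calls that connect them, even though nothing here ever waits on IO.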

For these reasons, I continue to be less than excited about the hype surrounding explicit async systems, at least to the degree that some folks seem to want to go all async for everything, like delivering web pages (see node.js). I'd recommend using implicit async systems instead, most notably gevent, where you get all the non-blocking IO benefits of an asynchronous model and none of the structural verbosity and downsides of explicit callbacks. I continue to try to understand the use cases for these two approaches, so I'm puzzled by the appeal of explicit async as a solution to all problems, as you see with node.js. We're using scripting languages in the first place to cut down on verbosity and code complexity, and explicit async for simple things like delivering web pages seems to do nothing but add boilerplate that can just as well be automated by gevent or similar, if blocking IO is even a problem in a case like that (plenty of high-volume websites do fine with a synchronous IO model). Gevent-based systems are production proven and their popularity is growing, so if you like the code automation that ORMs provide, you might also want to embrace the async-IO-scheduling automation that a system like gevent provides.

Update: Nick Coghlan pointed out his great article on the subject of explicit vs. implicit async, which is also a must-read here. I've also been informed that PEP 3156 now welcomes interoperability with gevent, reversing its previously stated disinterest in gevent, largely thanks to Nick's article. So in the future I would recommend a hybrid of Tornado using gevent for the database logic, once a system for integrating these approaches is available.
