为什么SQLAlchemy count()比原始查询慢得多? [英] Why is SQLAlchemy count() much slower than the raw query?

查看:462
本文介绍了为什么SQLAlchemy count()比原始查询慢得多?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在将SQLAlchemy与MySQL数据库配合使用,我想对表中的行进行计数(大约300k). SQLAlchemy count 函数所花费的时间大约是其50倍直接在MySQL中编写相同的查询即可运行.我在做错什么吗?

I'm using SQLAlchemy with a MySQL database and I'd like to count the rows in a table (roughly 300k). The SQLAlchemy count function takes about 50 times as long to run as writing the same query directly in MySQL. Am I doing something wrong?

# this takes over 3 seconds to return
session.query(Segment).count()

但是:

SELECT COUNT(*) FROM segments;
+----------+
| COUNT(*) |
+----------+
|   281992 |
+----------+
1 row in set (0.07 sec)

速度差异随着表的大小而增加(在10万行以下几乎看不到).

The difference in speed increases with the size of the table (it is barely noticeable under 100k rows).

更新

使用session.query(Segment.id).count()而不是session.query(Segment).count()似乎可以解决问题,并使其达到最高速度.我仍然感到困惑,为什么初始查询会变慢.

Using session.query(Segment.id).count() instead of session.query(Segment).count() seems to do the trick and get it up to speed. I'm still puzzled why the initial query is slower though.

推荐答案

不幸的是,MySQL对子查询的支持非常糟糕,这给我们带来了非常不利的影响. SQLAlchemy文档指出可以使用query(func.count(Segment.id))来实现优化"查询:

Unfortunately MySQL has terrible, terrible support of subqueries and this is affecting us in a very negative way. The SQLAlchemy docs point out that the "optimized" query can be achieved using query(func.count(Segment.id)):

返回此查询将返回的行数.

Return a count of rows this Query would return.

这将为此查询生成SQL,如下所示:

This generates the SQL for this Query as follows:

SELECT count(1) AS count_1 FROM (
     SELECT <rest of query follows...> ) AS anon_1

要对要计数的特定列进行精细控制,请跳过 使用子查询或以其他方式控制FROM子句,或使用 其他聚合函数,请将func表达式与 query(),即:

For fine grained control over specific columns to count, to skip the usage of a subquery or otherwise control of the FROM clause, or to use other aggregate functions, use func expressions in conjunction with query(), i.e.:

from sqlalchemy import func

# count User records, without
# using a subquery.
session.query(func.count(User.id))

# return count of user "id" grouped
# by "name"
session.query(func.count(User.id)).\
        group_by(User.name)

from sqlalchemy import distinct

# count distinct "name" values
session.query(func.count(distinct(User.name)))

这篇关于为什么SQLAlchemy count()比原始查询慢得多?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆