Django条件子查询聚合 [英] Django conditional Subquery aggregate

查看:82
本文介绍了Django条件子查询聚合的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我的模型结构的简化示例是

An simplified example of my model structure would be

class Corporation(models.Model):
    ...

class Division(models.Model):
    corporation = models.ForeignKey(Corporation)

class Department(models.Model):
    division = models.ForeignKey(Division)
    type = models.IntegerField()

现在,我想显示一个表,该表显示公司,其中一列将包含某种类型的部门数,例如 type = 10 .目前,这是通过 Corporation 模型上的帮助程序实现的,该帮助程序可以检索这些信息,例如

Now I want to display a table that display corporations where a column will contain the number of departments of a certain type, e.g. type=10. Currently, this is implemented with a helper on the Corporation model that retrieves those, e.g.

class Corporation(models.Model):
    ...
    def get_departments_type_10(self):
        return (
            Department.objects
            .filter(division__corporation=self, type=10)
            .count()
        )

这里的问题是,由于N + 1问题,这绝对会破坏性能.

The problem here is that this absolutely murders performance due to the N+1 problem.

我尝试使用 select_related prefetch_related annotate subquery 解决此问题,但是我无法获得我需要的结果.

I have tried to approach this problem with select_related, prefetch_related, annotate, and subquery, but I havn't been able to get the results I need.

理想情况下,查询集中的每个 Corporation 都应使用整数 type_10_count 进行注释,该整数应反映该类型部门的数量.

Ideally, each Corporation in the queryset should be annotated with an integer type_10_count which reflects the number of departments of that type.

我确定我可以在 .extra()中使用原始sql进行操作,但是文档宣布它将不推荐使用(我在Django 1.11上)

I'm sure I could do something with raw sql in .extra(), but the docs announce that it is going to be deprecated (I'm on Django 1.11)

原始sql解决方案示例

corps = Corporation.objects.raw("""
SELECT
*,
(
    SELECT COUNT(*)
    FROM foo_division div ON div.corporation_id = c.id
    JOIN foo_department dept ON dept.division_id = div.id
    WHERE dept.type = 10
) as type_10_count
FROM foo_corporation c
""")

推荐答案

我认为,通过 Subquery ,我们可以使用此代码获得与您提供的SQL类似的SQL

I think with Subquery we can get SQL similar to one you have provided, with this code

# Get amount of departments with GROUP BY division__corporation [1]
# .order_by() will remove any ordering so we won't get additional GROUP BY columns [2]
departments = Department.objects.filter(type=10).values(
    'division__corporation'
).annotate(count=Count('id')).order_by()

# Attach departments as Subquery to Corporation by Corporation.id.
# Departments are already grouped by division__corporation
# so .values('count') will always return single row with single column - count [3]
departments_subquery = departments.filter(division__corporation=OuterRef('id'))
corporations = Corporation.objects.annotate(
    departments_of_type_10=Subquery(
        departments_subquery.values('count'), output_field=IntegerField()
    )
)

生成的SQL是

SELECT "corporation"."id", ... (other fields) ...,
  (
    SELECT COUNT("division"."id") AS "count"
    FROM "department"
    INNER JOIN "division" ON ("department"."division_id" = "division"."id") 
    WHERE (
      "department"."type" = 10 AND
      "division"."corporation_id" = ("corporation"."id")
    ) GROUP BY "division"."corporation_id"
  ) AS "departments_of_type_10"
FROM "corporation"

这里有些担心的是,大表子查询的速度可能很慢.但是,数据库查询优化器足够聪明,可以将子查询提升为OUTER JOIN,至少我听说PostgreSQL做到了.

Some concerns here is that subquery can be slow with large tables. However, database query optimizers can be smart enough to promote subquery to OUTER JOIN, at least I've heard PostgreSQL does this.

1.使用.values和.annotate

2.order_by()问题

3.子查询

这篇关于Django条件子查询聚合的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆