Django & Postgres - percentile (median) and group by
Problem description
I need to calculate period medians per seller ID (see the simplified model below). The problem is that I am unable to construct the ORM query.
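Model
from django.contrib.postgres.fields import ArrayField, JSONField
from django.db import models

class MyModel(models.Model):
    period = models.IntegerField(null=True, default=None)
    seller_ids = ArrayField(models.IntegerField(), default=list)
    aux = JSONField(default=dict)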
Query
queryset = (
    MyModel.objects.filter(period=25)
    .annotate(seller_id=Func(F("seller_ids"), function="unnest"))
    .values("seller_id")
    .annotate(
        duration=Cast(KeyTextTransform("duration", "aux"), IntegerField()),
        median=Func(
            F("duration"),
            function="percentile_cont",
            template="%(function)s(0.5) WITHIN GROUP (ORDER BY %(expressions)s)",
        ),
    )
    .values("median", "seller_id")
)
I think what I need is something along the lines of the SQL below:
select t.*, p_25, p_75
from t join
     (select district,
             percentile_cont(0.25) within group (order by sales) as p_25,
             percentile_cont(0.75) within group (order by sales) as p_75
      from t
      group by district
     ) td
     on t.district = td.district
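For readers unfamiliar with percentile_cont, the following is a rough pure-Python sketch of what the aggregate computes: a continuous percentile that interpolates linearly between neighbouring sorted values. The function name and the sample `sales` list are illustrative assumptions, not part of the original question, and the sketch is not a substitute for the database function.

```python
import math


def percentile_cont(values, fraction):
    """Continuous percentile using linear interpolation (illustrative sketch)."""
    # Like SQL aggregates, ignore missing (NULL / None) inputs.
    ordered = sorted(v for v in values if v is not None)
    if not ordered:
        return None  # no input rows: no result
    # Zero-based fractional position of the requested percentile.
    position = fraction * (len(ordered) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    weight = position - lower
    # Interpolate between the two neighbouring values.
    return ordered[lower] + (ordered[upper] - ordered[lower]) * weight


# Hypothetical sample data for a single district.
sales = [10, 20, 30, 40]
p_25 = percentile_cont(sales, 0.25)
p_50 = percentile_cont(sales, 0.5)
p_75 = percentile_cont(sales, 0.75)
```

With an even number of values, the 0.5 percentile falls halfway between the two middle values, which is the usual definition of the median.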
Python 3.7.5, Django 2.2.8, Postgres 11.1
Recommended answer
This does the trick.
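# Import path for KeyTextTransform as of Django 2.2.
from django.contrib.postgres.fields.jsonb import KeyTextTransform
from django.db.models import F, Func, IntegerField
from django.db.models.aggregates import Aggregate
from django.db.models.functions import Cast

queryset = (
    MyModel.objects.filter(period=25)
    # Extract aux["duration"] as text and cast it to an integer.
    .annotate(duration=Cast(KeyTextTransform("duration", "aux"), IntegerField()))
    .filter(duration__isnull=False)
    # Expand the seller_ids array into one row per seller ID.
    .annotate(seller_id=Func(F("seller_ids"), function="unnest"))
    .values("seller_id")  # group by
    .annotate(
        median=Aggregate(
            F("duration"),
            function="percentile_cont",
            template="%(function)s(0.5) WITHIN GROUP (ORDER BY %(expressions)s)",
        ),
    )
)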
Notice that the median annotation employs Aggregate and not Func as in the question. Because the expression is an aggregate, the preceding values("seller_id") call acts as the GROUP BY clause.
Also, the order of the annotate() and filter() clauses, as well as the order of the annotate() and values() clauses, matters a lot!
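To make the order of operations concrete, here is a small pure-Python sketch of what the queryset is meant to compute: filter by period, drop rows without a duration, expand each row into one entry per seller ID, then take the median per seller. The sample `rows` and the helper name `medians_per_seller` are hypothetical. `statistics.median` is used as a stand-in for percentile_cont(0.5), since both average the two middle values for an even count.

```python
from collections import defaultdict
from statistics import median

# Hypothetical rows mirroring the MyModel fields.
rows = [
    {"period": 25, "seller_ids": [1, 2], "aux": {"duration": "10"}},
    {"period": 25, "seller_ids": [1], "aux": {"duration": "20"}},
    {"period": 25, "seller_ids": [2], "aux": {}},  # no duration: dropped
    {"period": 24, "seller_ids": [1], "aux": {"duration": "99"}},  # other period
]


def medians_per_seller(rows, period):
    """Return {seller_id: median duration} for the given period."""
    durations = defaultdict(list)
    for row in rows:
        if row["period"] != period:  # filter(period=...)
            continue
        raw = row["aux"].get("duration")  # KeyTextTransform("duration", "aux")
        if raw is None:  # filter(duration__isnull=False)
            continue
        duration = int(raw)  # Cast(..., IntegerField())
        for seller_id in row["seller_ids"]:  # unnest(seller_ids)
            durations[seller_id].append(duration)
    # values("seller_id") + aggregate: one median per group
    return {seller_id: median(values) for seller_id, values in durations.items()}


result = medians_per_seller(rows, 25)
```

Here seller 1 ends up with the durations 10 and 20, and seller 2 with only 10, so each seller's median is computed over its own group after the array has been expanded.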
By the way, the resulting SQL contains neither a nested SELECT nor a JOIN.