Django Window批注结合使用Unique子句 [英] Django Window annotation using combined with distinct clause

查看:61
本文介绍了Django Window批注结合使用Unique子句的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我在PostgreSQL数据库中存储了一个Django模型,该模型由不定期的计数值组成:

I have a Django model stored in a Postgres DB comprised of values of counts at irregular intervals:

WidgetCount
 - Time
 - Count

我正在尝试将窗口函数与Lag配合使用,以便为我提供前一行的值作为注释.我的问题是,当我尝试将其与某些截然不同的日期截断结合使用时,window函数将使用源行,而不是经过分组的行.

I'm trying to use a window function with Lag to give me a previous row's values as an annotation. My problem is when I try to combine it with some distinct date truncation the window function uses the source rows rather than the distinctly grouped ones.

例如,如果我有以下行:

For example if I have the following rows:

time                count
2020-01-20 05:00    15
2020-01-20 06:00    20
2020-01-20 09:00    30
2020-01-21 06:00    35
2020-01-21 07:00    40
2020-01-22 04:00    50
2020-01-22 06:00    54
2020-01-22 09:00    58

我想返回一个查询集,显示每天的第一次阅读,我可以使用:

And I want to return a queryset showing the first reading per day, I can use:

from django.db.models.functions import Trunc

WidgetCount.objects.distinct("date").annotate(date=Trunc("time", "day"))

哪个给我:

date        count
01/01/20    15
01/01/21    35
01/01/22    50

我想添加一个注释,该注释为我提供昨天的值(以便我可以显示每天的更改).

I would like to add an annotation which gives me yesterday's value (so I can show the change per day).

date        count   yesterday_count
01/01/20    15
01/01/21    35      15
01/01/22    50      35

如果我这样做:

from django.db.models.functions import Trunc, Lag
from django.db.models import Window

WidgetCount.objects.distinct("date").annotate(date=Trunc("time", "day"), yesterday_count=Window(expression=Lag("count")))

第二行返回给我30的昨天计数-也就是说,在应用distingle子句之前,它向我显示了上一行.

The second row return gives me 30 for yesterday_count - ie, its showing me the previous row before applying the distinct clause.

如果我添加这样的partiion子句:

If I add a partiion clause like this:

WidgetCount.objects.distinct("date").annotate(date=Trunc("time", "day"), yesterday_count=Window(expression=Lag("count"), partition_by=F("date")))

然后所有行的昨天计数均为无.

Then yesterday_count is None for all rows.

如果需要,我可以使用Python进行此计算,但是这让我有点发疯,我想知道我想做的事情是否可能.

I can do this calculation in Python if I need to but it's driving me a bit mad and I'd like to find out if what I'm trying to do is possible.

谢谢!

推荐答案

我认为主要问题是,您正在将注释中使用的操作混合生成分组的查询集(例如sum)和简单地创建新字段的操作给定查询集中的每个记录,例如 yesterday_count = Window(expression = Lag("count")).

I think the main problem is that you're mixing operations that used in annotation generates a grouped query set such as sum with a operation that simples create a new field for each record in the given query set such as yesterday_count=Window(expression=Lag("count")).

因此订购在这里确实很重要.因此,当您尝试:

So Ordering really matters here. So when you try:

WidgetCount.objects.distinct("date").annotate(date=Trunc("time", "day"), yesterday_count=Window(expression=Lag("count")))

结果查询集只是带注释的WidgetCount.objects.distinct("date"),没有分组.

The result queryset is simply the WidgetCount.objects.distinct("date") annotated, no grouping is perfomed.

我建议解耦您的操作,以便更轻松地了解正在发生的事情,并注意到您正在遍历python对象,因此无需进行任何新查询!

I would suggest decoupling your operations so it becomes easier to understand what is happening, and notice you're iterating over the python object so don't need to make any new queries!

请注意使用SUM操作作为示例,因为FirstValue运算符出现意外错误.因此,我将与Sum一起发布以演示保持不变的想法.仅通过将 acc_count = Sum("count")更改为 first_count = FirstValue("count") ,对于第一个值的想法应该是相同的>

Note in using SUM operation as example because I am getting an unexpected error with the FirstValue operator. So I'm posting with Sum to demonstrate the idea which remains the same. The idea should be the same for first value just by changing acc_count=Sum("count") to first_count=FirstValue("count")

for truncDate_groups in Row.objects.annotate(trunc_date=Trunc('time','day')).values("trunc_date")\
                      .annotate(acc_count=Sum("count")).values("acc_count","trunc_date")\
                      .order_by('trunc_date')\
                      .annotate(y_count=Window(Lag("acc_count")))\
                      .values("trunc_date","acc_count","y_count"):
    print(truncDate_groups)

输出:

{'trunc_date': datetime.datetime(2020, 1, 20, 0, 0, tzinfo=<UTC>), 'acc_count': 65, 'y_count': None}
{'trunc_date': datetime.datetime(2020, 1, 21, 0, 0, tzinfo=<UTC>), 'acc_count': 75, 'y_count': 162}
{'trunc_date': datetime.datetime(2020, 1, 22, 0, 0, tzinfo=<UTC>), 'acc_count': 162, 'y_count': 65}

事实证明,FirstValue运算符要求使用Windows函数,这样您才能嵌套FirtValue并计算Lag,因此在这种情况下,我不确定是否可以执行此操作.问题在于如何在不嵌套窗口的情况下访问First_Value列.

It turns out FirstValue operator requires to use a Windows function so you can't nest FirtValue and then calculate Lag, so in this scenario I'm not exactly sure if you can do it. The question becomes how to access the First_Value column without nesting windows.

这篇关于Django Window批注结合使用Unique子句的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆