How to annotate Count with a condition in a Django queryset

Problem description

Using the Django ORM, can one do something like queryset.objects.annotate(Count('queryset_objects', gte=VALUE))? Catch my drift?


Here's a quick example to use for illustrating a possible answer:

In a Django website, content creators submit articles, and regular users view (i.e. read) the said articles. Articles can either be published (i.e. available for all to read), or in draft mode. The models depicting these requirements are:

class Article(models.Model):
    author = models.ForeignKey(User)
    published = models.BooleanField(default=False)

class Readership(models.Model):
    reader = models.ForeignKey(User)
    which_article = models.ForeignKey(Article)
    what_time = models.DateTimeField(auto_now_add=True)

My question is: How can I get all published articles, sorted by unique readership from the last 30 mins? I.e. I want to count how many distinct (unique) views each published article got in the last half an hour, and then produce a list of articles sorted by these distinct views.


I tried:

date = datetime.now()-timedelta(minutes=30)
articles = Article.objects.filter(published=True).extra(select = {
  "views" : """
  SELECT COUNT(*)
  FROM myapp_readership
    JOIN myapp_article on myapp_readership.which_article_id = myapp_article.id
  WHERE myapp_readership.reader_id = myapp_user.id
  AND myapp_readership.what_time > %s """ % date,
}).order_by("-views")

This raised the error: syntax error at or near "01" (where "01" came from the datetime object interpolated into extra). It's not much to go on.
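A likely cause of that error, sketched with the standard-library sqlite3 module rather than the asker's PostgreSQL setup (an assumption made only to keep the example self-contained): formatting the datetime into the SQL string with % pastes the timestamp in without quotes, so the parser trips over the time part. Binding the value as a parameter avoids this. In Django, extra() accepts a select_params argument for the same purpose.

```python
import sqlite3
from datetime import datetime

date = datetime(2015, 11, 18, 1, 23, 45)

# String formatting pastes the timestamp into the SQL text without quotes.
unsafe_sql = "SELECT 1 WHERE '2015-11-18 02:00:00' > %s" % date

conn = sqlite3.connect(":memory:")
try:
    conn.execute(unsafe_sql)
    interpolation_failed = False
except sqlite3.OperationalError:
    # The unquoted "2015-11-18 01:23:45" is not a valid SQL expression.
    interpolation_failed = True

# A bound parameter lets the driver handle quoting.
row = conn.execute(
    "SELECT 1 WHERE '2015-11-18 02:00:00' > ?",
    (date.isoformat(" "),),
).fetchone()
```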

Solution

For django >= 1.8

Use Conditional Aggregation:

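from django.db.models import Count, Case, When, IntegerField

# `threshold` is the datetime cutoff that `what_time` is compared against
Article.objects.annotate(
    numviews=Count(Case(
        When(readership__what_time__lt=threshold, then=1),
        output_field=IntegerField(),  # matches the integer returned by `then=1`
    ))
)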

Explanation: a normal query over your articles is annotated with a numviews field. That field is built as a CASE/WHEN expression wrapped in Count: it returns 1 for readership rows that match the criteria and NULL for rows that do not. Count ignores NULLs and counts only the non-null values.

You will get zeros on articles that haven't been viewed recently and you can use that numviews field for sorting and filtering.

The query behind this for PostgreSQL will be:

SELECT
    "app_article"."id",
    "app_article"."author_id",
    "app_article"."published",
    COUNT(
        CASE WHEN "app_readership"."what_time" < '2015-11-18 11:04:00.000000+01:00' THEN 1
        ELSE NULL END
    ) as "numviews"
FROM "app_article" LEFT OUTER JOIN "app_readership"
    ON ("app_article"."id" = "app_readership"."which_article_id")
GROUP BY "app_article"."id", "app_article"."author_id", "app_article"."published"
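To see why articles without matching readership rows come out as zero, here is a minimal sketch that runs the same COUNT(CASE WHEN ...) pattern against an in-memory SQLite database. The table layout mirrors the models from the question, but the sample rows, the text timestamps, and the use of SQLite instead of PostgreSQL are assumptions made purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_article (
        id INTEGER PRIMARY KEY, author_id INTEGER, published INTEGER
    );
    CREATE TABLE app_readership (
        id INTEGER PRIMARY KEY, reader_id INTEGER,
        which_article_id INTEGER, what_time TEXT
    );
    -- Hypothetical sample data: article 2 has no readership rows at all.
    INSERT INTO app_article VALUES (1, 10, 1), (2, 11, 1);
    INSERT INTO app_readership (reader_id, which_article_id, what_time) VALUES
        (100, 1, '2015-11-18 10:00:00'),
        (100, 1, '2015-11-18 10:30:00'),
        (101, 1, '2015-11-18 12:00:00');
""")

threshold = "2015-11-18 11:04:00"

# CASE yields 1 for matching rows and NULL otherwise; COUNT skips the NULLs.
rows = conn.execute("""
    SELECT a.id,
           COUNT(CASE WHEN r.what_time < ? THEN 1 ELSE NULL END) AS numviews
    FROM app_article a
    LEFT OUTER JOIN app_readership r ON a.id = r.which_article_id
    GROUP BY a.id
    ORDER BY a.id
""", (threshold,)).fetchall()
```

Article 1 has two rows before the cutoff and one after it, so only two are counted. Article 2 survives the LEFT OUTER JOIN with a NULL what_time, which the CASE turns into NULL, giving a count of zero instead of dropping the article.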

If we want to count only unique readers, we can pass distinct=True to Count and have the When clause return the value we want to be distinct on.

from django.db.models import Count, Case, When, IntegerField, F

# `threshold` is the datetime cutoff that `what_time` is compared against
Article.objects.annotate(
    numviews=Count(Case(
        # `readership__reader_id` works here as well
        When(readership__what_time__lt=threshold, then=F('readership__reader')),
        output_field=IntegerField(),  # the reader foreign key is an integer id
    ), distinct=True)
)

That will produce:

SELECT
    "app_article"."id",
    "app_article"."author_id",
    "app_article"."published",
    COUNT(
        DISTINCT CASE WHEN "app_readership"."what_time" < '2015-11-18 11:04:00.000000+01:00' THEN "app_readership"."reader_id"
        ELSE NULL END
    ) as "numviews"
FROM "app_article" LEFT OUTER JOIN "app_readership"
    ON ("app_article"."id" = "app_readership"."which_article_id")
GROUP BY "app_article"."id", "app_article"."author_id", "app_article"."published"
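The distinct variant can be tried the same way. The sketch below applies it to the original question: published articles only, readers counted once each, views newer than the cutoff, sorted by that count. It again uses an in-memory SQLite database with made-up rows, and it compares with > rather than < because "the last 30 minutes" means timestamps after the cutoff. All of this is illustrative, not the asker's actual data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_article (
        id INTEGER PRIMARY KEY, author_id INTEGER, published INTEGER
    );
    CREATE TABLE app_readership (
        id INTEGER PRIMARY KEY, reader_id INTEGER,
        which_article_id INTEGER, what_time TEXT
    );
    -- Hypothetical sample data: article 3 is a draft, article 4 has no readers.
    INSERT INTO app_article VALUES (1, 10, 1), (2, 11, 1), (3, 12, 0), (4, 13, 1);
    INSERT INTO app_readership (reader_id, which_article_id, what_time) VALUES
        (100, 1, '2015-11-18 10:50:00'),
        (100, 1, '2015-11-18 10:55:00'),
        (101, 1, '2015-11-18 10:58:00'),
        (102, 1, '2015-11-18 09:00:00'),
        (100, 2, '2015-11-18 10:59:00'),
        (100, 3, '2015-11-18 10:59:00'),
        (101, 3, '2015-11-18 10:59:00');
""")

threshold = "2015-11-18 10:34:00"  # "now" minus 30 minutes in this example

# DISTINCT collapses repeat views by the same reader; NULLs are still ignored.
rows = conn.execute("""
    SELECT a.id,
           COUNT(DISTINCT CASE WHEN r.what_time > ? THEN r.reader_id
                               ELSE NULL END) AS numviews
    FROM app_article a
    LEFT OUTER JOIN app_readership r ON a.id = r.which_article_id
    WHERE a.published = 1
    GROUP BY a.id
    ORDER BY numviews DESC, a.id
""", (threshold,)).fetchall()
```

Reader 100 viewed article 1 twice inside the window but is counted once, and reader 102's view is too old to count. On the ORM side this would presumably correspond to calling filter(published=True) before annotate, using readership__what_time__gt for the cutoff, and finishing with order_by('-numviews').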

For django < 1.8 and PostgreSQL

You can just use raw to execute the SQL statement generated by newer versions of Django. Apparently there is no simple and optimized way to query that data without raw (even with extra, there are problems injecting the required JOIN clause).

# The cutoff is passed as a query parameter instead of being pasted into the SQL text.
Article.objects.raw('''
    SELECT
        "app_article"."id",
        "app_article"."author_id",
        "app_article"."published",
        COUNT(
            DISTINCT CASE WHEN "app_readership"."what_time" < %s THEN "app_readership"."reader_id"
            ELSE NULL END
        ) AS "numviews"
    FROM "app_article" LEFT OUTER JOIN "app_readership"
        ON ("app_article"."id" = "app_readership"."which_article_id")
    GROUP BY "app_article"."id", "app_article"."author_id", "app_article"."published"
''', [threshold])
