Django complex annotation
Problem description
Prerequisites:
- Queryset must return Articles
- Queryset must return unique objects
- Must not utilize for loops that hit the database (meaning N queries for N objects to annotate)
My models:
from django.db import models
# Assumption: JSONField lives here on Django 3.1+; older projects import it
# from django.contrib.postgres.fields instead.
from django.db.models import JSONField


# BaseModel is the author's own base class (not shown); it presumably
# supplies the created_date field used in the queries.
class Report(BaseModel):
    ios_report = JSONField()
    android_report = JSONField()


class Article(BaseModel):
    internal_id = models.IntegerField(unique=True)
    title = models.CharField(max_length=500)
    short_title = models.CharField(max_length=500)
    picture_url = models.URLField()
    published_date = models.DateField()
    clip_link = models.URLField()
    reports = models.ManyToManyField(
        "Report", through="ArticleInReport", related_name="articles"
    )


class ArticleInReport(BaseModel):
    article = models.ForeignKey("core.Article", on_delete=models.CASCADE, related_name='articleinreports')
    report = models.ForeignKey("core.Report", on_delete=models.CASCADE, related_name='articleinreports')
    ios_views = models.IntegerField()
    android_views = models.IntegerField()

    @property
    def total_views(self):
        return self.ios_views + self.android_views
Everything starts with a Report object, which is created at set intervals. The report contains data about articles and their respective views. A Report is related to an Article through ArticleInReport, which holds the total number of users in the Article at the time the report was imported.
In my view, I need to display the following information:
- All articles that received views in the last 30 minutes.
- Each article annotated with the following information, and this is where I'm facing a problem: if present, the number of views the Article object had in the last Report; if not present, 0.
My views.py file:
reports_in_time_range = Report.objects.filter(created_date__range=[starting_range, right_now]).order_by('created_date')
last_report = reports_in_time_range.prefetch_related('articles').last()
unique_articles = Article.objects.filter(articleinreports__report__in=reports_in_time_range).distinct('id')
articles = Article.objects.filter(id__in=unique_articles).distinct('id').annotate(
    total_views=Case(
        When(id__in=last_report.articles.values_list('id', flat=True),
             then=F('articleinreports__ios_views') + F('articleinreports__android_views')),
        default=0, output_field=IntegerField(),
    ))
Some explanation of my thought process: first, get only the articles that appear in the relevant reports in the time range (filter(id__in=unique_articles)) and return only distinct articles. Next, if the article's ID appears in the last report's list of articles (through ArticleInReport, of course), calculate iOS views + Android views for that ArticleInReport.
The above annotation works for most Articles, but fails miserably for others for no apparent reason. I've tried many different approaches but always seem to get the wrong results.
Recommended answer
It's very important to avoid hits to the database, but not at this price. In my opinion, you should split your query into two or more queries. Splitting the query improves readability and may also improve performance (sometimes two simple queries run faster than one complex query). Remember that you have all the power of dicts, comprehensions and itertools to massage your partial results.
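from django.db.models import ExpressionWrapper, F, IntegerField, Sum, Value

reports_in_time_range = (
    Report.objects
    .filter(created_date__range=[starting_range, right_now])
    .order_by('created_date'))

last_report = reports_in_time_range.prefetch_related('articles').last()

report_articles_ids = (
    Article.objects
    .filter(articleinreports__report=last_report)
    .values_list('id', flat=True)
    .distinct())

# Filter on the relation itself (not on id) so the Sum only covers the last
# report's rows; filtering by id alone would add up rows from every report.
report_articles = (
    Article.objects
    .filter(articleinreports__report=last_report)
    .annotate(total_views=Sum(
        F('articleinreports__ios_views') +
        F('articleinreports__android_views'),
        output_field=IntegerField())))

# Articles that are not in the last report get a constant 0.
other_articles = (
    Article.objects
    .exclude(id__in=report_articles_ids)
    .annotate(total_views=ExpressionWrapper(
        Value(0),
        output_field=IntegerField())))

# Caution: `|` merges both filters into a single query rather than
# concatenating two result sets, so the separate total_views annotations may
# not both survive. If the totals look wrong, try
# report_articles.union(other_articles) or merge the results in Python.
articles = report_articles | other_articles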
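As a rough illustration of the "massage your partial results in Python" idea from the answer, the sketch below merges two partial results with a dict and a comprehension. The sample data and the helper name annotate_total_views are hypothetical and do not come from the original post; in a real view the inputs would come from two simple queries.

```python
def annotate_total_views(article_ids, last_report_rows):
    """Map each article id to its views in the last report, or 0 if absent.

    article_ids: ids of the articles seen in the time range (hypothetical input).
    last_report_rows: (article_id, ios_views, android_views) tuples taken
        from the last report only (hypothetical input).
    """
    # One pass over the last report's rows builds an id -> total lookup.
    views_in_last_report = {
        article_id: ios + android
        for article_id, ios, android in last_report_rows
    }
    # dict.get supplies the default of 0 for articles missing from the last report.
    return {aid: views_in_last_report.get(aid, 0) for aid in article_ids}


# Hypothetical sample data standing in for the results of two simple queries.
article_ids = [1, 2, 3]
last_report_rows = [(1, 10, 5), (3, 0, 7)]
totals = annotate_total_views(article_ids, last_report_rows)
print(totals)  # {1: 15, 2: 0, 3: 7}
```

Under these assumptions the rows could be fetched with something like ArticleInReport.objects.filter(report=last_report).values_list('article_id', 'ios_views', 'android_views'), which keeps the total at a fixed number of queries regardless of how many articles there are.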