过滤Django ORM中的聚合 [英] Filtering the Aggregate in the Django ORM

查看:200
本文介绍了过滤Django ORM中的聚合的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个这样的功能:

  def post_count(self):
return self.thread_set .aggregate(num_posts = Count('post'))['num_posts']

我只想将他们的状态标记为活动的帖子。在Count函数之前是否有一个简单的方法来添加过滤器?



模型说明:

 $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $)$))))))))))))))))))))))))))))))))))))))))))))) 
ordering = models.IntegerField(max_length = 3,default = 0)

@property
def thread_count(self):
return self.thread_set.all() .count()

@property
def post_count(self):
return self.thread_set.aggregate(num_posts = Count('post'))['num_posts']

class Thread(models.Model):
user = models.ForeignKey(User)
category = models.ForeignKey(Category)
title = models.CharField(max_length = 100)
slug = models.SlugField(max_length = 100)
content = models.TextField()
created = models.DateTimeField(auto_now_add = True)
latest_activity = models。 DateTimeField(auto_now_add = True)

class Post(models.Model):
thread = models.ForeignKey(Thread)
parent = models.ForeignKey('Post',null = True,blank = True)
display_name = models.CharField(max_length = 100)
email = models.EmailField(db_index = True)
ip_address = models.IPAddressField(null = True,blank = True)
content = models .TextField()
status = models.CharField(choices = STATUS_CHOICES,max_length = 25,db_index = True,default ='approved')
created = models.DateTimeField()


解决方案

确定,现在问题包括模型定义,我向你提供工作,除非您的Django版本不支持我在这里使用的某些功能(在这种情况下请让我知道!):

  Post.objects.filter(thread__in = thread_set,status ='active')。aggregate(num_posts = Count('id'))

Django允许过滤器中的 __一个QuerySet来决定 IN 子句在SQL中的外观,所以如果你通过 thread__in = thread_set ,Django将过滤帖子,以便只有那些线程字段指向 id thread_set 保留为聚合呼叫。



使用仅一个数据库查询过滤帖子,其中包含 WHERE thread_id IN ... ,而不是每个线程一个查询可怕如果还有其他事情发生,这将是Django中的错误...



结果应该是最多两个查询来建立一个类别的postcount - 一个获取 thread_set ,另一个实际上是计数帖子。另一种选择是根据线程类别字段和发布状态字段,我不一定会希望是那么快。 <罢工>(我说'最多',因为我猜他们可以自动融合...虽然我不认为这会发生在当前的Django,不能检查ATM,对不起。)



编辑: Django的QuerySet API参考 __在过滤器中说:







IN


在给定的列表中。



示例:

  Entry.objects.filter(id__in = 1,3,4])

SQL等价物:

  SELECT ... WHERE id IN(1,3,4); 

您还可以使用查询器动态评估值列表,而不是提供文字值列表:

  inner_qs = Blog.objects.filter(name__contains ='Cheddar')
entries = Entry.objects.filter (blog__in = inner_qs)

此查询将被评估为子选择语句:

  SELECT ... WHERE blog.id IN(SELECT id FROM ... WHERE NAME LIKE'%Cheddar%')

上述代码片段也可以写成如下:

  inner_q = Blog.objects.filter(name__contains ='Cheddar')。values('pk')。query 
entries = Entry.objects.filter(blog__in = inner_q)

在Django 1.1中更改:在Django 1.0中,只有后一段代码才有效。 / p>

这个第二种形式的写入方式有点不太可读,不自然,因为它访问内部查询att贡献并需要一个ValuesQuerySet。如果您的代码不需要与Django 1.0兼容,请使用第一种格式,直接传递查询集。






所以,我猜Django 能够将单个查询传递给数据库,在这种情况下。如果db的查询分析器做得很好,效果可能非常接近最佳。 : - )


I have a function that looks like this:

def post_count(self):
        return self.thread_set.aggregate(num_posts=Count('post'))['num_posts']

I only want to count posts that have their status marked as 'active'. Is there an easy way to add a filter before the Count function?

Model Definitions:

class Category(models.Model):
    name = models.CharField(max_length=100)
    slug = models.SlugField(max_length=100, blank=True, primary_key=True)
    ordering = models.IntegerField(max_length=3, default=0)

    @property
    def thread_count(self):
        return self.thread_set.all().count()

    @property
    def post_count(self):
        return self.thread_set.aggregate(num_posts=Count('post'))['num_posts']

class Thread(models.Model):
    user = models.ForeignKey(User)
    category = models.ForeignKey(Category)
    title = models.CharField(max_length=100)
    slug = models.SlugField(max_length=100)
    content = models.TextField()
    created = models.DateTimeField(auto_now_add=True)
    latest_activity = models.DateTimeField(auto_now_add=True)

class Post(models.Model):
    thread = models.ForeignKey(Thread)
    parent = models.ForeignKey('Post', null=True, blank=True)
    display_name = models.CharField(max_length=100)
    email = models.EmailField(db_index=True)
    ip_address = models.IPAddressField(null=True, blank=True)
    content = models.TextField()
    status = models.CharField(choices=STATUS_CHOICES, max_length=25, db_index=True, default='approved')
    created = models.DateTimeField()

解决方案

OK, now that the question includes the model definitions, I submit to you that this should work, unless your version of Django doesn't support some feature I use here (in which case, please let me know!):

Post.objects.filter(thread__in=thread_set, status='active').aggregate(num_posts=Count('id'))

Django allows __in filters to take a QuerySet to decide what the IN clause should look like in SQL, so if you pass thread__in=thread_set, Django will filter the posts so that only those whose thread field points to one of the ids of the threads in your thread_set remain for the aggregate call to see.

This should filter the posts with just one db query with something like WHERE thread_id IN ... inside, rather than with one query per thread, which would indeed be horrid. If anything else happened, this would be a bug in Django...

The result should be at most two queries to establish a Category's postcount -- one to obtain thread_set and another one actually to count the posts. The alternative is to have a thread/post join to be filtered based on Thread's category field and Post's status field, which I wouldn't necessarily expect to be that much faster. (I say 'at most', because I guess they could be fused automatically... Though I don't think this would happen with current Django. Can't check ATM, sorry.)

EDIT: Django's QuerySet API reference says this on __in filters:


IN

In a given list.

Example:

Entry.objects.filter(id__in=[1, 3, 4])

SQL equivalent:

SELECT ... WHERE id IN (1, 3, 4);

You can also use a queryset to dynamically evaluate the list of values instead of providing a list of literal values:

inner_qs = Blog.objects.filter(name__contains='Cheddar')
entries = Entry.objects.filter(blog__in=inner_qs)

This queryset will be evaluated as subselect statement:

SELECT ... WHERE blog.id IN (SELECT id FROM ... WHERE NAME LIKE '%Cheddar%')

The above code fragment could also be written as follows:

inner_q = Blog.objects.filter(name__contains='Cheddar').values('pk').query
entries = Entry.objects.filter(blog__in=inner_q)

Changed in Django 1.1: In Django 1.0, only the latter piece of code is valid.

This second form is a bit less readable and unnatural to write, since it accesses the internal query attribute and requires a ValuesQuerySet. If your code doesn't require compatibility with Django 1.0, use the first form, passing in a queryset directly.


So, I guess Django is capable of passing a single query to the db in the case at issue here. If the db's query analyser does a good job, the effect might be very nearly optimal. :-)

这篇关于过滤Django ORM中的聚合的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆