Django aggregate multiple columns after arithmetic operation

Problem description


I have a really strange problem with Django 1.4.4.

I have this model:

class LogQuarter(models.Model):
  timestamp = models.DateTimeField()
  domain = models.CharField(max_length=253)
  attempts = models.IntegerField()
  success = models.IntegerField()
  queue = models.IntegerField()
  ...

I need to get the 20 domains with the highest sent value, where sent is attempts - queue.

This is my query:

obj = LogQuarter.objects\
      .aggregate(Sum(F('attempts')-F('queue')))\
      .values('domain')\
      .filter(**kwargs)\
      .order_by('-sent')[:20]

I also tried extra(), and that doesn't work either.

This is really basic SQL, so I am surprised that Django can't do it.

Does anyone have a solution?

Solution

You can do this by subclassing some of the aggregation functionality. Understanding it fully requires digging into the Django source, but here is what I wrote to do something similar for MAX and MIN. (Note: this code is based on Django 1.4 / MySQL.)

Start by subclassing the underlying SQL aggregate class and overriding the as_sql method. This method produces the SQL fragment for the aggregate that ends up in the database query. We have to make sure that the extra field we pass in is quoted correctly and qualified with the proper table name.

from django.db.models.sql import aggregates
class SqlCalculatedSum(aggregates.Aggregate):
  sql_function = 'SUM'
  sql_template = '%(function)s(%(field)s - %(other_field)s)'

  def as_sql(self, qn, connection):
    # self.col is currently a tuple, where the first item is the table name and
    # the second item is the primary column name. Assuming our calculation is
    # on two fields in the same table, we can use that to our advantage. qn is
    # underlying DB quoting object and quotes things appropriately. The column
    # entry in the self.extra var is the actual database column name for the
    # secondary column.
    self.extra['other_field'] = '.'.join(
        [qn(c) for c in (self.col[0], self.extra['column'])])
    return super(SqlCalculatedSum, self).as_sql(qn, connection)
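
To make the effect of sql_template more concrete, here is a minimal standalone sketch of the string interpolation, with no Django involved. The quote_name helper is a hypothetical stand-in for qn (it assumes MySQL-style backtick quoting), and render_calculated_sum and the table and column names are illustrative only; this is not Django's actual code path.

```python
def quote_name(name):
    """Hypothetical stand-in for Django's qn(): MySQL-style backtick quoting."""
    return '`%s`' % name.replace('`', '``')


# Same template as SqlCalculatedSum.sql_template above.
SQL_TEMPLATE = '%(function)s(%(field)s - %(other_field)s)'


def render_calculated_sum(table, column, other_column):
    # Qualify both columns with the same table name, mirroring what
    # as_sql does for other_field in the class above.
    params = {
        'function': 'SUM',
        'field': '.'.join(quote_name(c) for c in (table, column)),
        'other_field': '.'.join(quote_name(c) for c in (table, other_column)),
    }
    return SQL_TEMPLATE % params


rendered = render_calculated_sum('LogQuarter', 'attempts', 'queue')
print(rendered)
```

The point of the sketch is only that other_field has to be present in the parameters before the template is interpolated, which is why as_sql fills self.extra before calling super().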

Next, subclass the general model aggregation class and override the add_to_query method. This method is what determines how the aggregate gets added to the underlying query object. We want to be able to pass in the field name (e.g. queue) but get the corresponding DB column name (in case it is something different).

from django.db import models
class CalculatedSum(models.Aggregate):
  name = SqlCalculatedSum

  def add_to_query(self, query, alias, col, source, is_summary):
    # Utilize the fact that self.extra is set to all of the extra kwargs passed
    # in on initialization. We want to get the corresponding database column
    # name for whatever field we pass in to the "variable" kwarg.
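    # Use .column rather than .db_column: db_column is None unless it was set
    # explicitly on the field, while column always holds the resolved name.
    self.extra['column'] = query.model._meta.get_field(
        self.extra['variable']).column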
    query.aggregates[alias] = self.name(
        col, source=source, is_summary=is_summary, **self.extra)

You can then use your new class in an annotation like this:

queryset.annotate(calc_attempts=CalculatedSum('attempts', variable='queue'))

Assuming your attempts and queue fields use those same names for their database columns, this should generate SQL similar to the following:

SELECT SUM(`LogQuarter`.`attempts` - `LogQuarter`.`queue`) AS calc_attempts
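
For reference, the full query the question is after (sum per domain, highest first, limited to 20) can be tried directly in SQL. The following is a rough sketch using Python's built-in sqlite3 module rather than MySQL; the logquarter table name, the top_domains helper, and the sample rows are made up for illustration.

```python
import sqlite3

# In-memory database with a simplified version of the LogQuarter table.
conn = sqlite3.connect(':memory:')
conn.execute(
    'CREATE TABLE logquarter ('
    'domain TEXT, attempts INTEGER, queue INTEGER)'
)
rows = [
    ('a.example', 10, 2),
    ('a.example', 5, 1),
    ('b.example', 20, 15),
    ('c.example', 7, 0),
]
conn.executemany('INSERT INTO logquarter VALUES (?, ?, ?)', rows)


def top_domains(connection, limit=20):
    # Sum (attempts - queue) per domain and keep the highest totals.
    query = (
        'SELECT domain, SUM(attempts - queue) AS sent '
        'FROM logquarter GROUP BY domain '
        'ORDER BY sent DESC LIMIT ?'
    )
    return connection.execute(query, (limit,)).fetchall()


top = top_domains(conn)
print(top)
```

Comparing this against the SQL that the annotated queryset actually emits is a quick way to check that the custom aggregate does what you expect.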

And there you go.
