Why does MYSQL DB return a corrupted value when averaging over a Django models.DateTimeField?
Problem description
I'm running a Django application on top of a MySQL (actually MariaDB) database.
My Django Model looks like this:
    import logging

    from django.db import models
    from django.db.models import Avg, Count, Max, Min

    logger = logging.getLogger(__name__)


    class myModel(models.Model):
        my_string = models.CharField(max_length=32)
        my_date = models.DateTimeField()

        @staticmethod
        def get_stats():
            # Group rows by my_string and aggregate my_date within each group.
            logger.info(myModel.objects.values('my_string').annotate(
                count=Count('my_string'),
                min=Min('my_date'),
                max=Max('my_date'),
                avg=Avg('my_date'),
            ))
When I run get_stats(), I get the following log line:
[2015-06-21 09:45:40] INFO [all_logs:96] [{'my_string': u'A', 'count': 2, 'avg': 20080507582679.5, 'min': datetime.datetime(2007, 8, 2, 11, 33, 53, tzinfo=<UTC>), 'max': datetime.datetime(2009, 2, 13, 5, 20, 6, tzinfo=<UTC>)}]
The problem I have with this is that the average of the my_date field returned by the database is 20080507582679.5. Look carefully at that number. It is an invalid date format.
Why doesn't the database return a valid value for the average of these two dates? How do I get the actual average of this field if the way described fails? Is Django DateTimeField not set up to handle averaging?
Solution
Q1: Why doesn't the database return a valid value for the average of these two dates?
A: The value returned is expected; it's well-defined MySQL behavior.
MySQL automatically converts a date or time value to a number if the value is used in a numeric context and vice versa.
MySQL Reference Manual: https://dev.mysql.com/doc/refman/5.5/en/date-and-time-types.html
In MySQL, the AVG aggregate function operates on numeric values.
In MySQL, a DATE or DATETIME expression can be evaluated in a numeric context.
As a simple demonstration, performing a numeric addition operation on a DATETIME implicitly converts the datetime value into a number. This query:
SELECT NOW(), NOW()+0
returns a result like:
NOW()                NOW()+0
-------------------  -----------------------
2015-06-23 17:57:48  20150623175748.000000
Note that the value returned for the expression NOW()+0 is not a DATETIME; it's a number.
When you specify a SUM() or AVG() function on a DATETIME expression, that's equivalent to converting the DATETIME into a number, and then summing or averaging the number.
That is, the return from this expression
AVG(mydatetimecol)
is equivalent to the return from this expression:
AVG(mydatetimecol+0)
What is being "averaged" is a numeric value. As you have observed, the value returned is not a valid datetime; and even in cases where it happens to look like a valid datetime, it's likely not a value you would consider a true "average".
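To make the arithmetic concrete, here is a minimal Python sketch that reproduces the number from the question's log line. It only mimics MySQL's numeric rendering of a DATETIME (YYYYMMDDHHMMSS); the helper name as_mysql_number is made up for this illustration and is not a MySQL or Django function.

```python
from datetime import datetime


def as_mysql_number(dt: datetime) -> int:
    """Hypothetical helper: render a datetime as the YYYYMMDDHHMMSS number
    MySQL produces when a DATETIME is used in numeric context."""
    return int(dt.strftime("%Y%m%d%H%M%S"))


# The min and max values from the question's log line.
dates = [datetime(2007, 8, 2, 11, 33, 53), datetime(2009, 2, 13, 5, 20, 6)]

numbers = [as_mysql_number(d) for d in dates]
naive_avg = sum(numbers) / len(numbers)

print(numbers)    # the two digit-packed values
print(naive_avg)  # the plain numeric mean of those two values
```

Read as a datetime, the digits of that mean would give hour 58 and second 79, which is why the result cannot be parsed back into a date.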
Q2: How do I get the actual average of this field if the way described fails?
A2: One way to do that is to convert the datetime into a numeric value that can be "accurately" averaged, and then convert that back into a datetime.
For example, you could convert the datetime into a numeric value representing a number of seconds from some fixed point in time, e.g.
TIMESTAMPDIFF(SECOND,'2015-01-01',t.my_date)
You could then "average" those values to get an average number of seconds from the fixed point in time. (NOTE: when summing an extremely large number of rows with extremely large values, beware of exceeding the maximum numeric value and running into numeric overflow.)
AVG(TIMESTAMPDIFF(SECOND,'2015-01-01',t.my_date))
To convert that back to a datetime, add that value as a number of seconds back to the fixed point in time:
'2015-01-01' + INTERVAL AVG(TIMESTAMPDIFF(SECOND,'2015-01-01',t.my_date)) SECOND
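The same three steps (offset from a fixed point, numeric average, add the average back) can be sketched in plain Python. This is an illustration of the idea rather than what MySQL executes; the names FIXED_POINT and average_datetime are assumptions made for the example, and the input values are the two dates from the question.

```python
from datetime import datetime, timedelta, timezone

# Arbitrary fixed reference point, mirroring '2015-01-01' in the SQL above.
FIXED_POINT = datetime(2015, 1, 1, tzinfo=timezone.utc)


def average_datetime(values):
    """Average datetimes by averaging their second offsets from FIXED_POINT."""
    # Step 1: datetime -> number of seconds from the fixed point.
    offsets = [(v - FIXED_POINT).total_seconds() for v in values]
    # Step 2: ordinary numeric average.
    mean_offset = sum(offsets) / len(offsets)
    # Step 3: add the average offset back to the fixed point.
    return FIXED_POINT + timedelta(seconds=mean_offset)


dates = [
    datetime(2007, 8, 2, 11, 33, 53, tzinfo=timezone.utc),
    datetime(2009, 2, 13, 5, 20, 6, tzinfo=timezone.utc),
]
true_avg = average_datetime(dates)
print(true_avg.isoformat())
```

The choice of fixed point does not change the result; it only shifts the intermediate numbers, which is why the SQL version can use any convenient date.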
(Note that the DATETIME values are evaluated in the time zone of the MySQL session, so there are edge cases where the setting of the time_zone variable in the MySQL session will have some influence on the value returned.)
MySQL also provides a UNIX_TIMESTAMP() function, which returns a unix-style integer value: the number of seconds since the epoch (midnight Jan. 1, 1970 UTC). You can use that to accomplish the same operation more concisely:
FROM_UNIXTIME(AVG(UNIX_TIMESTAMP(t.my_date)))
Note that this final expression is really doing the same thing: converting the datetime value into a number of seconds since '1970-01-01 00:00:00' UTC, taking a numeric average of that, adding that average number of seconds back to '1970-01-01' UTC, and finally converting that back to a DATETIME value, represented in the current session time_zone.
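The session time zone matters because a DATETIME carries no zone of its own, so the same wall-clock value maps to different epoch seconds depending on how it is interpreted. The Python sketch below illustrates that effect only; the UTC+2 offset is an arbitrary stand-in for a session time_zone setting, not something taken from the question.

```python
from datetime import datetime, timedelta, timezone

# A wall-clock value with no time zone attached, like a stored DATETIME.
wall_clock = datetime(2009, 2, 13, 5, 20, 6)

utc = timezone.utc
utc_plus_2 = timezone(timedelta(hours=2))  # hypothetical session offset

# Interpret the same wall-clock value under two different zones.
as_utc = wall_clock.replace(tzinfo=utc).timestamp()
as_plus_2 = wall_clock.replace(tzinfo=utc_plus_2).timestamp()

# The two interpretations differ by exactly the zone offset, in seconds.
print(as_utc - as_plus_2)
```

As long as the conversion to seconds and the conversion back happen under the same fixed offset, the shift cancels out; the edge cases arise around things like daylight saving transitions, where the offset is not constant.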
Q3: Is Django DateTimeField not set up to handle averaging?
A: Apparently, the authors of Django are satisfied with the value returned from the database for a SQL expression AVG(datetime).