Calculating the Median with MySQL
I'm having trouble calculating the median of a list of values (not the average).
I found the article "Simple way to calculate median with MySQL".
It references the following query, which I don't fully understand.
SELECT x.val FROM data x, data y
GROUP BY x.val
HAVING SUM(SIGN(1-SIGN(y.val-x.val))) = (COUNT(*)+1)/2;
If I have a time column and I want to calculate the median value, what do the x and y columns refer to?
val is your time column; x and y are two references to the data table (you can write data AS x, data AS y).
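To make the roles of x and y concrete, here is a minimal Python sketch that emulates what the query does. It is an illustration only, not how MySQL executes it: the function name median_via_self_join and the sample values are made up for this example. Each x value is paired with every y value (the cross join), SIGN(1-SIGN(y-x)) contributes 1 when y <= x and 0 otherwise, and the HAVING clause keeps the x whose count of smaller-or-equal values equals (COUNT(*)+1)/2.

```python
from collections import defaultdict


def sign(n):
    """Mimic SQL SIGN(): -1, 0 or 1."""
    return (n > 0) - (n < 0)


def median_via_self_join(values):
    """Emulate: SELECT x.val FROM data x, data y GROUP BY x.val HAVING ..."""
    # Cross join: every x row is paired with every y row, grouped by x.val.
    groups = defaultdict(list)
    for x in values:
        for y in values:
            groups[x].append(y)

    result = []
    for x_val, ys in groups.items():
        # SIGN(1 - SIGN(y - x)) is 1 when y <= x and 0 when y > x,
        # so the sum counts how many values are <= x_val.
        rank_sum = sum(sign(1 - sign(y - x_val)) for y in ys)
        # HAVING ... = (COUNT(*) + 1) / 2, using true (non-integer) division.
        if rank_sum == (len(ys) + 1) / 2:
            result.append(x_val)
    return result


times = [30, 10, 50, 20, 40]          # hypothetical time values
print(median_via_self_join(times))    # [30]
```

In this emulation, an even number of distinct values yields no row at all, because (COUNT(*)+1)/2 is then fractional and no integer count can equal it. Duplicate values also change the group sizes, so the query is best read as a trick for an odd number of distinct values.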
EDIT: To avoid computing your sums twice, you can store the intermediate results.
CREATE TEMPORARY TABLE average_user_total_time
(SELECT SUM(time) AS time_taken
 FROM scores
 WHERE created_at >= '2010-10-10'
   AND created_at <= '2010-11-11'
 GROUP BY user_id);
Then you can compute the median over these values, which are now in a named table.
EDIT: A temporary table won't work here, because MySQL cannot refer to a TEMPORARY table more than once in the same query, and the median query joins the table to itself. You could try a regular table with the MEMORY storage engine instead, or simply repeat the subquery that computes the per-user values in both places in the median query. Apart from this, I don't see another solution. That doesn't mean there isn't a better way; maybe somebody else will come up with an idea.
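If fetching the per-user totals into application code is acceptable, the two steps (sum per user within the date range, then take the median of those sums) can also be sketched outside the database. This is only an assumption about what might fit your setup, not part of the SQL approach above: the sample rows are invented, and statistics.median averages the two middle values for an even count, which the SQL query does not do.

```python
from collections import defaultdict
from datetime import date
from statistics import median

# Hypothetical rows from the scores table: (user_id, time, created_at)
scores = [
    (1, 30, date(2010, 10, 12)),
    (1, 20, date(2010, 10, 20)),
    (2, 10, date(2010, 10, 15)),
    (3, 70, date(2010, 11, 1)),
    (3, 5, date(2010, 12, 1)),   # outside the date range, ignored
]

start, end = date(2010, 10, 10), date(2010, 11, 11)

# Step 1: SUM(time) ... GROUP BY user_id, restricted to the date range.
totals = defaultdict(int)
for user_id, time_taken, created_at in scores:
    if start <= created_at <= end:
        totals[user_id] += time_taken

# Step 2: median over the per-user totals (computed only once).
median_total = median(totals.values())
print(median_total)
```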