Is there a better way to calculate the median (not average)


Question

Suppose I have the following table definition:

CREATE TABLE x (i serial primary key, value integer not null);

I want to calculate the MEDIAN of value (not the AVG). The median is the value that divides the sorted set into two subsets containing the same number of elements. If the number of elements is even, the median is the average of the largest value in the lower half and the smallest value in the upper half. (See Wikipedia for more details.)

Here is how I currently calculate the MEDIAN, but I suspect there must be a better way:

SELECT AVG(values_around_median) AS median
  FROM (
    SELECT
       DISTINCT(CASE WHEN FIRST_VALUE(above) OVER w2 THEN MIN(value) OVER w3 ELSE MAX(value) OVER w2 END)
        AS values_around_median
      FROM (
        SELECT LAST_VALUE(value) OVER w AS value,
               SUM(COUNT(*)) OVER w > (SELECT count(*)/2 FROM x) AS above
          FROM x
          GROUP BY value
          WINDOW w AS (ORDER BY value)
          ORDER BY value
        ) AS find_if_values_are_above_or_below_median
      WINDOW w2 AS (PARTITION BY above ORDER BY value DESC),
             w3 AS (PARTITION BY above ORDER BY value ASC)
    ) AS find_values_around_median

Any ideas?

Recommended answer

Indeed there IS an easier way. In Postgres you can define your own aggregate functions. A while back I posted functions for median, as well as mode and range, to the PostgreSQL snippets library:

http://wiki.postgresql.org/wiki/Aggregate_Median
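
To illustrate what a user-defined median aggregate might look like, here is a minimal sketch. It is not copied from the wiki page linked above, which may differ in detail. The names median_final and median are chosen for this example only, and the sketch assumes the input can be cast to numeric and contains no NULLs (true for the table x from the question).

```sql
-- Final function: receives all collected values as an array and returns
-- the middle element, or the mean of the two middle elements.
-- Assumption: the array holds no NULLs (x.value is declared NOT NULL).
CREATE OR REPLACE FUNCTION median_final(numeric[])
RETURNS numeric AS
$$
    SELECT avg(v)
    FROM (
        SELECT v
        FROM unnest($1) AS t(v)
        ORDER BY v
        -- odd count: keep 1 row; even count: keep the 2 middle rows
        LIMIT 2 - array_length($1, 1) % 2
        -- integer division skips the rows below the middle
        OFFSET (array_length($1, 1) - 1) / 2
    ) AS middle;
$$ LANGUAGE sql IMMUTABLE;

-- Aggregate: accumulate every input value into an array,
-- then let median_final pick the middle of the sorted values.
CREATE AGGREGATE median(numeric) (
    SFUNC     = array_append,
    STYPE     = numeric[],
    FINALFUNC = median_final,
    INITCOND  = '{}'
);

-- Usage against the table from the question
SELECT median(value) AS median FROM x;
```

This collects all values in memory before sorting, so it suits moderately sized groups. On PostgreSQL 9.4 and later, the built-in ordered-set aggregate percentile_cont(0.5) WITHIN GROUP (ORDER BY value) should give the same median without any custom function.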
