Percentile using SQL

Problem description

I have 3 columns in my data set:

  1. Monetary
  2. Recency
  3. Frequency

I want to create 3 more columns, R_Q, F_Q, and M_Q, containing the percentile value of each value of Recency, Frequency, and Monetary, using SQL.

Thanks in advance.

Customer_ID  Frequency  Recency  Monetary  R_Q    F_Q     M_Q
112          1          39       7.05      0.398  0.789   0.85873
143          1          23       0.1833    0.232  0.7895  0.1501
164          1          52       0.416     0.508  0.789   0.295
123          1          118      1.1       0.98   0.789   0.52

Recommended answer

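The function you are looking for is the ANSI standard function ntile(). With ntile(100), the rows are split, in the given order, into 100 buckets of as equal size as possible, and each row gets its bucket number from 1 to 100: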

select t.*,
       ntile(100) over (order by monetary) as percentile_monetary,
       ntile(100) over (order by recency) as percentile_recency,
       ntile(100) over (order by frequency) as percentile_frequency
from t;

This is available in most databases.
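To see how ntile() behaves on a small table, here is a minimal sketch that loads the four sample rows from the question into an in-memory SQLite database through Python's sqlite3 module. SQLite, the table layout, and the helper name `ntile_recency` are assumptions made for illustration; the original answer does not name a database, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Sample rows from the question: (customer_id, frequency, recency, monetary).
ROWS = [
    (112, 1, 39, 7.05),
    (143, 1, 23, 0.1833),
    (164, 1, 52, 0.416),
    (123, 1, 118, 1.1),
]


def ntile_recency(buckets):
    """Return {customer_id: bucket} for ntile(buckets) ordered by recency.

    Hypothetical helper for this example, not part of the original answer.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table t (customer_id integer, frequency integer, "
        "recency integer, monetary real)"
    )
    conn.executemany("insert into t values (?, ?, ?, ?)", ROWS)
    # The bucket count is inlined as a literal integer.
    query = (
        f"select customer_id, ntile({int(buckets)}) over (order by recency) "
        "from t"
    )
    result = dict(conn.execute(query).fetchall())
    conn.close()
    return result


if __name__ == "__main__":
    print(ntile_recency(100))
    print(ntile_recency(2))
```

With only four rows, ntile(100) can hand out just the bucket numbers 1 through 4, so the result reads as a percentile only when the table has at least about 100 rows. ntile() also fills buckets purely by row position, so rows with equal values may end up in different buckets.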

Alternatively, you can calculate the percentile yourself using rank() and count(). Depending on how you want to handle ties and whether you want values from 1-100 or 0-100, the following should be a good starting point:

select t.*,
       (1 + rank_monetary * 100.0 / cnt) as percentile_monetary,
       (1 + rank_recency * 100.0 / cnt) as percentile_recency,
       (1 + rank_frequency * 100.0 / cnt) as percentile_frequency
from (select t.*,
             count(*) over () as cnt,
             rank() over (order by monetary) - 1 as rank_monetary,
             rank() over (order by recency) - 1 as rank_recency,
             rank() over (order by frequency) - 1 as rank_frequency
      from t
     ) t;
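As a rough check of the rank() and count() approach, the sketch below runs the same calculation for the recency column on the question's four sample rows, again using an in-memory SQLite database from Python (an assumption; SQLite 3.25 or newer is needed for window functions). It also selects the standard percent_rank() window function for comparison, which is not part of the original answer; it returns a fraction between 0 and 1, closer in scale to the values in the question's sample table. The helper name `recency_percentiles` is hypothetical.

```python
import sqlite3

# Sample rows from the question: (customer_id, frequency, recency, monetary).
ROWS = [
    (112, 1, 39, 7.05),
    (143, 1, 23, 0.1833),
    (164, 1, 52, 0.416),
    (123, 1, 118, 1.1),
]

# Same shape as the answer's query, restricted to recency, plus
# percent_rank() as a built-in point of comparison (an addition here).
QUERY = """
select customer_id,
       (1 + rank_recency * 100.0 / cnt) as percentile_recency,
       percent_rank_recency
from (select t.*,
             count(*) over () as cnt,
             rank() over (order by recency) - 1 as rank_recency,
             percent_rank() over (order by recency) as percent_rank_recency
      from t
     ) t
order by recency
"""


def recency_percentiles():
    """Return (customer_id, percentile_recency, percent_rank_recency) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table t (customer_id integer, frequency integer, "
        "recency integer, monetary real)"
    )
    conn.executemany("insert into t values (?, ?, ?, ?)", ROWS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in recency_percentiles():
        print(row)
```

On four rows the rank-based expression steps by 100 / 4 = 25 per rank, starting at 1. Tied values share a rank and therefore a percentile, which is the main behavioral difference from ntile().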
