Percentile using SQL
Question
I have 3 columns in my data set:
- Monetary
- Recency
- Frequency
I want to create 3 more columns, M_Q, R_Q, and F_Q, containing the percentile of each value of Monetary, Recency, and Frequency, using SQL.
Thanks in advance.
Customer_ID  Frequency  Recency  Monetary  R_Q    F_Q     M_Q
112          1          39       7.05      0.398  0.789   0.85873
143          1          23       0.1833    0.232  0.7895  0.1501
164          1          52       0.416     0.508  0.789   0.295
123          1          118      1.1       0.98   0.789   0.52
Recommended answer
The function you are looking for is the ANSI standard function ntile():
-- ntile(100) splits the ordered rows into 100 buckets numbered 1 to 100
select t.*,
       ntile(100) over (order by monetary) as percentile_monetary,
       ntile(100) over (order by recency) as percentile_recency,
       ntile(100) over (order by frequency) as percentile_frequency
from t;
This is available in most databases that support window functions, for example PostgreSQL, SQL Server, Oracle, and MySQL 8.0 or later.
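To see what ntile(100) returns on a small table, here is a minimal sketch that runs the query against the four sample rows from the question. It assumes Python's built-in sqlite3 module linked against SQLite 3.25 or later (the first version with window functions); the column types are my guess, and the table name t follows the answer.

```python
import sqlite3

# Sample rows copied from the question: customer_id, frequency, recency, monetary.
rows = [
    (112, 1, 39, 7.05),
    (143, 1, 23, 0.1833),
    (164, 1, 52, 0.416),
    (123, 1, 118, 1.1),
]

# In-memory database; window functions need SQLite 3.25+ (an assumption here).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t (customer_id integer, frequency integer,"
    " recency integer, monetary real)"
)
conn.executemany("insert into t values (?, ?, ?, ?)", rows)

ntile_rows = conn.execute(
    """
    select customer_id,
           ntile(100) over (order by monetary)  as percentile_monetary,
           ntile(100) over (order by frequency) as percentile_frequency
    from t
    order by monetary
    """
).fetchall()

for row in ntile_rows:
    print(row)
```

Two things are worth noticing. With only four rows, ntile(100) simply numbers them 1 to 4 instead of spreading them across 1 to 100, so the result only reads as a percentile when the table has many more rows than buckets. And because ntile() assigns buckets by row position, the four tied frequency values still land in four different buckets, in an order the database is free to choose.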
You can also calculate the percentile using rank() and count(). Depending on how you want to handle ties and whether you want values from 1-100 or 0-100, the following should be a good starting point:
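-- Outer query: scale each 0-based rank by 100 / row count, then shift by 1.
-- Drop the "1 +" if you want the scale to start at 0 instead.
select t.*,
       (1 + rank_monetary * 100.0 / cnt) as percentile_monetary,
       (1 + rank_recency * 100.0 / cnt) as percentile_recency,
       (1 + rank_frequency * 100.0 / cnt) as percentile_frequency
from (select t.*,
             count(*) over () as cnt,  -- total number of rows
             -- rank() starts at 1 and gives tied values the same rank
             rank() over (order by monetary) - 1 as rank_monetary,
             rank() over (order by recency) - 1 as rank_recency,
             rank() over (order by frequency) - 1 as rank_frequency
      from t
     ) t;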
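As a rough check of how the rank()/count() version treats ties, the sketch below runs it on the four sample rows from the question. It assumes Python's built-in sqlite3 module with SQLite 3.25 or later; the column types are my guess, and only two of the three measures are selected to keep the output short.

```python
import sqlite3

# Sample rows copied from the question: customer_id, frequency, recency, monetary.
rows = [
    (112, 1, 39, 7.05),
    (143, 1, 23, 0.1833),
    (164, 1, 52, 0.416),
    (123, 1, 118, 1.1),
]

# In-memory database; window functions need SQLite 3.25+ (an assumption here).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t (customer_id integer, frequency integer,"
    " recency integer, monetary real)"
)
conn.executemany("insert into t values (?, ?, ?, ?)", rows)

rank_rows = conn.execute(
    """
    select customer_id,
           (1 + rank_monetary * 100.0 / cnt)  as percentile_monetary,
           (1 + rank_frequency * 100.0 / cnt) as percentile_frequency
    from (select t.*,
                 count(*) over () as cnt,
                 rank() over (order by monetary)  - 1 as rank_monetary,
                 rank() over (order by frequency) - 1 as rank_frequency
          from t
         ) t
    order by monetary
    """
).fetchall()

for row in rank_rows:
    print(row)
```

On these four rows the monetary column comes out as 1, 26, 51, and 76, while every row gets 1 for frequency, because rank() gives all tied values the same rank. Whether ties should share the lowest position like this is the kind of choice the answer leaves open.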