Which method does pandas use for percentile?
Question
I was trying to understand how pandas calculates lower/upper percentiles and got a bit confused. Here is the sample code and its output.
import pandas as pd
test = pd.Series([7, 15, 36, 39, 40, 41])
test.describe()
Output:
I am interested only in the 25% and 75% percentiles. Which method does pandas use to calculate them?
Following https://en.wikipedia.org/wiki/Quartile, the results are as follows:
So what statistical/mathematical method does pandas use to calculate percentiles?
Recommended answer
As I mentioned in the comments, I finally figured out how it works by trying the quantile function (from pandas.core.algorithms import quantile), as @Abdou suggested.
I am not good at explaining this in writing, so I will only walk through the 25% and 75% percentiles for the given example. Here is a brief (maybe poor) explanation:
For the example list [7, 15, 36, 39, 40, 41], each sorted value is assigned a quantile as follows:
7-> 0%
15-> 20%
36-> 40%
39-> 60%
40-> 80%
41-> 100%
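The mapping above can be reproduced with a short sketch. It assumes that the i-th sorted value (counting from 0) of n values sits at i / (n - 1), which matches the percentages listed for this example; the variable names are illustrative only and are not pandas internals.

```python
values = [7, 15, 36, 39, 40, 41]
n = len(values)

# Assumption: the i-th sorted value (0-based) sits at i / (n - 1),
# expressed here as a percentage.
positions = [(v, 100 * i / (n - 1)) for i, v in enumerate(sorted(values))]

for v, pct in positions:
    print(f"{v} -> {pct:.0f}%")
```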
Since we want the 25% percentile, it lies between 15 (at 20%) and 36 (at 40%). Because 25% = 20% + 5%, the value is 15 + (36-15)/4 = 15 + 5.25 = 20.25.
(36-15)/4 is used because the distance between 15 and 36 corresponds to 40% - 20% = 20%, so we divide it by 4 to get the step for 5%.
The 75% percentile is found the same way. It lies between 39 (at 60%) and 40 (at 80%), and 75% = 60% + 15%, so the value is:
39 + 3*(40-39)/4 = 39.75
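The two hand calculations can be checked with a small pure-Python sketch of linear interpolation between the two closest ranks. The helper name linear_percentile is hypothetical and written only for this illustration; it is not the pandas implementation.

```python
import math


def linear_percentile(values, q):
    """Percentile via linear interpolation between closest ranks (q in [0, 1]).

    Hypothetical helper for illustration, not pandas code.
    """
    data = sorted(values)
    pos = q * (len(data) - 1)   # fractional index into the sorted data
    lo = math.floor(pos)        # index of the value just below the target
    hi = math.ceil(pos)         # index of the value just above the target
    frac = pos - lo             # how far the target is from lo toward hi
    return data[lo] + (data[hi] - data[lo]) * frac


values = [7, 15, 36, 39, 40, 41]
q25 = linear_percentile(values, 0.25)
q75 = linear_percentile(values, 0.75)
print(q25, q75)
```

For this list, the sketch gives 20.25 and 39.75, the same values derived by hand. In pandas, Series.quantile takes an interpolation parameter whose default is 'linear', so pd.Series(values).quantile([0.25, 0.75]) should agree with these numbers.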
That's it. I am really sorry for the poor explanation.
NOTE: Thank you @shin for the correction mentioned in the comment.