Pass percentiles to pandas agg function

Question

I want to pass the numpy percentile() function through pandas' agg() function, as I do below with various other numpy statistics functions.

Right now I have a dataframe that looks like this:

AGGREGATE   MY_COLUMN
A           10
A           12
B           5
B           9
A           84
B           22

My code is as follows:

grouped = dataframe.groupby('AGGREGATE')
column = grouped['MY_COLUMN']
column.agg([np.sum, np.mean, np.std, np.median, np.var, np.min, np.max])

The above code works, but I want to do something like:

column.agg([np.sum, np.mean, np.percentile(50), np.percentile(95)])

i.e. specify various percentiles to return from agg().

How should this be done?

Recommended answer

Perhaps not super efficient, but one way would be to create a function yourself:

def percentile(n):
    def percentile_(x):
        return np.percentile(x, n)
    percentile_.__name__ = 'percentile_%s' % n
    return percentile_

Then include this in your agg:

In [11]: column.agg([np.sum, np.mean, np.std, np.median,
                     np.var, np.min, np.max, percentile(50), percentile(95)])
Out[11]:
           sum       mean        std  median          var  amin  amax  percentile_50  percentile_95
AGGREGATE
A          106  35.333333  42.158431      12  1777.333333    10    84             12           76.8
B           36  12.000000   8.888194       9    79.000000     5    22              9           20.7

Not sure this is how it should be done, though...
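For reference, here is a self-contained sketch of the approach above that can be run end to end. It rebuilds the sample dataframe from the question and reuses the same percentile() factory. One assumption not in the original: the built-in aggregations are passed as the strings 'sum' and 'mean' rather than np.sum and np.mean, which recent pandas versions tend to prefer. The percentile columns take their names from the __name__ set inside the factory.

```python
import numpy as np
import pandas as pd


def percentile(n):
    """Return an aggregation function that computes the n-th percentile."""
    def percentile_(x):
        return np.percentile(x, n)
    # agg() uses the function's __name__ as the output column label
    percentile_.__name__ = 'percentile_%s' % n
    return percentile_


# Sample data copied from the question
dataframe = pd.DataFrame({
    'AGGREGATE': ['A', 'A', 'B', 'B', 'A', 'B'],
    'MY_COLUMN': [10, 12, 5, 9, 84, 22],
})

column = dataframe.groupby('AGGREGATE')['MY_COLUMN']

# Strings for built-in aggregations (an assumption, see above),
# plus the custom percentile functions
result = column.agg(['sum', 'mean', percentile(50), percentile(95)])
print(result)
```

Each call to percentile(n) returns a new function with its own name, so several percentiles can sit side by side in one agg() call without their column labels colliding.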
