Calling Functions with Multiple Arguments when using Groupby

Problem description

When writing a function for use with groupby.apply or groupby.transform in pandas, if the function takes multiple arguments, the extra arguments are passed after the function name, separated by commas, rather than in parentheses. For example:

def Transfunc(df, arg1, arg2, arg3):
    return something

GroupedData.transform(Transfunc, arg1, arg2, arg3)

The df argument is passed automatically as the first argument.
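As a minimal sketch of that calling convention (the function name scale_and_shift and the toy data are made up for illustration), the extra positional arguments given to transform are forwarded to the function after the group's data:

```python
import pandas as pd

# Hypothetical helper: receives one group's values first, then the extra arguments.
def scale_and_shift(values, factor, offset):
    return values * factor + offset

frame = pd.DataFrame({"key": ["x", "x", "y"], "val": [1.0, 2.0, 3.0]})

# 10 and 1 are passed through to scale_and_shift as factor and offset.
shifted = frame.groupby("key")["val"].transform(scale_and_shift, 10, 1)
print(shifted.tolist())
```

The same arguments could also be passed by keyword (factor=10, offset=1), which tends to read more clearly.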

However, the same syntax does not seem to be possible when using a function to group the data. Take the following example:

import numpy as np
from pandas import DataFrame

people = DataFrame(np.random.randn(5, 5), columns=['a', 'b', 'c', 'd', 'e'], index=['Joe', 'Steve', 'Wes', 'Jim', 'Travis'])
# .ix has been removed from pandas; select the row at position 2 by its label instead
people.loc[people.index[2:3], ['b', 'c']] = np.nan

# Compare the value at index label Ind in the given column with that column's mean
# and return a group name depending on whether it is at or above the mean
def MeanPosition(Ind, df, Column):
    if df[Column][Ind] >= np.mean(df[Column]):
        return 'Greater Group'
    else:
        return 'Lesser Group'

people.groupby(lambda x: MeanPosition(x, people, 'a')).mean()

The above works just fine, but I can't understand why I have to wrap the function in a lambda. Based upon the syntax used with transform and apply it seems to me that the following should work just fine:

people.groupby(MeanPosition, people, 'a').mean()

Can anyone tell me why, or how I can call the function without wrapping it in a lambda?

Thanks

I do not think it is possible to group the data by passing a function that needs extra arguments as the key without wrapping that function in a lambda. One possible workaround is to pass an array created by a function, rather than the function itself, as the key. This would work as follows:

def MeanPositionList(df, Column):
    return ['Greater Group' if df[Column][row] >= np.mean(df[Column]) else 'Lesser Group' for row in df.index]

Grouped = people.groupby(np.array(MeanPositionList(people, 'a')))
Grouped.mean()

But then, of course, it might be better to cut out the middleman function altogether and simply use an array built with a list comprehension.
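A possible sketch of that idea, swapping the list comprehension for a vectorized np.where call (my substitution, not part of the original post) and using small fixed data so the result is reproducible:

```python
import numpy as np
import pandas as pd

# Toy data, assumed for illustration only.
people = pd.DataFrame(
    {"a": [1.0, 2.0, 3.0, 6.0], "b": [10.0, 20.0, 30.0, 40.0]},
    index=["Joe", "Steve", "Wes", "Jim"],
)

# Build one label per row by comparing column 'a' with its own mean.
labels = np.where(people["a"] >= people["a"].mean(), "Greater Group", "Lesser Group")

# An array with one entry per row can be used directly as the grouping key.
result = people.groupby(labels).mean()
print(result)
```

Because the labels are computed once for the whole column, the mean is not recomputed for every row as it is in the per-row function.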

Recommended answer

Passing arguments to apply just happens to work, because apply passes on all arguments to the target function.

However, groupby takes multiple arguments of its own (see its documentation), so it's not possible to differentiate between its arguments and ones meant for your function; passing a lambda or a named function is more explicit and is the way to go.
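If the goal is only to avoid writing a lambda, one option (a sketch, not something the answer itself proposes) is functools.partial from the standard library, which binds the extra arguments ahead of time and leaves a one-argument callable for groupby to call with each index label. The function name mean_position and the data below are illustrative assumptions:

```python
from functools import partial

import pandas as pd

# Hypothetical key function: label is supplied by groupby, the rest are bound below.
def mean_position(label, df, column):
    if df.loc[label, column] >= df[column].mean():
        return "Greater Group"
    return "Lesser Group"

scores = pd.DataFrame({"a": [1.0, 2.0, 3.0, 6.0]}, index=["Joe", "Steve", "Wes", "Jim"])

# Bind df and column by keyword so the first positional slot stays free for the label.
key_func = partial(mean_position, df=scores, column="a")
group_means = scores.groupby(key_func).mean()
print(group_means)
```

This is equivalent in effect to the lambda; it simply makes the bound arguments explicit in a reusable object.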

Here is how to do what I think you want (slightly modified, as all the groups in your example are distinct):

In [22]: def f(x):
   ....:     result = Series('Greater',index=x.index)
   ....:     result[x<x.mean()] = 'Lesser'
   ....:     return result
   ....: 

In [25]: df = DataFrame(np.random.randn(5, 5), columns=['a', 'b', 'c', 'd', 'e'], index=['Joe', 'Joe', 'Wes', 'Wes', 'Travis'])

In [26]: df
Out[26]: 
               a         b         c         d         e
Joe    -0.293926  1.006531  0.289749 -0.186993 -0.009843
Joe    -0.228721 -0.071503  0.293486  1.126972 -0.808444
Wes     0.022887 -1.813960  1.195457  0.216040  0.287745
Wes    -1.520738 -0.303487  0.484829  1.644879  1.253210
Travis -0.061281 -0.517140  0.504645 -1.844633  0.683103

In [27]: df.groupby(df.index.values).transform(f)
Out[27]: 
              a        b        c        d        e
Joe      Lesser  Greater   Lesser   Lesser  Greater
Joe     Greater   Lesser  Greater  Greater   Lesser
Travis  Greater  Greater  Greater  Greater  Greater
Wes     Greater   Lesser  Greater   Lesser   Lesser
Wes      Lesser  Greater   Lesser  Greater  Greater
