Pandas groupby/apply has different behaviour with int and string types


Question

I have the following dataframe

   X    Y
0  A   10
1  A    9
2  A    8
3  A    5
4  B  100
5  B   90
6  B   80
7  B   50

and two different functions that are very similar:

def func1(x):
    if x.iloc[0]['X'] == 'A':
        x['D'] = 1
    else:
        x['D'] = 0
    return x[['X', 'D']]

def func2(x):
    if x.iloc[0]['X'] == 'A':
        x['D'] = 'u'
    else:
        x['D'] = 'v'
    return x[['X', 'D']]

Now I can groupby/apply these functions:

df.groupby('X').apply(func1)
df.groupby('X').apply(func2)

The first line gives me what I want, i.e.

   X  D
0  A  1
1  A  1
2  A  1
3  A  1
4  B  0
5  B  0
6  B  0
7  B  0

But the second line returns something quite strange:

   X  D
0  A  u
1  A  u
2  A  u
3  A  u
4  A  u
5  A  u
6  A  u
7  A  u

So my questions are:

  • Can anybody explain why the behavior of groupby/apply is different when the type changes?
  • How can I get something similar with func2?

Recommended answer

The problem is simply that a function applied to a GroupBy should never modify the dataframe it receives. Whether that dataframe is a copy (which can safely be changed, although the changes will not be seen in the original dataframe) or a view is implementation dependent. pandas makes that choice internally, so as a user you should simply treat modifying the group in place as forbidden.

The correct way is to force a copy:

def func2(x):
    x = x.copy()
    if x.iloc[0]['X'] == 'A':
        x['D'] = 'u'
    else:
        x['D'] = 'v'
    return x[['X', 'D']]

After that, df.groupby('X').apply(func2).reset_index(level=0, drop=True) gives the expected result:

   X  D
0  A  u
1  A  u
2  A  u
3  A  u
4  B  v
5  B  v
6  B  v
7  B  v
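
As a complement to the copy-based fix, here is a minimal sketch of two ways to get the same column without ever writing to the group that is passed in. It assumes the dataframe from the question. The name func2_no_mutation is hypothetical, and the mapping {'A': 'u', 'B': 'v'} is just the rule from func2 restated for the two keys present in the sample data.

```python
import pandas as pd

# Rebuild the dataframe from the question.
df = pd.DataFrame({
    "X": ["A"] * 4 + ["B"] * 4,
    "Y": [10, 9, 8, 5, 100, 90, 80, 50],
})

def func2_no_mutation(group):
    # Hypothetical variant of func2: assign() returns a new dataframe,
    # so the group passed in is never written to.
    label = "u" if group["X"].iloc[0] == "A" else "v"
    return group.assign(D=label)[["X", "D"]]

# Call the function on one group by hand to show the input stays untouched.
group_a = df[df["X"] == "A"]
result_a = func2_no_mutation(group_a)

# Vectorized alternative that skips groupby/apply entirely:
# map each key in X to its label and attach it as column D.
labels = df["X"].map({"A": "u", "B": "v"})
result = df[["X"]].assign(D=labels)
print(result)
```

When the new column depends only on the group key, as it does here, the mapping version is usually the simpler option. The assign-based function is closer to the original func2 and would be the one to pass to groupby/apply.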
