Pandas groupby/apply has different behaviour with int and string types

Problem description

I have the following dataframe

   X    Y
0  A   10
1  A    9
2  A    8
3  A    5
4  B  100
5  B   90
6  B   80
7  B   50
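
For anyone who wants to reproduce the question, the dataframe above could be built roughly as follows. This is only a sketch: the original post does not show how `df` was created, so the constructor call here is an assumption that simply matches the printed values.

```python
import pandas as pd

# Assumed construction; the original post only shows the printed frame.
df = pd.DataFrame({
    "X": ["A"] * 4 + ["B"] * 4,            # group label
    "Y": [10, 9, 8, 5, 100, 90, 80, 50],   # integer values
})

print(df)
```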

and two different functions that are very similar

def func1(x):
    if x.iloc[0]['X'] == 'A':
        x['D'] = 1
    else:
        x['D'] = 0
    return x[['X', 'D']]

def func2(x):
    if x.iloc[0]['X'] == 'A':
        x['D'] = 'u'
    else:
        x['D'] = 'v'
    return x[['X', 'D']]

Now I can groupby/apply these functions

df.groupby('X').apply(func1)
df.groupby('X').apply(func2)

The first line gives me what I want, i.e.

   X  D
0  A  1
1  A  1
2  A  1
3  A  1
4  B  0
5  B  0
6  B  0
7  B  0

But the second line returns something quite strange

   X  D
0  A  u
1  A  u
2  A  u
3  A  u
4  A  u
5  A  u
6  A  u
7  A  u

So my questions are:

  • Can anybody explain why the behavior of groupby/apply is different when the type changes?
  • How can I get something similar with func2?

Recommended answer

The problem is simply that a function applied to a GroupBy should never try to modify the dataframe it receives. Whether that dataframe is a copy (which can safely be changed, although the changes will not be seen in the original dataframe) or a view is implementation-dependent. The choice is made internally by pandas, and as a user you should simply treat in-place modification as forbidden.
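
To illustrate the copy half of that statement, here is a minimal sketch using a small made-up frame (the names `original` and `duplicate` are hypothetical and not from the question). It only shows that changes made to an explicit copy never reach the frame it was copied from; it does not try to reproduce the view case, which depends on pandas internals.

```python
import pandas as pd

# Hypothetical two-row frame, just for illustration.
original = pd.DataFrame({"X": ["A", "A"], "Y": [10, 9]})

# An explicit copy owns its own data.
duplicate = original.copy()
duplicate["D"] = "u"  # adds a column to the copy only

print(list(original.columns))   # the original has no 'D' column
print(list(duplicate.columns))  # the copy has it
```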

The correct way is to force a copy:

def func2(x):
    x = x.copy()
    if x.iloc[0]['X'] == 'A':
        x['D'] = 'u'
    else:
        x['D'] = 'v'
    return x[['X', 'D']]

Then df.groupby('X').apply(func2).reset_index(level=0, drop=True) gives the expected result:

   X  D
0  A  u
1  A  u
2  A  u
3  A  u
4  B  v
5  B  v
6  B  v
7  B  v
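
As a possible alternative to copying, the per-group function can avoid in-place assignment altogether by returning a new frame. The sketch below is not part of the original answer: `label_group` is a hypothetical helper, the dataframe construction is assumed from the printed values in the question, and the groups are iterated explicitly rather than passed through `apply`. `DataFrame.assign` returns a new object, so the group that was passed in is never modified.

```python
import pandas as pd

# Assumed reconstruction of the dataframe from the question.
df = pd.DataFrame({
    "X": ["A"] * 4 + ["B"] * 4,
    "Y": [10, 9, 8, 5, 100, 90, 80, 50],
})

def label_group(group: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return a new frame, leaving `group` untouched."""
    label = "u" if group["X"].iloc[0] == "A" else "v"
    # assign() builds and returns a new DataFrame with the extra column.
    return group[["X"]].assign(D=label)

# Iterate over the groups explicitly and stitch the pieces back together.
parts = [label_group(group) for _, group in df.groupby("X")]
result = pd.concat(parts).sort_index()

print(result)
```

Because nothing is written back into the groups, the outcome should not depend on whether pandas hands over a copy or a view.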
