Pandas - Groupby with conditional formula


Question

   Survived  SibSp  Parch
0         0      1      0
1         1      1      0
2         1      0      0
3         1      1      0
4         0      0      1

Given the above dataframe, is there an elegant way to group by a condition? I want to split the data into two groups based on the following conditions:

(df['SibSp'] > 0) | (df['Parch'] > 0)    = New Group - "Has Family"
(df['SibSp'] == 0) & (df['Parch'] == 0)  = New Group - "No Family"

Then take the mean of Survived for each of these groups and end up with output like this:

               SurvivedMean
 Has Family    Mean
 No Family     Mean

Can it be done using groupby alone, or would I have to append a new column using the above conditional statement?

Recommended answer

An easy way to form the groups is to use the sum of those two columns. Assuming both columns hold non-negative integer counts, as in the sample data, the sum is at least 1 exactly when either of them is positive. groupby also accepts an arbitrary array, as long as its length matches the DataFrame's, so you don't need to add a new column.

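import numpy as np

# Label each row by whether the combined SibSp + Parch count is at least 1
family = np.where((df['SibSp'] + df['Parch']) >= 1, 'Has Family', 'No Family')
# Group by the label array directly; no new column is added to df
df.groupby(family)['Survived'].mean()
Out: 
Has Family    0.5
No Family     1.0
Name: Survived, dtype: float64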
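
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the five sample rows from the question and groups on the question's own boolean condition rather than on the column sum. The variable names `has_family`, `labels`, and `result` are illustrative choices, not part of the original answer, and `SurvivedMean` is simply the column name the question asked for.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "Survived": [0, 1, 1, 1, 0],
    "SibSp":    [1, 1, 0, 1, 0],
    "Parch":    [0, 0, 0, 0, 1],
})

# Boolean mask taken directly from the question's "Has Family" condition.
has_family = (df["SibSp"] > 0) | (df["Parch"] > 0)

# Turn the mask into readable labels; this array has one entry per row,
# so groupby can use it without a new column being added to df.
labels = np.where(has_family, "Has Family", "No Family")

# Mean of Survived per group, shaped like the output the question asked for.
result = df.groupby(labels)["Survived"].mean().to_frame("SurvivedMean")
print(result)
```

On the sample data this gives 0.5 for "Has Family" and 1.0 for "No Family", matching the output above. Using the boolean condition instead of the sum avoids relying on the columns being non-negative.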
