Pandas - Groupby with conditional formula
Question
   Survived  SibSp  Parch
0         0      1      0
1         1      1      0
2         1      0      0
3         1      1      0
4         0      0      1
Given the above dataframe, is there an elegant way to groupby
with a condition?
I want to split the data into two groups based on the following conditions:
(df['SibSp'] > 0) | (df['Parch'] > 0) = New Group - "Has Family"
(df['SibSp'] == 0) & (df['Parch'] == 0) = New Group - "No Family"
then take the means of both of these groups and end up with an output like this:
            SurvivedMean
Has Family          Mean
No Family           Mean
Can it be done using groupby or would I have to append a new column using the above conditional statement?
Recommended answer
An easy way to form the groups is to use the sum of those two columns. Both are non-negative counts, so the sum is at least 1 whenever either of them is positive, and 0 only when both are 0. groupby also accepts an arbitrary array, as long as its length matches the DataFrame's length, so you don't need to add a new column.
import numpy as np
# Label each row by whether the combined family count is at least 1.
family = np.where((df['SibSp'] + df['Parch']) >= 1, 'Has Family', 'No Family')
df.groupby(family)['Survived'].mean()
Out:
Has Family 0.5
No Family 1.0
Name: Survived, dtype: float64
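For reference, here is a self-contained sketch that rebuilds the sample frame from the question and groups by the question's own boolean condition instead of the sum. The variable names `has_family`, `labels`, and `result` are illustrative choices, not part of the original answer. Using the condition directly gives the same groups here, and it would also hold up if the columns could ever contain negative values, where the sum shortcut would not.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Survived": [0, 1, 1, 1, 0],
    "SibSp":    [1, 1, 0, 1, 0],
    "Parch":    [0, 0, 0, 0, 1],
})

# Boolean mask written as in the question's "Has Family" condition.
has_family = (df["SibSp"] > 0) | (df["Parch"] > 0)

# Turn True/False into readable labels. The array is passed to groupby
# directly, so no new column is added to df.
labels = np.where(has_family, "Has Family", "No Family")

result = df.groupby(labels)["Survived"].mean()
print(result)
```

If you would rather keep the labels around for later use, assigning them to a new column (for example `df["Family"] = labels`) and grouping by that column name should give the same means.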