How does .apply work on a Pandas DataFrame.groupby?
Question
                    Count
    League  Result
    EPL     H          16
            D           9
            A          10
    Champ   H          67
            D          15
            A          57
            H          87
    La Liga D          35
            A          40
I have a breakdown of football results for different leagues, with a count of how many times each result occurred.
I want to see home wins, draws and away wins as a percentage of the total games played in each league. I have seen this solution:
    df.groupby("League").apply(lambda g: g / g.sum() * 100)
At first glance this made sense, but what exactly is g here? I assumed it was a single H, D or A count, and that g.sum() summed all of the H, D and A counts for each league. But if g is just a value, how can we call the method g.sum() on it?
Recommended answer
g is a DataFrame. Because you group on 'League', the DataFrame is split into separate chunks, one for each unique value of 'League'. To illustrate this, we can iterate over the GroupBy object.
    for idx, g in df.groupby('League'):  # `idx` is the unique group key
        print(g, '\n')

                   Count
    League Result
    Champ  H          67
           D          15
           A          57
           H          87

                   Count
    League Result
    EPL    H          16
           D           9
           A          10

                    Count
    League  Result
    La Liga D          35
            A          40
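To try the loop above yourself, the table from the question can be rebuilt by hand. The sketch below is one way to do that; it assumes 'League' and 'Result' are the two levels of a MultiIndex and 'Count' is the only column, which is what the printed output suggests but is not stated in the question. The `groups` dictionary is just an illustrative helper.

```python
import pandas as pd

# Rebuild the table from the question (assumed layout: MultiIndex + one column).
index = pd.MultiIndex.from_tuples(
    [
        ("EPL", "H"), ("EPL", "D"), ("EPL", "A"),
        ("Champ", "H"), ("Champ", "D"), ("Champ", "A"), ("Champ", "H"),
        ("La Liga", "D"), ("La Liga", "A"),
    ],
    names=["League", "Result"],
)
df = pd.DataFrame({"Count": [16, 9, 10, 67, 15, 57, 87, 35, 40]}, index=index)

# Iterating a GroupBy yields (key, chunk) pairs; collect them to inspect each chunk.
groups = {key: g for key, g in df.groupby("League")}

for key, g in groups.items():
    # Each chunk is itself a DataFrame holding only the rows of one league.
    print(key, type(g).__name__, g.shape)
```

Each value in `groups` is a full DataFrame rather than a single number, which is why methods such as `g.sum()` are available on it.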
apply then calls your function on each of these DataFrames separately. Calling g.sum() gives you a Series containing the sum of each column within the group.
    for idx, g in df.groupby('League'):
        print(g.sum(), '\n')

    Count    226
    dtype: int64

    Count    35
    dtype: int64

    Count    75
    dtype: int64
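Putting the two pieces together: dividing a group's DataFrame by its column sums divides every Count by that league's total, which is what the lambda in the question computes. The sketch below spells this out with an explicit loop that mimics what apply does conceptually (it is not pandas' internal implementation), and then shows `transform("sum")` as an alternative route to the same percentages. The data layout and the names `to_percent`, `by_hand`, `league_totals` and `pct` are assumptions made for illustration.

```python
import pandas as pd

# Same assumed layout as the question: MultiIndex (League, Result), one column.
index = pd.MultiIndex.from_tuples(
    [
        ("EPL", "H"), ("EPL", "D"), ("EPL", "A"),
        ("Champ", "H"), ("Champ", "D"), ("Champ", "A"), ("Champ", "H"),
        ("La Liga", "D"), ("La Liga", "A"),
    ],
    names=["League", "Result"],
)
df = pd.DataFrame({"Count": [16, 9, 10, 67, 15, 57, 87, 35, 40]}, index=index)


def to_percent(g):
    # g is the sub-DataFrame for one league; g.sum() is a Series with one
    # entry per column, so the division is done column by column.
    return g / g.sum() * 100


# Conceptually what apply does: run the function on every chunk, then
# glue the per-group results back together.
pieces = [to_percent(g) for _, g in df.groupby("League")]
by_hand = pd.concat(pieces)

# Alternative: broadcast each league's total back onto the original rows.
league_totals = df.groupby("League")["Count"].transform("sum")
pct = df["Count"] / league_totals * 100

print(by_hand.round(1))
print(pct.round(1))
```

The `transform` version keeps the original row order and index, which can be convenient when the percentages are to be stored as a new column.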