python pandas groupby()结果 [英] python pandas groupby() result
问题描述
我有以下python熊猫数据框:
I have the following python pandas data frame:
df = pd.DataFrame( {
'A': [1,1,1,1,2,2,2,3,3,4,4,4],
'B': [5,5,6,7,5,6,6,7,7,6,7,7],
'C': [1,1,1,1,1,1,1,1,1,1,1,1]
} );
df
A B C
0 1 5 1
1 1 5 1
2 1 6 1
3 1 7 1
4 2 5 1
5 2 6 1
6 2 6 1
7 3 7 1
8 3 7 1
9 4 6 1
10 4 7 1
11 4 7 1
我想要另一列存储值对于固定的(都)A和B的C值的总和。就是这样的:
I would like to have another column storing a value of a sum over C values for fixed (both) A and B. That is, something like:
A B C D
0 1 5 1 2
1 1 5 1 2
2 1 6 1 1
3 1 7 1 1
4 2 5 1 1
5 2 6 1 2
6 2 6 1 2
7 3 7 1 2
8 3 7 1 2
9 4 6 1 1
10 4 7 1 2
11 4 7 1 2
我尝试过熊猫 groupby
它的作品:
I have tried with pandas groupby
and it kind of works:
res = {}
for a, group_by_A in df.groupby('A'):
group_by_B = group_by_A.groupby('B', as_index = False)
res[a] = group_by_B['C'].sum()
,但我不知道如何将结果从 res
转换为 df
以有序的方式。对此有任何建议会很高兴。谢谢。
but I don't know how to 'get' the results from res
into df
in the orderly fashion. Would be very happy with any advice on this. Thank you.
推荐答案
这里有一种方法(尽管它觉得这应该一次适用,我无法得到它)。
Here's one way (though it feels this should work in one go with an apply, I can't get it).
In [11]: g = df.groupby(['A', 'B'])
In [12]: df1 = df.set_index(['A', 'B'])
大小
groupby函数是你想要的,我们必须将它匹配到'A'和'B'作为索引:
The size
groupby function is the one you want, we have to match it to the 'A' and 'B' as the index:
In [13]: df1['D'] = g.size() # unfortunately this doesn't play nice with as_index=False
# Same would work with g['C'].sum()
In [14]: df1.reset_index()
Out[14]:
A B C D
0 1 5 1 2
1 1 5 1 2
2 1 6 1 1
3 1 7 1 1
4 2 5 1 1
5 2 6 1 2
6 2 6 1 2
7 3 7 1 2
8 3 7 1 2
9 4 6 1 1
10 4 7 1 2
11 4 7 1 2
这篇关于python pandas groupby()结果的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!