Pandas - dataframe groupby - how to get sum of multiple columns
Question
This should be an easy one, but somehow I couldn't find a solution that works.
I have a pandas dataframe which looks like this:
index col1 col2 col3 col4 col5
0 a c 1 2 f
1 a c 1 2 f
2 a d 1 2 f
3 b d 1 2 g
4 b e 1 2 g
5 b e 1 2 g
I want to group by col1 and col2 and get the sum() of col3 and col4. col5 can be dropped, since its data cannot be aggregated.
Here is how the output should look. I am interested in having both col3 and col4 in the resulting dataframe. It doesn't really matter whether col1 and col2 are part of the index or not.
index col1 col2 col3 col4
0 a c 2 4
1 a d 1 2
2 b d 1 2
3 b e 2 4
Here is what I tried:
df_new = df.groupby(['col1', 'col2'])["col3", "col4"].sum()
That, however, only returns the aggregated results of col4.
I am lost here. Every example I found only aggregates one column, where the issue obviously doesn't occur.
Recommended answer
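By using apply (note the double brackets: the columns are selected with a list, since selecting with a bare tuple such as ["col3", "col4"] is rejected by pandas 2.0 and later):
df.groupby(['col1', 'col2'])[["col3", "col4"]].apply(lambda x : x.astype(int).sum())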
Out[1257]:
col3 col4
col1 col2
a c 2 4
d 1 2
b d 1 2
e 2 4
If you want to use agg:
df.groupby(['col1', 'col2']).agg({'col3':'sum','col4':'sum'})
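For reference, here is a minimal, self-contained sketch that rebuilds the dataframe from the question and sums both columns directly. It assumes col3 and col4 are already numeric, in which case the apply/astype step should not be needed. The variable names summed and flat are only illustrative, and as_index=False is one possible way to keep col1 and col2 as ordinary columns rather than as the index.

```python
import pandas as pd

# Sample data reconstructed from the table in the question.
df = pd.DataFrame({
    "col1": ["a", "a", "a", "b", "b", "b"],
    "col2": ["c", "c", "d", "d", "e", "e"],
    "col3": [1, 1, 1, 1, 1, 1],
    "col4": [2, 2, 2, 2, 2, 2],
    "col5": ["f", "f", "f", "g", "g", "g"],
})

# Select the columns to aggregate with a list (double brackets),
# then sum within each (col1, col2) group. col5 is simply left out.
summed = df.groupby(["col1", "col2"])[["col3", "col4"]].sum()

# Same aggregation, but with the group keys kept as regular columns
# instead of a MultiIndex.
flat = df.groupby(["col1", "col2"], as_index=False)[["col3", "col4"]].sum()

print(summed)
print(flat)
```

Calling summed.reset_index() should give the same flat layout as the as_index=False variant.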