Pandas sum by groupby, but exclude certain columns
Question
What is the best way to do a groupby on a Pandas DataFrame while excluding some columns from the aggregation? For example, I have the following DataFrame:
Code  Country      Item_Code  Item   Ele_Code  Unit  Y1961  Y1962  Y1963
2     Afghanistan  15         Wheat  5312      Ha    10     20     30
2     Afghanistan  25         Maize  5312      Ha    10     20     30
4     Angola       15         Wheat  7312      Ha    30     40     50
4     Angola       25         Maize  7312      Ha    30     40     50
I want to group by the Country and Item_Code columns and compute the sum only over the Y1961, Y1962 and Y1963 columns. The resulting DataFrame should look like this:
Code  Country      Item_Code  Item  Ele_Code  Unit  Y1961  Y1962  Y1963
2     Afghanistan  15         C3    5312      Ha    20     40     60
4     Angola       25         C4    7312      Ha    60     80     100
Right now I am doing this:
df.groupby('Country').sum()
However, this adds up the values in the Item_Code column as well. Is there any way I can specify which columns to include in the sum() operation and which ones to exclude?
Recommended answer
You can select the columns to aggregate from the groupby object before calling sum():
In [11]: df.groupby(['Country', 'Item_Code'])[["Y1961", "Y1962", "Y1963"]].sum()
Out[11]:
                       Y1961  Y1962  Y1963
Country     Item_Code
Afghanistan 15            10     20     30
            25            10     20     30
Angola      15            30     40     50
            25            30     40     50
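For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the sample rows in the question; the variable names (year_cols, by_country_item, by_country) are only illustrative. It also shows grouping by Country alone, which, as far as I can tell, is what produces the totals (20/40/60 and 60/80/100) shown in the question's desired output.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Code": [2, 2, 4, 4],
    "Country": ["Afghanistan", "Afghanistan", "Angola", "Angola"],
    "Item_Code": [15, 25, 15, 25],
    "Item": ["Wheat", "Maize", "Wheat", "Maize"],
    "Ele_Code": [5312, 5312, 7312, 7312],
    "Unit": ["Ha", "Ha", "Ha", "Ha"],
    "Y1961": [10, 10, 30, 30],
    "Y1962": [20, 20, 40, 40],
    "Y1963": [30, 30, 50, 50],
})

# Only these columns are summed; every other column is left out.
year_cols = ["Y1961", "Y1962", "Y1963"]

# Group by both keys: one row per (Country, Item_Code) pair.
by_country_item = df.groupby(["Country", "Item_Code"])[year_cols].sum()

# Group by Country only: one row per country, with Country kept
# as a regular column instead of the index.
by_country = df.groupby("Country", as_index=False)[year_cols].sum()

print(by_country_item)
print(by_country)
```

Because Item_Code is either a grouping key or simply not selected, it is never summed in either variant.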
Note that the list you pass must be a subset of the DataFrame's columns; otherwise you will see a KeyError.
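A small sketch of that failure mode, and one possible way to guard against it. The column Y1964 is hypothetical and deliberately absent from the frame; filtering the list against df.columns is just one defensive option, not something the original answer prescribes.

```python
import pandas as pd

df = pd.DataFrame({
    "Country": ["Angola", "Angola"],
    "Item_Code": [15, 25],
    "Y1961": [30, 30],
})

# "Y1964" is a made-up name that does not exist in df.
wanted = ["Y1961", "Y1964"]

try:
    df.groupby("Country")[wanted].sum()
    raised = False
except KeyError as exc:
    # Selecting a missing column on the groupby object raises KeyError.
    raised = True
    print("KeyError:", exc)

# One defensive option: keep only the names that are actually present.
present = [col for col in wanted if col in df.columns]
totals = df.groupby("Country")[present].sum()
print(totals)
```

Silently dropping missing names can hide typos, so in some cases letting the KeyError surface is the better choice.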