Sum duplicated rows on a multi-index pandas dataframe

Views: 59 | Published: 2020/5/13 18:35:36 | Tags: python, pandas, dataframe, multi-index


Question

Hello, I'm having trouble with pandas. I'm trying to sum duplicated rows on a multi-index DataFrame. I tried df.groupby(level=[0,1]).sum(), and also df.stack().reset_index().groupby(['year', 'product']).sum() and a few others, but I cannot get it to work. I'd also like every unique product to appear under each year, with a value of 0 where it wasn't listed.

Example: a dataframe with a multi-index and 3 different products (A, B, C):

                  volume1    volume2
year   product
2010   A          10         12
       A          7          3
       B          7          7
2011   A          10         10
       B          7          6
       C          5          5
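
For anyone who wants to reproduce this, the frame above can be built roughly as follows. This is only a sketch: building it with `pd.MultiIndex.from_tuples` is an assumption about how the data is laid out, not the asker's actual code.

```python
import pandas as pd

# Index with a duplicated (2010, "A") pair, matching the table above
index = pd.MultiIndex.from_tuples(
    [(2010, "A"), (2010, "A"), (2010, "B"),
     (2011, "A"), (2011, "B"), (2011, "C")],
    names=["year", "product"],
)
df = pd.DataFrame(
    {"volume1": [10, 7, 7, 10, 7, 5],
     "volume2": [12, 3, 7, 10, 6, 5]},
    index=index,
)

# The index is not unique because (2010, "A") appears twice
print(df.index.is_unique)
```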

Expected output: if a product is duplicated within a given year, its rows are summed. If a product isn't listed for a year, a new row full of 0 is created for it.

                  volume1     volume2
year   product
2010   A          17          15
       B          7           7
       C          0           0
2011   A          10          10
       B          7           6
       C          5           5

Any ideas? Thanks.

Recommended answer

Use sum with unstack and stack:

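# sum per (year, product), spread products into columns with 0 for gaps, then stack back
df = df.groupby(level=[0,1]).sum().unstack(fill_value=0).stack()
# same as the following on pandas < 2.0 (DataFrame.sum(level=...) was removed in 2.0)
#df = df.sum(level=[0,1]).unstack(fill_value=0).stack()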
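
To see why the one-liner works, it may help to split it into its three steps. The sketch below rebuilds the question's data itself; the variable names `summed`, `wide` and `result` are made up for illustration.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [(2010, "A"), (2010, "A"), (2010, "B"),
     (2011, "A"), (2011, "B"), (2011, "C")],
    names=["year", "product"],
)
df = pd.DataFrame(
    {"volume1": [10, 7, 7, 10, 7, 5], "volume2": [12, 3, 7, 10, 6, 5]},
    index=index,
)

# Step 1: collapse duplicated (year, product) pairs by summing them
summed = df.groupby(level=["year", "product"]).sum()

# Step 2: move "product" into the columns; the missing (2010, "C") cells get 0
wide = summed.unstack(fill_value=0)

# Step 3: move "product" back into the index, now with every year/product pair
result = wide.stack()
print(result)
```

The intermediate `wide` frame has one row per year and one column per (volume, product) pair, which is what forces every product to exist for every year.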

Or with reindex:

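import pandas as pd

# sum duplicated (year, product) rows; on pandas < 2.0, df.sum(level=[0,1]) did the same
df = df.groupby(level=[0,1]).sum()
# build every (year, product) combination from the index levels
mux = pd.MultiIndex.from_product(df.index.levels, names = df.index.names)
# combinations missing from df are added as rows filled with 0
df = df.reindex(mux, fill_value=0)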
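
A self-contained version of the reindex approach might look like the sketch below. It rebuilds the question's data, and the `missing` check is an extra added here for illustration, to show which pairs the reindex will create.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [(2010, "A"), (2010, "A"), (2010, "B"),
     (2011, "A"), (2011, "B"), (2011, "C")],
    names=["year", "product"],
)
df = pd.DataFrame(
    {"volume1": [10, 7, 7, 10, 7, 5], "volume2": [12, 3, 7, 10, 6, 5]},
    index=index,
)

# Sum duplicated (year, product) rows
summed = df.groupby(level=["year", "product"]).sum()

# Cartesian product of all years and all products seen in the data
full_index = pd.MultiIndex.from_product(
    summed.index.levels, names=summed.index.names
)

# Pairs that exist in the full grid but not in the data
missing = full_index.difference(summed.index)
print(list(missing))

# Add the missing pairs as rows of zeros
result = summed.reindex(full_index, fill_value=0)
print(result)
```

Compared with unstack/stack, this keeps the frame in long form throughout, which can be easier to follow when there are many value columns.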

Alternative 1, thanks @Wen:

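# unstack() leaves NaN for missing pairs and stack(dropna=False) keeps those rows, so fill them with 0
df = df.groupby(level=[0,1]).sum().unstack().stack(dropna=False).fillna(0).astype(int)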

print (df)
              volume1  volume2
year product                  
2010 A             17       15
     B              7        7
     C              0        0
2011 A             10       10
     B              7        6
     C              5        5
