Pandas, Computing total sum on each MultiIndex sublevel

Question

I would like to compute the total sum on each multi-index sublevel, and then save it in the dataframe.

My current dataframe looks like this:

                    values
    first second
    bar   one     0.106521
          two     1.964873
    baz   one     1.289683
          two    -0.696361
    foo   one    -0.309505
          two     2.890406
    qux   one    -0.758369
          two     1.302628

The desired result is:

                    values
    first second
    bar   one     0.106521
          two     1.964873
          total   2.071394
    baz   one     1.289683
          two    -0.696361
          total   0.593322
    foo   one    -0.309505
          two     2.890406
          total   2.580901
    qux   one    -0.758369
          two     1.302628
          total   0.544259
    total one     0.328331
          two     5.461546
          total   5.789877

So far I have found the following implementation, which works, but I would like to know whether there are better options. I need the fastest solution possible, because when my dataframes become huge the computation takes a very long time.

import numpy as np
import pandas as pd

arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
s = pd.Series(np.random.randn(8), index=index)
df = pd.DataFrame({'values': s})

# For each index level in turn: move it into the columns, add a 'total'
# column holding the row sums, then stack it back into the index.
for col in df.index.names:
    df = df.unstack(col)
    df[('values', 'total')] = df.sum(axis=1)
    df = df.stack()

Recommended answer

Not sure if you are still looking for an answer to this, but you could try something like the following, assuming your current dataframe is assigned to df:

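# 'first' and 'second' are index levels, so turn them into columns before pivoting.
temp = df.reset_index().pivot(index='first', columns='second', values='values')
temp['total'] = temp['one'] + temp['two']
# Note: this adds a 'total' entry within each 'first' group only.
result = temp.stack()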
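
Since the question asks for totals at both levels, a loop-free variant of the same reshape idea may be worth trying. The sketch below is not part of the original answer: it rebuilds the dataframe from the values shown in the question, and the names `wide` and `result` are made up for illustration. It assumes exactly two index levels and that every first/second combination is present. Whether it is actually faster on large data would need to be measured.

```python
import pandas as pd

# Rebuild the example frame from the values listed in the question.
index = pd.MultiIndex.from_product(
    [['bar', 'baz', 'foo', 'qux'], ['one', 'two']],
    names=['first', 'second'],
)
df = pd.DataFrame(
    {'values': [0.106521, 1.964873, 1.289683, -0.696361,
                -0.309505, 2.890406, -0.758369, 1.302628]},
    index=index,
)

# Reshape once: one row per 'first', one column per 'second'.
wide = df['values'].unstack('second')

# Row sums give the 'total' entry inside each 'first' group.
wide['total'] = wide.sum(axis=1)

# Column sums (now including the 'total' column) give the final 'total' rows.
wide.loc['total'] = wide.sum(axis=0)

# Back to the long layout with a single 'values' column.
result = wide.stack().to_frame('values')
print(result)
```

The frame is reshaped only once here, and both sets of totals are taken on the wide layout before stacking back, instead of unstacking and stacking once per index level.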
