Pandas, Computing total sum on each MultiIndex sublevel
Question
I would like to compute the total sum on each multi-index sublevel, and then save it in the dataframe.
My current dataframe is:
                values
first second
bar   one     0.106521
      two     1.964873
baz   one     1.289683
      two    -0.696361
foo   one    -0.309505
      two     2.890406
qux   one    -0.758369
      two     1.302628
The desired result is:
                values
first second
bar   one     0.106521
      two     1.964873
      total   2.071394
baz   one     1.289683
      two    -0.696361
      total   0.593322
foo   one    -0.309505
      two     2.890406
      total   2.580901
qux   one    -0.758369
      two     1.302628
      total   0.544259
total one     0.328331
      two     5.461546
      total   5.789877
Currently I use the following implementation, which works, but I would like to know if there are better options. I need the fastest solution possible, because when my dataframes become huge the computation takes a very long time.
In [1]: from numpy.random import randn
   ...: from pandas import DataFrame, MultiIndex, Series
   ...:
In [2]: arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
   ...:           ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
   ...:
In [3]: tuples = list(zip(*arrays))
In [4]: index = MultiIndex.from_tuples(tuples, names=['first', 'second'])
In [5]: s = Series(randn(8), index=index)
In [6]: d = {'values': s}
In [7]: df = DataFrame(d)
In [8]: for col in df.index.names:
   ...:     df = df.unstack(col)                      # move this index level into the columns
   ...:     df[('values', 'total')] = df.sum(axis=1)  # add a 'total' column for that level
   ...:     df = df.stack()                           # move the level back into the index
   ...:
Recommended answer
Not sure if you are still looking for an answer to this, but you could try something like the following, assuming your current dataframe is assigned to df:
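# 'first' and 'second' are index levels, so turn them into columns before pivoting
temp = df.reset_index().pivot(index='first', columns='second', values='values')
temp['total'] = temp['one'] + temp['two']  # per-'first' total across 'second'
temp.stack()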
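The snippet above hard-codes the 'one' and 'two' labels and only adds a total per 'first' label, so the outer 'total' block from the desired result is missing. Below is a minimal sketch of one way to get both, under some assumptions: the helper name add_totals and its label parameter are hypothetical (not from the original post), the index has exactly two levels, and each (first, second) pair is unique so that unstack works. It has not been benchmarked, so whether it is faster on large frames would need to be measured.

```python
import pandas as pd


def add_totals(df, column="values", label="total"):
    """Return a copy of df[column] with a `label` entry added on both index levels.

    Hypothetical helper: assumes a two-level MultiIndex with unique pairs.
    """
    outer, inner = df.index.names
    # Wide layout: one row per outer label, one column per inner label.
    wide = df[column].unstack(inner)
    # Total across the inner level for every outer label.
    wide[label] = wide.sum(axis=1)
    # Total across the outer level for every inner label; because the new
    # `label` column is included, its sum is the grand total.
    wide.loc[label] = wide.sum(axis=0)
    # Back to the long layout, restoring the original level names.
    out = wide.stack().to_frame(column)
    out.index = out.index.set_names([outer, inner])
    return out


# Sample data copied from the question.
index = pd.MultiIndex.from_product(
    [["bar", "baz", "foo", "qux"], ["one", "two"]], names=["first", "second"]
)
df = pd.DataFrame(
    {"values": [0.106521, 1.964873, 1.289683, -0.696361,
                -0.309505, 2.890406, -0.758369, 1.302628]},
    index=index,
)
result = add_totals(df)
```

Here result is a dataframe indexed by (first, second) in which every 'first' label has an extra 'total' entry, followed by a final 'total' block holding the per-'second' sums and the grand total. Compared with the loop in the question, this reshapes the data once in each direction instead of once per index level.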