Summing up more than two dataframes with the same indexes in Pandas

Views: 90. Posted: 2020/5/24 1:06:48. Tags: python, pandas, dataframe, addition


Question

I want to add the values of 4 dataframes with the same indexes in Pandas. With two dataframes, df1 and df2, we can write:

df1.add(df2)

and with 3 dataframes:

df3.add(df2.add(df1))

I wonder if there is a more general way to do this in Python.

Recommended answer

Option 1
Use sum

sum([df1, df2, df3, df4])
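The built-in sum works here because it starts from the integer 0 and applies + repeatedly; the first step, 0 + df1, is handled by the dataframe's reflected addition, which broadcasts the scalar over every cell. A minimal sketch, assuming small integer frames (the data and the names frames, total and total_from_first are made up for illustration):

```python
import pandas as pd

# Hypothetical sample data: four frames sharing the same index and columns.
df1 = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
df2 = df1 * 10
df3 = df1 * 100
df4 = df1 * 1000
frames = [df1, df2, df3, df4]

# sum() evaluates ((((0 + df1) + df2) + df3) + df4).
total = sum(frames)

# Passing the first frame as the start value skips the initial 0 + df1 step.
total_from_first = sum(frames[1:], frames[0])

print(total)
```

Both forms should give the same frame; the second one only avoids one scalar addition.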

Option 2
Use reduce

from functools import reduce

reduce(pd.DataFrame.add, [df1, df2, df3, df4])
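One reason to prefer reduce with DataFrame.add over plain + is that add accepts a fill_value argument. If the indexes turn out not to match exactly, + gives NaN wherever a label is missing on one side, while fill_value=0 treats the missing entry as zero. A sketch under that assumption, with made-up frames and a hypothetical helper named add_filled:

```python
from functools import reduce

import pandas as pd

# Hypothetical frames whose indexes only partly overlap.
df1 = pd.DataFrame({"a": [1.0, 2.0]}, index=["x", "y"])
df2 = pd.DataFrame({"a": [10.0, 20.0]}, index=["y", "z"])
df3 = pd.DataFrame({"a": [100.0, 200.0]}, index=["x", "z"])


def add_filled(left, right):
    # A label missing from one side counts as 0 instead of producing NaN.
    return left.add(right, fill_value=0)


total = reduce(add_filled, [df1, df2, df3])
print(total)
```

When the indexes really are identical, as in the question, this gives the same result as the plain reduce call above.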

Option 3
Use pd.concat and pd.DataFrame.sum with level=1
This only works if the dataframe indices have a single level. We'd have to get a little more cute to make it work otherwise. I recommend the other options.

pd.concat(dict(enumerate([df1, df2, df3, df4]))).sum(level=1)
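Note that the level argument of DataFrame.sum was deprecated in pandas 1.3 and removed in pandas 2.0, so the line above fails on current releases. A sketch of the same idea written with groupby(level=1), using made-up integer frames (the names frames, stacked and total are illustrative only):

```python
import pandas as pd

# Hypothetical sample data: four frames sharing the same index and columns.
df1 = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
frames = [df1, df1 * 10, df1 * 100, df1 * 1000]

# Concatenating a dict adds an outer index level (0..3) that names the
# source frame; level 1 is then the original row index.
stacked = pd.concat(dict(enumerate(frames)))

# Group rows by their original index label and add them up.
total = stacked.groupby(level=1).sum()
print(total)
```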

Setup

import pandas as pd

df = pd.DataFrame([[1, -1], [complex(0, 1), complex(0, -1)]])
df1, df2, df3, df4 = [df] * 4

Demo

sum([df1, df2, df3, df4])

        0        1
0  (4+0j)  (-4+0j)
1      4j      -4j

from functools import reduce

reduce(pd.DataFrame.add, [df1, df2, df3, df4])

        0        1
0  (4+0j)  (-4+0j)
1      4j      -4j

pd.concat(dict(enumerate([df1, df2, df3, df4]))).sum(level=1)

        0        1
0  (4+0j)  (-4+0j)
1      4j      -4j

Timing

Small data

%timeit sum([df1, df2, df3, df4])
%timeit reduce(pd.DataFrame.add, [df1, df2, df3, df4])
%timeit pd.concat(dict(enumerate([df1, df2, df3, df4]))).sum(level=1)

1000 loops, best of 3: 591 µs per loop
1000 loops, best of 3: 456 µs per loop
100 loops, best of 3: 3.61 ms per loop

Large data

df = pd.DataFrame([[1, -1], [complex(0, 1), complex(0, -1)]])
df = pd.concat([df] * 1000, ignore_index=True)
df = pd.concat([df] * 100, axis=1, ignore_index=True)
df1, df2, df3, df4 = [df] * 4

%timeit sum([df1, df2, df3, df4])
%timeit reduce(pd.DataFrame.add, [df1, df2, df3, df4])
%timeit pd.concat(dict(enumerate([df1, df2, df3, df4]))).sum(level=1)

100 loops, best of 3: 3.94 ms per loop
100 loops, best of 3: 2.9 ms per loop
1 loop, best of 3: 1min per loop
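%timeit is an IPython magic. Outside IPython, the standard timeit module can make a similar comparison. A rough sketch, assuming made-up float data rather than the complex frame above, and covering only the first two options; no expected numbers are given because timings depend on the machine and the pandas version:

```python
import timeit
from functools import reduce

import numpy as np
import pandas as pd

# Hypothetical test data: a 2000 x 200 frame of floats, reused four times.
df = pd.DataFrame(np.ones((2000, 200)))
frames = [df] * 4

candidates = {
    "sum": lambda: sum(frames),
    "reduce": lambda: reduce(pd.DataFrame.add, frames),
}

timings = {}
for name, func in candidates.items():
    # Best of 3 repeats, 10 calls per repeat, reported per call.
    best = min(timeit.repeat(func, number=10, repeat=3)) / 10
    timings[name] = best
    print(f"{name}: {best * 1e3:.3f} ms per call")
```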
