Pandas new column from groupby averages

Views: 56 Published: 2021/12/3 9:22:42 Tags: python pandas dataframe

Problem description

I have a DataFrame

>>> import pandas as pd
>>> df = pd.DataFrame({'a':[1,1,1,2,2,2],
...                    'b':[10,20,20,10,20,20],
...                    'result':[100,200,300,400,500,600]})
>>> df
   a   b  result
0  1  10     100
1  1  20     200
2  1  20     300
3  2  10     400
4  2  20     500
5  2  20     600

and want to create a new column holding the average of 'result' for each combination of 'a' and 'b'. I can get those values with a groupby:

>>> df.groupby(['a','b'])['result'].mean()
a  b 
1  10    100
   20    250
2  10    400
   20    550
Name: result, dtype: int64

but cannot figure out how to turn that into a new column in the original DataFrame. The final result should look like this:

>>> df
   a   b  result  avg_result
0  1  10     100         100
1  1  20     200         250
2  1  20     300         250
3  2  10     400         400
4  2  20     500         550
5  2  20     600         550

I could do this by looping through the combinations of 'a' and 'b', but that would get really slow and unwieldy for larger sets of data. There is probably a much simpler and faster way to do it.

Solution

You need transform:

df['avg_result'] = df.groupby(['a','b'])['result'].transform('mean')

This returns a column aligned to the original DataFrame's index, with each row holding the mean of its group:

   a   b  result  avg_result
0  1  10     100         100
1  1  20     200         250
2  1  20     300         250
3  2  10     400         400
4  2  20     500         550
5  2  20     600         550
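
For comparison, here is a minimal self-contained sketch of the transform approach next to one possible alternative: aggregating to one row per group and merging back on the keys. The names `group_means`, `merged` and `avg_result_merged` are illustrative only and are not part of the original answer. Recent pandas versions may print the means as floats (for example 100.0) rather than integers.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 1, 2, 2, 2],
                   'b': [10, 20, 20, 10, 20, 20],
                   'result': [100, 200, 300, 400, 500, 600]})

# transform('mean') returns one value per original row, aligned to df.index,
# so it can be assigned directly as a new column.
df['avg_result'] = df.groupby(['a', 'b'])['result'].transform('mean')

# Alternative sketch: compute one mean per (a, b) group, then merge the
# aggregated table back onto the original rows using the group keys.
group_means = (df.groupby(['a', 'b'], as_index=False)['result']
                 .mean()
                 .rename(columns={'result': 'avg_result_merged'}))
merged = df.merge(group_means, on=['a', 'b'], how='left')

print(merged)
```

Both columns should hold the same values. The transform version is shorter and keeps the original index, while the merge version can be handy when the per-group table is needed on its own as well.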
