How do I exclude a single row from groupby().transform() in pandas?


Question

My intended goal was to perform a simple groupby, then store the group averages as a new column using .transform('mean'). Then things got complicated. The catch is that what I really want is an average of n-1 values, where 'n' is the number of rows belonging to each group. Example data, where the column RESULT is my desired output:

import pandas as pd

list_of_tuples = [('A', 3, 4.5),
                  ('A', 2, 4.75),
                  ('A', 5, 4),
                  ('A', 4, 4.25),
                  ('A', 7, 3.5),
                  ('B', 6, 6.75),
                  ('B', 9, 6),
                  ('B', 8, 6.25),
                  ('B', 4, 7.25),
                  ('B', 6, 6.75)]

df = pd.DataFrame.from_records(data=list_of_tuples, columns=['ID', 'VALUE', 'RESULT'])

>>> df
  ID  VALUE  RESULT
0  A      3    4.50
1  A      2    4.75
2  A      5    4.00
3  A      4    4.25
4  A      7    3.50
5  B      6    6.75
6  B      9    6.00
7  B      8    6.25
8  B      4    7.25
9  B      6    6.75

You can see that in the first row the value of RESULT is the average of [2, 5, 4, 7], which is 4.5. Likewise, the value of RESULT for the last row is the average of [6, 9, 8, 4], which is 6.75.

So for each row the value of RESULT should be the group average (grouping on ID) of VALUE excluding the number in VALUE for that particular row.

Recommended answer

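Got the answer from my comment above. The group mean multiplied by the group count gives the group sum, so subtracting the row's own value from that sum and dividing by n-1 yields the mean of the other rows in the group.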

import pandas as pd

list_of_tuples = [('A', 3, 4.5),
                  ('A', 2, 4.75),
                  ('A', 5, 4),
                  ('A', 4, 4.25),
                  ('A', 7, 3.5),
                  ('B', 6, 6.75),
                  ('B', 9, 6),
                  ('B', 8, 6.25),
                  ('B', 4, 7.25),
                  ('B', 6, 6.75)]

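# No column names are given, so the columns are labelled 0 (ID), 1 (VALUE), 2 (RESULT)
df = pd.DataFrame(list_of_tuples)

# Drop the expected-output column so it can be recomputed
df.drop(2, axis=1, inplace=True)

# Group size and group mean of VALUE, broadcast back to every row
n = df.groupby(0)[1].transform('count')
m = df.groupby(0)[1].transform('mean')

# m*n is the group sum; remove the row's own value and average over the remaining n-1 rows
df['result'] = (m * n - df[1]) / (n - 1)

df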
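
The same idea can be written with named columns and transform('sum'), which avoids rebuilding the sum from the mean and the count. The following is a minimal sketch rather than the original answer's code; the names LOO_MEAN, group_sum, and group_size are illustrative choices and do not appear in the question.

```python
import pandas as pd

# Same sample data as in the question; RESULT holds the expected values.
df = pd.DataFrame(
    [('A', 3, 4.5), ('A', 2, 4.75), ('A', 5, 4), ('A', 4, 4.25), ('A', 7, 3.5),
     ('B', 6, 6.75), ('B', 9, 6), ('B', 8, 6.25), ('B', 4, 7.25), ('B', 6, 6.75)],
    columns=['ID', 'VALUE', 'RESULT'],
)

grouped = df.groupby('ID')['VALUE']
group_sum = grouped.transform('sum')      # sum of VALUE within each row's group
group_size = grouped.transform('count')   # number of rows in each row's group

# Leave-one-out mean: remove the row's own value from the sum, divide by n - 1.
# A group with a single row gives 0 / 0, which pandas evaluates to NaN.
df['LOO_MEAN'] = (group_sum - df['VALUE']) / (group_size - 1)

# Compare the computed column with the expected column from the question.
matches = bool((df['LOO_MEAN'] == df['RESULT']).all())
print(df)
print(matches)
```

Note that a group containing only one row has no other rows to average, so its result is NaN; whether to keep that or fill it with another value depends on the use case.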
