Replace a column's values with its group means in a DataFrame
Question
I have a DataFrame:
Page Line y
1 2 3.2
1 2 6.1
1 3 7.1
2 4 8.5
2 4 9.1
I need to replace the values in column y with the mean of their group. When grouping by a single column, I can do that with this code:
df['y'] = df['y'].groupby(df['Page'], group_keys=False).transform('mean')
What I am trying to do is replace the values of y using groups defined by both Page and Line, like this:
Page Line y
1 2 4.65
1 2 4.65
1 3 7.1
2 4 8.8
2 4 8.8
I have searched through a lot of answers on this site but couldn't find this use case. I am using Python 3 with pandas.
Recommended answer
You need to pass a list of column names to the by parameter of groupby:
by : mapping, function, label, or list of labels
Used to determine the groups for the groupby. If by is a function, it's called on each value of the object's index. If a dict or Series is passed, the Series or dict VALUES will be used to determine the groups (the Series' values are first aligned; see .align() method). If an ndarray is passed, the values are used as-is to determine the groups. A label or list of labels may be passed to group by the columns in self. Notice that a tuple is interpreted as a (single) key.
df['y'] = df.groupby(['Page', 'Line'])['y'].transform('mean')
print(df)
Page Line y
0 1 2 4.65
1 1 2 4.65
2 1 3 7.10
3 2 4 8.80
4 2 4 8.80
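For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the sample data in the question; the integer and float column types are an assumption, since the question only shows the printed table.

```python
import pandas as pd

# Sample data copied from the question (column dtypes are assumed).
df = pd.DataFrame({
    "Page": [1, 1, 1, 2, 2],
    "Line": [2, 2, 3, 4, 4],
    "y": [3.2, 6.1, 7.1, 8.5, 9.1],
})

# transform('mean') returns one value per original row, aligned to df's
# index, so the result can be assigned straight back to the column.
df["y"] = df.groupby(["Page", "Line"])["y"].transform("mean")
print(df)
```

Because transform keeps the original row count and index, no merge or reindexing step is needed after the groupby.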
Your own solution also works if you change it to this syntactic sugar and pass the Series in a list:
df['y'] = df['y'].groupby([df['Page'], df['Line']]).transform('mean')
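As a quick sanity check, the sketch below compares the two spellings on the sample data from the question. The variable names by_labels, by_series, and same are only illustrative and do not appear in the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Page": [1, 1, 1, 2, 2],
    "Line": [2, 2, 3, 4, 4],
    "y": [3.2, 6.1, 7.1, 8.5, 9.1],
})

# Group the DataFrame by a list of column labels, then select y.
by_labels = df.groupby(["Page", "Line"])["y"].transform("mean")

# Group the y Series directly by a list of key Series.
by_series = df["y"].groupby([df["Page"], df["Line"]]).transform("mean")

# Both results are Series aligned to df's index, so they can be compared.
same = by_labels.equals(by_series)
print(same)
```

The label-based form is usually shorter to read, while the Series-based form can be handy when the grouping keys do not live in the same DataFrame as the values.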