Calculate group mean (or other summary stats) and assign to original data
Question
I want to calculate mean (or any other summary statistic of length one, e.g. min, max, length, sum) of a numeric variable ("value") within each level of a grouping variable ("group").
The summary statistic should be assigned to a new variable which has the same length as the original data. That is, each row of the original data should have a value corresponding to the current group value; the data set should not be collapsed to one row per group. For example, consider group mean:
Before
id  group  value
1   a      10
2   a      20
3   b      100
4   b      200
After
id  group  value  grp.mean.values
1   a      10     15
2   a      20     15
3   b      100    150
4   b      200    150
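To try the code in the answer, the "before" table can be recreated as a data frame. This is only a sketch: the object name df is an assumption chosen to match the name used in the answer, and the column types (integer id, character group, numeric value) are guesses from the printed table.

```r
# Recreate the "before" table. The name `df` and the column types are
# assumptions; the original post only shows the printed values.
df <- data.frame(
  id    = 1:4,
  group = c("a", "a", "b", "b"),
  value = c(10, 20, 100, 200)
)

df
```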
Recommended answer
You may do this in dplyr using mutate:
library(dplyr)

# mutate() on grouped data keeps every row and repeats the group mean within each group
df %>%
  group_by(group) %>%
  mutate(grp.mean.values = mean(value))
...or use data.table to assign the new column by reference (:=):
library(data.table)

# setDT() converts df to a data.table in place; := then adds the column without copying
setDT(df)[ , grp.mean.values := mean(value), by = group]
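If neither package is available, base R's ave() appears to cover the same case. The sketch below is not part of the original answer; it rebuilds the example data under the assumed name df, and the extra column names grp.max.values and grp.n are made up for illustration.

```r
# Example data from the question; the name `df` is an assumption.
df <- data.frame(
  id    = 1:4,
  group = c("a", "a", "b", "b"),
  value = c(10, 20, 100, 200)
)

# ave() applies FUN within each level of the grouping variable and returns
# a vector as long as the input, so no rows are collapsed.
# FUN defaults to mean.
df$grp.mean.values <- ave(df$value, df$group)

# Other length-one summaries go through the named FUN argument.
# These two column names are hypothetical.
df$grp.max.values <- ave(df$value, df$group, FUN = max)
df$grp.n          <- ave(df$value, df$group, FUN = length)

df
```

FUN has to be passed by name because it follows the `...` grouping arguments in ave()'s signature; an unnamed function would be treated as another grouping variable.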