Calculate group mean, sum, or other summary stats, and assign the column to the original data
Question
I want to calculate the mean (or any other summary statistic of length one, e.g. min, max, length, sum) of a numeric variable ("value") within each level of a grouping variable ("group").
The summary statistic should be assigned to a new variable that has the same length as the original data. That is, each row of the original data should get the value computed for its own group; the data set should not be collapsed to one row per group. For example, consider the group mean:
Before
id  group  value
1   a      10
2   a      20
3   b      100
4   b      200
After
id  group  value  grp.mean.values
1   a      10     15
2   a      20     15
3   b      100    150
4   b      200    150
Recommended answer
You can do this in dplyr using mutate:
library(dplyr)
df %>%
  group_by(group) %>%
  mutate(grp.mean.values = mean(value))
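The same pattern extends to the other length-one statistics mentioned in the question. The following is a minimal, self-contained sketch rather than part of the original answer: the data frame reproduces the "Before" table, and the column names grp.sum.values and grp.n, as well as the result name res, are illustrative assumptions.

```r
library(dplyr)

# Sample data matching the "Before" table in the question
df <- data.frame(
  id    = 1:4,
  group = c("a", "a", "b", "b"),
  value = c(10, 20, 100, 200)
)

res <- df %>%
  group_by(group) %>%
  mutate(
    grp.mean.values = mean(value),  # group mean, repeated on every row of the group
    grp.sum.values  = sum(value),   # hypothetical extra column: group sum
    grp.n           = n()           # hypothetical extra column: group size
  ) %>%
  ungroup()                         # drop the grouping so later steps are not grouped

print(res)
```

Because mutate keeps every row, the result still has four rows. Calling ungroup() at the end is optional here, but it avoids surprises if further operations follow.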
Alternatively, use data.table to assign the new column by reference (:=):
library(data.table)
setDT(df)[ , grp.mean.values := mean(value), by = group]
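If adding a package is not an option, base R's ave function may serve the same purpose. This is a sketch of an alternative and is not taken from the answer above; the sample data reproduces the "Before" table, and grp.sum.values is an illustrative column name.

```r
# Sample data matching the "Before" table in the question
df <- data.frame(
  id    = 1:4,
  group = c("a", "a", "b", "b"),
  value = c(10, 20, 100, 200)
)

# ave() applies FUN within each group and returns a vector as long as the input;
# FUN defaults to mean when it is not supplied
df$grp.mean.values <- ave(df$value, df$group)

# Any other length-one statistic can be passed via the named FUN argument
df$grp.sum.values <- ave(df$value, df$group, FUN = sum)

print(df)
```

Note that FUN must be passed by name, since unnamed arguments after the first are treated as grouping variables.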