How to create summaries of subgroups based on factors in R


Problem description

I want to calculate the mean of each numeric variable in the following example. The means need to be grouped by each level of "id" and, separately, by each level of "status".

set.seed(10)
dfex <- data.frame(
  id = c("2", "1", "1", "1", "3", "2", "3"),
  status = c("hit", "miss", "miss", "hit", "miss", "miss", "miss"),
  var3 = rnorm(7), var4 = rnorm(7), var5 = rnorm(7), var6 = rnorm(7)
)

For the means of the "id" groups, the first row of output would be labeled "mean-id-1", followed by rows labeled "mean-id-2" and "mean-id-3". For the means of the "status" groups, the rows would be labeled "mean-status-miss" and "mean-status-hit". My objective is to generate these means and their row labels programmatically.

I've tried many different permutations of the apply functions, but each has issues. I've also experimented with the aggregate function.

Recommended answer

With base R, the following works for the "id" column:

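# Average every column whose name contains "var", grouped by id
means_id <- aggregate(dfex[,grep("var",names(dfex))],list(dfex$id),mean)
# Build the row labels from the grouping column, then drop that column
rownames(means_id) <- paste0("mean-id-",means_id$Group.1)
means_id$Group.1 <- NULL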

Output:

                var3       var4       var5       var6
mean-id-1 -0.7182503 -0.2604572 -0.3535823 -1.3530417
mean-id-2  0.2042702 -0.3009548  0.6121843 -1.4364211
mean-id-3 -0.4567655  0.8716131  0.1646053 -0.6229102

The same approach works for the "status" column:

means_status <- aggregate(dfex[,grep("var",names(dfex))],list(dfex$status),mean)
rownames(means_status) <- paste0("mean-status-",means_status$Group.1)
means_status$Group.1 <- NULL
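
Since the two snippets differ only in the grouping column, they can be folded into one function and the results stacked into a single table. The following is a minimal sketch along those lines, not part of the original answer: `group_means` is a hypothetical helper name introduced here for illustration, and the data frame is recreated from the question so the block runs on its own.

```r
# Recreate the example data from the question.
set.seed(10)
dfex <- data.frame(
  id = c("2", "1", "1", "1", "3", "2", "3"),
  status = c("hit", "miss", "miss", "hit", "miss", "miss", "miss"),
  var3 = rnorm(7), var4 = rnorm(7), var5 = rnorm(7), var6 = rnorm(7)
)

# group_means() is a hypothetical helper (not a base R function).
# It averages value_cols within each level of group_col and labels
# the rows "mean-<group_col>-<level>".
group_means <- function(df, group_col, value_cols) {
  out <- aggregate(df[, value_cols, drop = FALSE], list(df[[group_col]]), mean)
  rownames(out) <- paste0("mean-", group_col, "-", out$Group.1)
  out$Group.1 <- NULL
  out
}

# Names of the numeric columns to summarise.
value_cols <- grep("^var", names(dfex), value = TRUE)

# Apply the helper to each grouping column and stack the results.
all_means <- do.call(
  rbind,
  lapply(c("id", "status"), function(g) group_means(dfex, g, value_cols))
)
print(all_means)
```

Note that aggregate returns the groups in sorted order, so the status rows come out as "mean-status-hit" followed by "mean-status-miss". If the order given in the question matters, the rows can be reordered by name afterwards.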
