Generate table based on aggregate values in R

Question

I have a data frame in the following format, and I want to get a table based on an aggregate value:

VALUE   Time1   Time2
   1    NN  NF
   2    FF  FF
   7    NF  FF
   4    NN  NN
   3    NN  FF
   3    NF  NF
   5    NF  NF
   6    FF  FF

I can create a simple table of counts using the table() function:

 table(Time1,Time2)

which gives the following output:

       FF  FN  NF  NN  Total
FF      2   0   0   0      2
FN      0   0   0   0      0
NF      1   0   2   0      3
NN      1   0   1   1      3
Total   4   0   3   1      8

I want the above data frame to be cross-tabulated based on the sum of the VALUE column. I can do that in Excel using the SUMIF function and get the following output:

       FF  FN  NF  NN  Total
FF      8   0   0   0      8
FN      0   0   0   0      0
NF      7   0   8   0     15
NN      3   0   1   4      8
Total  18   0   9   4     31

Could anyone help with this?

Recommended answer

For cases of sum, you can just use xtabs. Here, I've wrapped it in addmargins to get the totals too:

addmargins(xtabs(VALUE ~ Time1 + Time2, mydf))
#      Time2
# Time1 FF NF NN Sum
#   FF   8  0  0   8
#   NF   7  8  0  15
#   NN   3  1  4   8
#   Sum 18  9  4  31
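The call above relies on a data frame named mydf, which the question never defines. A minimal sketch for reproducing the result, assuming the columns are named as in the question's listing and the two Time columns are plain character vectors (the construction of mydf here is an assumption, not part of the original answer):

```r
# Assumed reconstruction of the question's data; only the name mydf
# comes from the answer.
mydf <- data.frame(
  VALUE = c(1, 2, 7, 4, 3, 3, 5, 6),
  Time1 = c("NN", "FF", "NF", "NN", "NN", "NF", "NF", "FF"),
  Time2 = c("NF", "FF", "FF", "NN", "FF", "NF", "NF", "FF"),
  stringsAsFactors = FALSE
)

# Sum VALUE within each Time1 x Time2 cell
sums <- xtabs(VALUE ~ Time1 + Time2, data = mydf)

# Append row and column totals (labelled "Sum" by default)
sums_with_totals <- addmargins(sums)
print(sums_with_totals)
```

With a left-hand side in the formula, xtabs sums that variable per cell instead of counting rows, which is what makes it behave like Excel's SUMIF here.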

More generally, you might want to look at dcast from the reshape2 package:

library(reshape2)
dcast(mydf, Time1 ~ Time2, value.var="VALUE", fun.aggregate=sum, margins=TRUE)
#   Time1 FF NF NN (all)
# 1    FF  8  0  0     8
# 2    NF  7  8  0    15
# 3    NN  3  1  4     8
# 4 (all) 18  9  4    31
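If installing reshape2 is not an option, base R's tapply can also apply an arbitrary aggregation function per cell. This is a swapped-in alternative rather than the answer's method, sketched under the assumption that mydf is built as shown (the data frame construction is illustrative):

```r
# Assumed reconstruction of the question's data
mydf <- data.frame(
  VALUE = c(1, 2, 7, 4, 3, 3, 5, 6),
  Time1 = c("NN", "FF", "NF", "NN", "NN", "NF", "NF", "FF"),
  Time2 = c("NF", "FF", "FF", "NN", "FF", "NF", "NF", "FF"),
  stringsAsFactors = FALSE
)

# Mean of VALUE per Time1 x Time2 cell; combinations with no rows are NA
cell_means <- tapply(mydf$VALUE, mydf[c("Time1", "Time2")], mean)
print(cell_means)

# The same pattern with sum; empty cells are NA, so set them to 0
# to match the xtabs result
cell_sums <- tapply(mydf$VALUE, mydf[c("Time1", "Time2")], sum)
cell_sums[is.na(cell_sums)] <- 0
print(cell_sums)
```

Unlike dcast with margins=TRUE, tapply does not add totals; wrapping cell_sums in addmargins would supply them for the sum case.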

To address @SimonO101's question: if the data are correctly factored, then all levels will show by default with the xtabs approach. However, you will need to specify drop = FALSE with the dcast version.

Taking the above data (which, as it stands, does not contain a "Time1" or "Time2" value of "FN"), let's convert both of those columns to factors and see how that changes the output:

mydf[-1] <- lapply(mydf[-1], function(x) factor(x, c("FF", "FN", "NF", "NN")))
addmargins(xtabs(VALUE ~ Time1 + Time2, mydf))
#      Time2
# Time1 FF FN NF NN Sum
#   FF   8  0  0  0   8
#   FN   0  0  0  0   0
#   NF   7  0  8  0  15
#   NN   3  0  1  4   8
#   Sum 18  0  9  4  31
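The FN row and column appear because table() and xtabs() tabulate over a factor's levels, not just over the values that happen to occur. A small sketch of that behaviour on the Time1 column alone (the vector names time1 and time1_f are illustrative):

```r
# Time1 values from the question, as a plain character vector
time1 <- c("NN", "FF", "NF", "NN", "NN", "NF", "NF", "FF")

# A character vector only tabulates the values that are present
observed <- table(time1)
print(observed)

# With explicit levels, the absent category "FN" is kept with a count of 0
time1_f <- factor(time1, levels = c("FF", "FN", "NF", "NN"))
counts <- table(time1_f)
print(counts)
```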

As mentioned, the dcast equivalent would be:

dcast(mydf, Time1 ~ Time2, value.var="VALUE", 
      fun.aggregate=sum, margins=TRUE, drop=FALSE)
