Generate table based on aggregate values in R
Question
I have a data frame in the following format, and I want to build a table based on an aggregate value:
VALUE Time1 Time2
1 NN NF
2 FF FF
7 NF FF
4 NN NN
3 NN FF
3 NF NF
5 NF NF
6 FF FF
I can create a simple table using the table() function:
table(Time1,Time2)
which gives the following output (counts, with totals added):
      FF FN NF NN Total
FF     2  0  0  0     2
FN     0  0  0  0     0
NF     1  0  2  0     3
NN     1  0  1  1     3
Total  4  0  3  1     8
I want the above data frame to be cross-tabulated based on the sum of the VALUE column. I can do that in Excel using the SUMIF function and get the following output:
      FF FN NF NN Total
FF     8  0  0  0     8
FN     0  0  0  0     0
NF     7  0  8  0    15
NN     3  0  1  4     8
Total 18  0  9  4    31
Any help with this would be appreciated.
Recommended answer
For cases of sum, you can just use xtabs. Here, I've wrapped it in addmargins to get the totals too:
addmargins(xtabs(VALUE ~ Time1 + Time2, mydf))
#      Time2
# Time1 FF NF NN Sum
#   FF   8  0  0   8
#   NF   7  8  0  15
#   NN   3  1  4   8
#   Sum 18  9  4  31
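The answer's code assumes the question's data already lives in a data frame called mydf, which the question never shows being created. A minimal sketch for reproducing the result, assuming the eight rows are typed in by hand exactly as listed in the question:

```r
# Rebuild the question's data by hand; the name `mydf` matches the answer's code.
mydf <- data.frame(
  VALUE = c(1, 2, 7, 4, 3, 3, 5, 6),
  Time1 = c("NN", "FF", "NF", "NN", "NN", "NF", "NF", "FF"),
  Time2 = c("NF", "FF", "FF", "NN", "FF", "NF", "NF", "FF")
)

# Sum VALUE within each Time1 x Time2 cell, then append row and column totals.
tab <- addmargins(xtabs(VALUE ~ Time1 + Time2, data = mydf))
print(tab)
```

Individual cells can then be read by name, for example tab["NF", "FF"] for the sum of VALUE where Time1 is NF and Time2 is FF.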
More generally, you might want to look at dcast from reshape2:
library(reshape2)
dcast(mydf, Time1 ~ Time2, value.var="VALUE", fun.aggregate=sum, margins=TRUE)
#   Time1 FF NF NN (all)
# 1    FF  8  0  0     8
# 2    NF  7  8  0    15
# 3    NN  3  1  4     8
# 4 (all) 18  9  4    31
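If reshape2 is not installed, a similar "any summary function" cross-tabulation can be sketched in base R with tapply. This is a swapped-in alternative, not the answer's method, and it again assumes the question's data has been rebuilt by hand as mydf. Here the mean is used in place of the sum purely for illustration:

```r
# Rebuild the question's data by hand (assumed name: mydf).
mydf <- data.frame(
  VALUE = c(1, 2, 7, 4, 3, 3, 5, 6),
  Time1 = c("NN", "FF", "NF", "NN", "NN", "NF", "NF", "FF"),
  Time2 = c("NF", "FF", "FF", "NN", "FF", "NF", "NF", "FF")
)

# Apply an arbitrary function (here: mean) to VALUE within each Time1 x Time2 cell.
# Combinations that never occur in the data come back as NA rather than 0.
means <- tapply(mydf$VALUE, mydf[c("Time1", "Time2")], mean)
print(means)
```

Unlike xtabs, this does not fill empty cells with zero, so any totals would need to be computed with na.rm = TRUE.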
To address @SimonO101's question: if the data are correctly factored, then all levels will show by default with the xtabs approach. However, you will need to specify drop = FALSE with the dcast version.
Taking the above data (which, as it is, does not contain a "Time1" or "Time2" of "FN"), let's factor both those columns and see how that changes the output:
mydf[-1] <- lapply(mydf[-1], function(x) factor(x, c("FF", "FN", "NF", "NN")))
addmargins(xtabs(VALUE ~ Time1 + Time2, mydf))
#      Time2
# Time1 FF FN NF NN Sum
#   FF   8  0  0  0   8
#   FN   0  0  0  0   0
#   NF   7  0  8  0  15
#   NN   3  0  1  4   8
#   Sum 18  0  9  4  31
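The effect of the factor levels can be checked directly. The sketch below assumes the question's data has been rebuilt by hand as mydf, and compares the default xtabs behaviour with its drop.unused.levels argument, which removes levels such as FN that never occur:

```r
# Rebuild the question's data by hand (assumed name: mydf).
mydf <- data.frame(
  VALUE = c(1, 2, 7, 4, 3, 3, 5, 6),
  Time1 = c("NN", "FF", "NF", "NN", "NN", "NF", "NF", "FF"),
  Time2 = c("NF", "FF", "FF", "NN", "FF", "NF", "NF", "FF")
)

# Give both Time columns the full set of four levels, including the unused "FN".
lv <- c("FF", "FN", "NF", "NN")
mydf[-1] <- lapply(mydf[-1], factor, levels = lv)

# Default: every declared level gets a row and a column, even if all zeros.
full <- xtabs(VALUE ~ Time1 + Time2, data = mydf)

# Optional: drop levels that never appear in the data.
compact <- xtabs(VALUE ~ Time1 + Time2, data = mydf, drop.unused.levels = TRUE)

print(dim(full))
print(dim(compact))
```

Keeping the unused level is usually what you want when the table has to line up with a fixed layout, such as the Excel output in the question.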
As mentioned, the dcast equivalent would be:
dcast(mydf, Time1 ~ Time2, value.var="VALUE",
fun.aggregate=sum, margins=TRUE, drop=FALSE)