Simple Table with dplyr on Sequence Data
Question
I would like to make a simple table with dplyr and summarise, but I can't figure out how (even though it should be quite simple).
I have a matrix of sequences. When I simply tabulate it with table(dta), I get the result I want:
dta
acquaintance alone child notnotnot nuclear
1 2 17 19 131
nuclear and acquaintance nuclear and acquaintance nuclear and acquaintance nuclear and acquaintance partner
1 1 1 35 2
I don't know how to do the same thing with summarise. Any suggestions?
dta = structure(c("nuclear", "nuclear", "child", "child", "child",
"acquaintance", "nuclear and acquaintance", "nuclear and acquaintance",
"notnotnot", "nuclear", "nuclear", "nuclear", "child", "child",
"child", "alone", "nuclear and acquaintance", "nuclear and acquaintance",
"notnotnot", "nuclear", "nuclear", "child", "child", "child",
"child", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance",
"notnotnot", "nuclear", "nuclear", "child", "child", "child",
"nuclear", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance",
"notnotnot", "nuclear", "nuclear", "nuclear", "child", "child",
"nuclear", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance",
"notnotnot", "nuclear", "nuclear", "nuclear", "nuclear", "nuclear",
"nuclear", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance",
"notnotnot", "nuclear", "nuclear", "nuclear", "nuclear", "nuclear",
"nuclear", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance",
"notnotnot", "nuclear", "nuclear", "nuclear", "nuclear", "nuclear",
"nuclear", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance",
"notnotnot", "nuclear", "nuclear", "nuclear", "nuclear", "nuclear",
"nuclear", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance",
"notnotnot", "nuclear", "nuclear", "nuclear", "nuclear", "nuclear",
"nuclear", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance",
"notnotnot", "nuclear", "nuclear", "nuclear", "nuclear", "nuclear",
"nuclear", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance",
"notnotnot", "nuclear", "nuclear", "nuclear", "nuclear", "nuclear",
"nuclear", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance",
"partner", "nuclear", "nuclear", "nuclear", "nuclear", "nuclear",
"nuclear", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance",
"partner", "nuclear", "nuclear", "nuclear", "nuclear", "nuclear and acquaintance",
"nuclear", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance",
"notnotnot", "nuclear", "nuclear", "nuclear", "nuclear", "nuclear and acquaintance",
"nuclear", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance",
"notnotnot", "nuclear", "nuclear", "nuclear", "nuclear", "nuclear and acquaintance",
"nuclear", "nuclear", "nuclear", "nuclear", "notnotnot", "nuclear",
"nuclear", "nuclear", "nuclear", "nuclear and acquaintance",
"nuclear", "nuclear", "nuclear", "nuclear", "notnotnot", "nuclear",
"nuclear", "nuclear", "nuclear", "nuclear and acquaintance",
"nuclear", "nuclear", "nuclear", "nuclear", "notnotnot", "nuclear",
"nuclear", "nuclear", "nuclear", "nuclear and acquaintance",
"nuclear", "nuclear", "nuclear", "nuclear", "notnotnot", "nuclear",
"nuclear", "nuclear", "nuclear", "nuclear and acquaintance",
"nuclear", "nuclear", "child", "nuclear", "notnotnot", "nuclear",
"nuclear", "nuclear", "nuclear", "nuclear and acquaintance",
"nuclear", "nuclear", "child", "alone", "notnotnot", "nuclear"
), .Dim = c(10L, 21L), .Dimnames = list(c("1", "2", "3", "4",
"5", "6", "7", "8", "9", "10"), c("12:10", "12:20", "12:30",
"12:40", "12:50", "13:00", "13:10", "13:20", "13:30", "13:40",
"13:50", "14:00", "14:10", "14:20", "14:30", "14:40", "14:50",
"15:00", "15:10", "15:20", "15:30")))
Recommended answer
You just have to convert your data to a data.frame to use dplyr, and then you can easily get your desired output:
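library(dplyr)
# ungrouped: flatten the matrix into a single column, then count each value
# (tibble() and group_by() replace the deprecated data_frame() and group_by_())
tibble(var = c(dta)) %>%
  group_by(var) %>%
  summarise(n())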
## var n()
## 1 acquaintance 1
## 2 alone 2
## 3 child 17
## 4 notnotnot 19
## 5 nuclear 131
## 6 nuclear and acquaintance 1
## 7 nuclear and acquaintance 1
## 8 nuclear and acquaintance 1
## 9 nuclear and acquaintance 35
## 10 partner 2
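As a shorter variant of the same idea, dplyr's count() combines the grouping and counting steps. The sketch below is only illustrative: the small matrix m is a hypothetical stand-in for dta, and the result name counts is likewise made up for this example.

```r
library(dplyr)

# Hypothetical 2 x 3 character matrix standing in for `dta`
m <- matrix(
  c("nuclear", "child", "nuclear", "alone", "nuclear", "child"),
  nrow = 2,
  dimnames = list(c("1", "2"), c("12:10", "12:20", "12:30"))
)

# c(m) drops the dimensions, leaving one long character vector
counts <- tibble(var = c(m)) %>%
  count(var)  # shorthand for group_by(var) %>% summarise(n = n())

print(counts)
```

The counts should agree with table(m), with the difference that the result is a data frame with one row per distinct value rather than a named table object.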
If you want to do this for each column separately, you can use tidyr to first gather the data into long format and then spread the counts out again.
library(tidyr)
# grouped: reshape to long format, count per column and value, then reshape back to wide
dta %>%
  as.data.frame() %>%
  gather() %>%
  group_by(key, value) %>%
  summarise(N = n()) %>%
  spread(key, N)
## value 12:10 12:20 12:30 12:40 12:50 13:00 13:10 13:20 13:30 13:40 13:50 14:00 14:10
## 1 acquaintance 1 NA NA NA NA NA NA NA NA NA NA NA NA
## 2 alone NA 1 NA NA NA NA NA NA NA NA NA NA NA
## 3 child 3 3 4 3 2 NA NA NA NA NA NA NA NA
## 4 notnotnot 1 1 1 1 1 1 1 1 1 1 1 NA NA
## 5 nuclear 3 3 3 4 5 7 7 7 7 7 7 7 7
## 6 nuclear and acquaintance NA NA NA NA NA NA NA NA NA NA NA NA NA
## 7 nuclear and acquaintance NA NA NA NA NA NA NA NA NA NA NA NA NA
## 8 nuclear and acquaintance NA NA NA NA NA NA NA NA NA NA NA NA NA
## 9 nuclear and acquaintance 2 2 2 2 2 2 2 2 2 2 2 2 2
## 10 partner NA NA NA NA NA NA NA NA NA NA NA 1 1
## Variables not shown: 14:20 (int), 14:30 (int), 14:40 (int), 14:50 (int), 15:00 (int), 15:10 (int), 15:20 (int),
## 15:30 (int)
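In more recent tidyr releases, gather() and spread() are superseded by pivot_longer() and pivot_wider(). A rough equivalent of the per-column table might look like the sketch below. It assumes tidyr 1.1 or later, and it again uses a small hypothetical matrix m in place of dta; the result name per_column is made up for this example.

```r
library(dplyr)
library(tidyr)

# Hypothetical 2 x 3 character matrix standing in for `dta`
m <- matrix(
  c("nuclear", "child", "nuclear", "alone", "nuclear", "child"),
  nrow = 2,
  dimnames = list(c("1", "2"), c("12:10", "12:20", "12:30"))
)

per_column <- m %>%
  as.data.frame() %>%
  # long format: one row per (time column, value) cell
  pivot_longer(everything(), names_to = "key", values_to = "value") %>%
  # number of occurrences of each value within each time column
  count(key, value, name = "N") %>%
  # wide format again; combinations that never occur get 0 instead of NA
  pivot_wider(names_from = key, values_from = N, values_fill = 0L)

print(per_column)
```

The values_fill argument is a design choice: leaving it out reproduces the NA cells shown in the output above, while filling with 0 is often more convenient if the counts are summed or plotted afterwards.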