使用dplyr对序列数据的简单表 [英] Simple Table with dplyr on Sequence Data

查看:139
本文介绍了使用dplyr对序列数据的简单表的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想用

I would like to make a simple table with

dplyr 

summarise

但我不知道如何...(即使应该很简单)。

But I can't really figure out how ... (Even though it should be quite simple).

我有一个序列矩阵。
当我简单列表

I have a matrix of sequences. When I simply tabulate

 table(dta) 

我有我想要的结果。

 dta
            acquaintance                        alone                        child                    notnotnot                      nuclear 
                       1                            2                           17                           19                          131 
 nuclear and     acquaintance  nuclear and    acquaintance    nuclear and  acquaintance     nuclear and acquaintance                      partner 
                       1                            1                            1                           35                            2 


我不知道如何使用总结来做同样的事情

有什么建议吗?

 dta =   structure(c("nuclear", "nuclear", "child", "child", "child", 
"acquaintance", "nuclear and acquaintance", "nuclear and acquaintance", 
"notnotnot", "nuclear", "nuclear", "nuclear", "child", "child", 
"child", "alone", "nuclear and acquaintance", "nuclear and acquaintance", 
"notnotnot", "nuclear", "nuclear", "child", "child", "child", 
"child", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance", 
"notnotnot", "nuclear", "nuclear", "child", "child", "child", 
"nuclear", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance", 
"notnotnot", "nuclear", "nuclear", "nuclear", "child", "child", 
 "nuclear", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance", 
"notnotnot", "nuclear", "nuclear", "nuclear", "nuclear", "nuclear", 
"nuclear", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance", 
"notnotnot", "nuclear", "nuclear", "nuclear", "nuclear", "nuclear", 
"nuclear", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance", 
"notnotnot", "nuclear", "nuclear", "nuclear", "nuclear", "nuclear", 
"nuclear", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance", 
"notnotnot", "nuclear", "nuclear", "nuclear", "nuclear", "nuclear", 
"nuclear", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance", 
"notnotnot", "nuclear", "nuclear", "nuclear", "nuclear", "nuclear", 
"nuclear", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance", 
"notnotnot", "nuclear", "nuclear", "nuclear", "nuclear", "nuclear", 
"nuclear", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance", 
"notnotnot", "nuclear", "nuclear", "nuclear", "nuclear", "nuclear", 
"nuclear", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance", 
"partner", "nuclear", "nuclear", "nuclear", "nuclear", "nuclear", 
"nuclear", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance", 
 "partner", "nuclear", "nuclear", "nuclear", "nuclear", "nuclear and acquaintance", 
 "nuclear", "nuclear", "nuclear and acquaintance", "nuclear and  acquaintance", 
 "notnotnot", "nuclear", "nuclear", "nuclear", "nuclear", "nuclear and     acquaintance", 
 "nuclear", "nuclear", "nuclear and acquaintance", "nuclear and acquaintance", 
 "notnotnot", "nuclear", "nuclear", "nuclear", "nuclear", "nuclear and    acquaintance", 
 "nuclear", "nuclear", "nuclear", "nuclear", "notnotnot", "nuclear", 
  "nuclear", "nuclear", "nuclear", "nuclear and acquaintance", 
 "nuclear", "nuclear", "nuclear", "nuclear", "notnotnot", "nuclear", 
"nuclear", "nuclear", "nuclear", "nuclear and acquaintance", 
"nuclear", "nuclear", "nuclear", "nuclear", "notnotnot", "nuclear", 
"nuclear", "nuclear", "nuclear", "nuclear and acquaintance", 
"nuclear", "nuclear", "nuclear", "nuclear", "notnotnot", "nuclear", 
"nuclear", "nuclear", "nuclear", "nuclear and acquaintance", 
"nuclear", "nuclear", "child", "nuclear", "notnotnot", "nuclear", 
"nuclear", "nuclear", "nuclear", "nuclear and acquaintance", 
"nuclear", "nuclear", "child", "alone", "notnotnot", "nuclear"
), .Dim = c(10L, 21L), .Dimnames = list(c("1", "2", "3", "4", 
"5", "6", "7", "8", "9", "10"), c("12:10", "12:20", "12:30", 
"12:40", "12:50", "13:00", "13:10", "13:20", "13:30", "13:40", 
"13:50", "14:00", "14:10", "14:20", "14:30", "14:40", "14:50", 
"15:00", "15:10", "15:20", "15:30")))


推荐答案

您只需将数据转换为 data.frame 即可使用 dplyr 那么你可以很容易地得到你想要的输出:

You just have to convert your data to a data.frame to use dplyr and then you can easily get your desired output:

require(dplyr)
# ungrouped
data_frame(var = c(dta)) %>% 
  group_by_("var") %>% 
  summarise(n())
##                             var n()
## 1                  acquaintance   1
## 2                         alone   2
## 3                         child  17
## 4                     notnotnot  19
## 5                       nuclear 131
## 6  nuclear and     acquaintance   1
## 7   nuclear and    acquaintance   1
## 8     nuclear and  acquaintance   1
## 9      nuclear and acquaintance  35
## 10                      partner   2

如果要为每个列分别执行此操作,可以使用 tidyr 首先收集结果,然后再次传播。

If you want to do this for each column seperately, you can use tidyr to first gather the result and then spread it again.

require(tidyr)
# grouped
dta %>% 
  as.data.frame %>% 
  gather %>% 
  group_by(key, value) %>% 
  summarise(N = n()) %>% 
  spread(key, N)
##                           value 12:10 12:20 12:30 12:40 12:50 13:00 13:10 13:20 13:30 13:40 13:50 14:00 14:10
## 1                  acquaintance     1    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
## 2                         alone    NA     1    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
## 3                         child     3     3     4     3     2    NA    NA    NA    NA    NA    NA    NA    NA
## 4                     notnotnot     1     1     1     1     1     1     1     1     1     1     1    NA    NA
## 5                       nuclear     3     3     3     4     5     7     7     7     7     7     7     7     7
## 6  nuclear and     acquaintance    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
## 7   nuclear and    acquaintance    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
## 8     nuclear and  acquaintance    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
## 9      nuclear and acquaintance     2     2     2     2     2     2     2     2     2     2     2     2     2
## 10                      partner    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA     1     1
## Variables not shown: 14:20 (int), 14:30 (int), 14:40 (int), 14:50 (int), 15:00 (int), 15:10 (int), 15:20 (int), 
## 15:30 (int)

这篇关于使用dplyr对序列数据的简单表的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆