Frequency of values per column in table


Question

What is a good way to get the independent frequency counts of multiple columns using dplyr? I want to go from a table of values:

# A tibble: 7 x 4
      a     b     c     d
  <int> <int> <int> <int>
1     1     2     1     3
2     1     2     1     3
3     2     2     5     3
4     3     2     4     3
5     3     3     2     3
6     5     3     4     3
7     5     4     2     1

to a frequency table like this:

# A tibble: 5 x 5
      x   a_n   b_n   c_n   d_n
  <int> <int> <int> <int> <int>
1     1     2     0     2     1
2     2     1     4     2     0
3     3     2     2     0     6
4     4     0     1     2     0
5     5     2     0     1     0

I'm still trying to get my head around dplyr, but it seems like this is something it could do. If it is easier to do with an add-on library, that is fine too.

Recommended answer

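Melt the data frame to long format with reshape2, then cast it back to wide format using length as the aggregation function:

library(dplyr)     # provides the %>% pipe
library(reshape2)  # provides melt() and dcast()

df %>%
  # stack all columns into long format: one (variable, value) pair per row
  melt() %>%
  # one row per distinct value, one column per original column, cells = counts
  dcast(value ~ variable, fun.aggregate = length)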

#   value a b c d
# 1     1 2 0 2 1
# 2     2 1 4 2 0
# 3     3 2 2 0 6
# 4     4 0 1 2 0
# 5     5 2 0 1 0
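
Since the question asks for a dplyr-style approach, the same reshape-and-count idea can also be sketched with dplyr and tidyr. This is an alternative, not part of the original answer; it assumes tidyr 1.1.0 or later (for `names_glue`), and the names `freq`, `variable`, and `x` are chosen here to match the layout requested in the question.

```r
library(dplyr)
library(tidyr)

# Same data as in the question
df <- data.frame(
  a = c(1L, 1L, 2L, 3L, 3L, 5L, 5L),
  b = c(2L, 2L, 2L, 2L, 3L, 3L, 4L),
  c = c(1L, 1L, 5L, 4L, 2L, 4L, 2L),
  d = c(3L, 3L, 3L, 3L, 3L, 3L, 1L)
)

freq <- df %>%
  # long format: one row per (column name, value) pair
  pivot_longer(everything(), names_to = "variable", values_to = "x") %>%
  # count how often each value occurs within each original column
  count(variable, x) %>%
  # back to wide format: one count column per original column, 0 where absent
  pivot_wider(
    names_from  = variable,
    values_from = n,
    values_fill = 0L,
    names_glue  = "{variable}_n"
  ) %>%
  # rows come out in order of first appearance, so sort by value
  arrange(x)

print(freq)
```

Counting by `variable` first keeps the output columns in the original a, b, c, d order; the final `arrange(x)` restores the row order by value.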



Data

df <- structure(list(a = c(1L, 1L, 2L, 3L, 3L, 5L, 5L), b = c(2L, 2L, 
2L, 2L, 3L, 3L, 4L), c = c(1L, 1L, 5L, 4L, 2L, 4L, 2L), d = c(3L, 
3L, 3L, 3L, 3L, 3L, 1L)), .Names = c("a", "b", "c", "d"), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6", "7"))
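
As a cross-check that needs no add-on packages, the counts can also be computed in base R. This is a minimal sketch, not part of the original answer; `all_values`, `counts`, and `freq_base` are illustrative names, and the data frame is repeated only so the snippet runs on its own.

```r
# Same data as above, repeated so the snippet is self-contained
df <- data.frame(
  a = c(1L, 1L, 2L, 3L, 3L, 5L, 5L),
  b = c(2L, 2L, 2L, 2L, 3L, 3L, 4L),
  c = c(1L, 1L, 5L, 4L, 2L, 4L, 2L),
  d = c(3L, 3L, 3L, 3L, 3L, 3L, 1L)
)

# Every distinct value that occurs anywhere in the table, sorted
all_values <- sort(unique(unlist(df)))

# Tabulate each column against the shared set of levels so that
# values missing from a column still get a zero count
counts <- sapply(df, function(col) table(factor(col, levels = all_values)))

# Combine the values and the per-column counts into one data frame
freq_base <- data.frame(x = all_values, counts, row.names = NULL)

print(freq_base)
```

Converting each column to a factor with the same levels is what makes the per-column tables line up row by row.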
