How to aggregate count of unique values of categorical variables in R


Question

Suppose I have a dataset data:

x1 <- c("a","a","a","a","a","a","b","b","b","b")
x2 <- c("a1","a1","a2","a1","a2","a3","b1","b1","b2","b2")
data <- data.frame(x1,x2)

x1 x2
a  a1
a  a1 
a  a2
a  a1
a  a2
a  a3
b  b1
b  b1
b  b2 
b  b2

I want to find the number of unique values of x2 corresponding to each value of x1.

For example, a has only 3 unique values (a1, a2 and a3) and b has 2 values (b1 and b2).

I used aggregate(x1~.,data,sum), but it did not work since these are factors, not integers.

Please help.

Recommended answer

Try

 aggregate(x2~x1, data, FUN=function(x) length(unique(x)))
 #  x1 x2
 #1  a  3
 #2  b  2
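
As a base-R cross-check of the aggregate() call, here is a minimal sketch using tapply(), which is not part of the original answer. It assumes the x2 values shown in the printed table from the question (a1, a2, a3 for a; b1, b2 for b).

```r
# Example data, assuming x2 matches the printed table in the question
x1 <- c("a","a","a","a","a","a","b","b","b","b")
x2 <- c("a1","a1","a2","a1","a2","a3","b1","b1","b2","b2")
data <- data.frame(x1, x2)

# Split x2 by the levels of x1, then count the distinct values in each group
counts <- tapply(data$x2, data$x1, function(x) length(unique(x)))
counts
```

length(unique(x)) works on factors and character vectors alike, which is why it succeeds where sum fails.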

Or

 rowSums(table(unique(data)))
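
The one-liner above is easier to follow when unpacked step by step. This is a sketch, again assuming the x2 values from the printed table; the intermediate names u, tab and res are only illustrative.

```r
# Example data, assuming x2 matches the printed table in the question
x1 <- c("a","a","a","a","a","a","b","b","b","b")
x2 <- c("a1","a1","a2","a1","a2","a3","b1","b1","b2","b2")
data <- data.frame(x1, x2)

u   <- unique(data)   # keep one row per distinct (x1, x2) pair
tab <- table(u)       # x1-by-x2 contingency table; every cell is 0 or 1
res <- rowSums(tab)   # per x1 value, the number of distinct x2 values
res
```

Because duplicates are removed first, each cell of the table can only be 0 or 1, so the row sums are exactly the distinct counts.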

Or

library(dplyr)
data %>% 
     group_by(x1) %>%
     summarise(n=n_distinct(x2))

Or another option using dplyr suggested by @Eric

count(distinct(data), x1)

Or

library(data.table)
setDT(data)[, uniqueN(x2) , x1]

Update

If you need both the unique values of 'x2' and the count

setDT(data)[, list(n=uniqueN(x2), x2=unique(x2)) , x1]

Or only the unique values

setDT(data)[, list(x2=unique(x2)) , x1]

Or using dplyr

 # distinct() keeps one row per unique (x1, x2) pair
 distinct(data) %>% 
                   group_by(x1) %>%
                   mutate(n=n_distinct(x2))

For only the unique values

unique(data)
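
If neither dplyr nor data.table is available, the unique values and their counts can also be put side by side in base R. This is a sketch using ave(), not something from the original answer, and it assumes the x2 values from the printed table.

```r
# Example data, assuming x2 matches the printed table in the question
x1 <- c("a","a","a","a","a","a","b","b","b","b")
x2 <- c("a1","a1","a2","a1","a2","a3","b1","b1","b2","b2")
data <- data.frame(x1, x2)

u <- unique(data)  # one row per distinct (x1, x2) pair

# For each row, attach the size of its x1 group, i.e. the distinct count
u$n <- ave(seq_along(u$x2), u$x1, FUN = length)
u
```

ave() returns a vector as long as its input, so the count is repeated on every row of the group, much like mutate() after group_by().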
