Count occurrences of factor in R, with zero counts reported

Question

I want to count the number of occurrences of each level of a factor in a data frame. For example, to total the events of each type in the code below:

library(plyr)
events <- data.frame(type = c('A', 'A', 'B'),
                     quantity = c(1, 2, 1))
ddply(events, .(type), summarise, quantity = sum(quantity))

The output is as follows:

     type quantity
1    A        3
2    B        1

However, what if I know that there are three types of events, A, B and C, and I also want to see the count for C, which is 0? In other words, I want the output to be:

     type quantity
1    A        3
2    B        1
3    C        0

How do I do this? It feels like there should already be a function somewhere that does this.

Here are my two not-so-good ideas for how to go about this.

Idea #1: I know I could do this with a for loop, but it is widely said that if you are using a for loop in R you are doing something wrong, so there must be a better way.

Idea #2: Add dummy entries to the original data frame. This works, but it feels like there should be a more elegant solution.

events <- data.frame(type = c('A', 'A', 'B'),
                     quantity = c(1, 2, 1))
events <- rbind(events, data.frame(type = 'C', quantity = 0))
ddply(events, .(type), summarise, quantity = sum(quantity))

Recommended answer

You get this for free if you define the type column of events as a factor with the desired three levels:

R> events <- data.frame(type = factor(c('A', 'A', 'B'), c('A','B','C')), 
+                       quantity = c(1, 2, 1))
R> events
  type quantity
1    A        1
2    A        2
3    B        1
R> table(events$type)

A B C 
2 1 0 
R> 
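If the data frame already exists with a plain character column, the same effect can be had by converting the column afterwards. The following is a minimal base R sketch under that assumption; the column name `n` and the explicit `stringsAsFactors = FALSE` are choices made for illustration, not part of the original answer. Note that table() counts rows per level, which is not the same as summing quantity.

```r
# Assumption: the data frame already exists with a plain character column.
events <- data.frame(type = c("A", "A", "B"),
                     quantity = c(1, 2, 1),
                     stringsAsFactors = FALSE)

# Declare every level that should be reported, including the unobserved "C".
events$type <- factor(events$type, levels = c("A", "B", "C"))

# table() keeps unused levels; as.data.frame() turns the result into rows.
# "n" is an illustrative column name for the row count per level.
counts <- as.data.frame(table(type = events$type), responseName = "n")
print(counts)
```

Because the levels are declared on the column itself, any later call that respects factor levels will report C without dummy rows being added to the data.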

Simply calling table() on the factor already does the right thing, and ddply() can too if you tell it not to drop unused levels with .drop=FALSE:

R> ddply(events, .(type), summarise, quantity = sum(quantity), .drop=FALSE)
  type quantity
1    A        3
2    B        1
3    C        0
R> 
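For readers who would rather not depend on plyr, the per-type totals can also be sketched in base R. This is an alternative to the ddply() call above, not the answer's own method: xtabs() sums the left-hand side of the formula within each factor level and, by default, keeps levels that have no rows. The variable names `totals` and `result` are illustrative.

```r
# Same data as above, with all three levels declared on the factor.
events <- data.frame(type = factor(c("A", "A", "B"),
                                   levels = c("A", "B", "C")),
                     quantity = c(1, 2, 1))

# xtabs() sums quantity within each level of type; the unused level "C"
# is kept because drop.unused.levels defaults to FALSE.
totals <- xtabs(quantity ~ type, data = events)

# Convert the contingency table back to a data frame with a named column.
result <- as.data.frame(totals, responseName = "quantity")
print(result)
```

Whether this or ddply() with .drop=FALSE is preferable mostly depends on what the rest of the code already uses.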
