R count NA by group


Problem description

Could someone please explain why I get different answers when I use the formula and non-formula forms of the aggregate function to count missing values by group? Also, is there a better way to count missing values by group using a native R function?

DF <- data.frame(YEAR=c(2000,2000,2000,2001,2001,2001,2001,2002,2002,2002), X=c(1,NA,3,NA,NA,NA,7,8,9,10))
DF

aggregate(X ~ YEAR, data=DF, function(x) { sum(is.na(x)) })
with(DF, aggregate(X, list(YEAR), function(x) { sum(is.na(x)) }))

aggregate(X ~ YEAR, data=DF, function(x) { sum(! is.na(x)) })
with(DF, aggregate(X, list(YEAR), function(x) { sum(! is.na(x)) }))


Recommended answer

The help page at ?aggregate points out that the formula method has an argument na.action, which is set to na.omit by default.


na.action: a function which indicates what should happen when the data contain NA values. The default is to ignore missing values in the given variables.
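
To make the effect of that default concrete, here is a small sketch using the DF from the question. The names kept, formula_counts and default_counts are only illustrative, and calling na.omit() on the data frame directly is an approximation of what the formula method does internally, not a copy of its code.

```r
DF <- data.frame(YEAR = c(2000, 2000, 2000, 2001, 2001, 2001, 2001, 2002, 2002, 2002),
                 X    = c(1, NA, 3, NA, NA, NA, 7, 8, 9, 10))

# Roughly what the formula method sees: rows with a missing X are dropped
# before FUN is applied to each group.
kept <- na.omit(DF)
nrow(DF)    # 10 rows in the original data
nrow(kept)  # 6 rows survive na.omit

# Formula method with the default na.action: FUN never sees an NA.
formula_counts <- aggregate(X ~ YEAR, data = DF,
                            FUN = function(x) sum(is.na(x)))
formula_counts$X  # 0 0 0

# Default (non-formula) method: no rows are removed, so the NAs are counted.
default_counts <- aggregate(DF$X, by = list(YEAR = DF$YEAR),
                            FUN = function(x) sum(is.na(x)))
default_counts$x  # 1 3 0
```

This is why the two calls agree when counting non-missing values (2, 1 and 3 either way) but disagree when counting missing ones.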

Change that argument to NULL or na.pass instead to get the results you are probably expecting:

# aggregate(X ~ YEAR, data=DF, function(x) {sum(is.na(x))}, na.action = na.pass)
aggregate(X ~ YEAR, data=DF, function(x) {sum(is.na(x))}, na.action = NULL)
#   YEAR X
# 1 2000 1
# 2 2001 3
# 3 2002 0
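
On the second part of the question (a simpler native approach), one possible sketch uses base R's tapply() or rowsum(), neither of which drops rows with NA in X. The variable names na_by_year and na_matrix are illustrative; whether this is "better" than aggregate is a matter of taste, since the result is a named vector or a matrix rather than a data frame.

```r
DF <- data.frame(YEAR = c(2000, 2000, 2000, 2001, 2001, 2001, 2001, 2002, 2002, 2002),
                 X    = c(1, NA, 3, NA, NA, NA, 7, 8, 9, 10))

# tapply splits X by YEAR and applies the function to each piece;
# the result is a vector named by year.
na_by_year <- tapply(DF$X, DF$YEAR, function(x) sum(is.na(x)))
na_by_year  # 2000: 1, 2001: 3, 2002: 0

# rowsum adds up a 0/1 indicator of missingness within each group;
# the result is a one-column matrix with the years as row names.
na_matrix <- rowsum(as.integer(is.na(DF$X)), DF$YEAR)
na_matrix[, 1]  # 2000: 1, 2001: 3, 2002: 0
```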
