apply() is giving NA values for every column


Question

I've been having this strange problem with apply lately. Consider the following example:

set.seed(42)
df <- data.frame(cars, foo = sample(LETTERS[1:5], size = nrow(cars), replace = TRUE))
head(df)
  speed dist foo
1     4    2   E
2     4   10   E
3     7    4   B
4     7   22   E
5     8   16   D
6     9   10   C

I want to use apply to apply a function fun (say, mean) to each column of that data.frame. If the data.frame contains only numeric values, I do not have any problem:

apply(cars, 2, mean)
speed  dist 
15.40 42.98 

But when I try it with my data.frame containing both numeric and character data, it seems to fail:

apply(df, 2, mean)
speed  dist   foo 
   NA    NA    NA 
Warning messages:
1: In mean.default(newX[, i], ...) :
  argument is not numeric or logical: returning NA
2: In mean.default(newX[, i], ...) :
  argument is not numeric or logical: returning NA
3: In mean.default(newX[, i], ...) :
  argument is not numeric or logical: returning NA

Of course, I was expecting to get NA for the character column, but I would like to get values for the numeric columns anyway.

sapply(df, class)
    speed      dist       foo 
"numeric" "numeric"  "factor" 

Any pointers would be appreciated, as I feel like I'm missing something very obvious here!

> sessionInfo()
R version 2.14.1 (2011-12-22)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=C                 LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

Recommended answer

The first sentence of the Description in ?apply says:

If X is not an array but an object of a class with a non-null dim value (such as a data frame), apply attempts to coerce it to an array via as.matrix if it is two-dimensional (e.g., a data frame) or via as.array.

A matrix in R can hold only a single type. When the data frame is coerced to a matrix, everything ends up as character if there is even a single character (or factor) column, so mean then sees character vectors for speed and dist as well.
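The coercion can be checked directly. This is a minimal sketch that rebuilds the df from the question; stringsAsFactors = TRUE is set explicitly here (an assumption, so that foo is a factor on newer R versions too, as it was under R 2.14.1).

```r
# Rebuild the data frame from the question.
set.seed(42)
df <- data.frame(cars,
                 foo = sample(LETTERS[1:5], size = nrow(cars), replace = TRUE),
                 stringsAsFactors = TRUE)  # assumption: force the factor column

# apply() coerces a data frame with as.matrix() before looping over it.
m <- as.matrix(df)

typeof(m)               # "character": the whole matrix has one type
is.numeric(m[, "speed"])  # FALSE: speed is no longer numeric
```

Because every column of m is character, mean returns NA for each of them, which matches the three warnings in the question.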

I guess I owe you a description of an alternative, so here you go. Data frames are really just lists, so if you want to apply a function to each column, use lapply or sapply instead.
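As a rough sketch of that suggestion, again on the df from the question: sapply visits each column as its own vector, so the numeric columns keep their type. The helper name num_cols is only illustrative, and filtering with is.numeric is one possible way to skip the non-numeric column, not the only one.

```r
# Rebuild the data frame from the question.
set.seed(42)
df <- data.frame(cars,
                 foo = sample(LETTERS[1:5], size = nrow(cars), replace = TRUE),
                 stringsAsFactors = TRUE)  # assumption: force the factor column

# Flag the numeric columns (num_cols is a hypothetical helper name).
num_cols <- sapply(df, is.numeric)

# Column means for the numeric columns only; no coercion to a matrix happens.
res <- sapply(df[num_cols], mean)
res
```

Calling sapply(df, mean) on all columns also works: it returns the means for speed and dist and NA (with one warning) for foo.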
