apply() is giving NA values for every column

Question

I've been having this strange problem with apply lately. Consider the following example:

set.seed(42)
df <- data.frame(cars, foo = sample(LETTERS[1:5], size = nrow(cars), replace = TRUE))
head(df)
  speed dist foo
1     4    2   E
2     4   10   E
3     7    4   B
4     7   22   E
5     8   16   D
6     9   10   C

I want to use apply to apply a function fun (say, mean) to each column of that data.frame. If the data.frame contains only numeric values, I do not have any problem:

apply(cars, 2, mean)
speed  dist 
15.40 42.98 

But when I try it with my data.frame, which contains both numeric and character data, it seems to fail:

apply(df, 2, mean)
speed  dist   foo 
   NA    NA    NA 
Warning messages:
1: In mean.default(newX[, i], ...) :
  argument is not numeric or logical: returning NA
2: In mean.default(newX[, i], ...) :
  argument is not numeric or logical: returning NA
3: In mean.default(newX[, i], ...) :
  argument is not numeric or logical: returning NA

Of course, I was expecting to get NA for the character column, but I would like to get values for the numeric columns anyway.

sapply(df, class)
    speed      dist       foo 
"numeric" "numeric"  "factor" 

Any pointers would be appreciated, as I feel like I'm missing something very obvious here!

> sessionInfo()
R version 2.14.1 (2011-12-22)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=C                 LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

Recommended answer

The documentation in ?apply says:

If X is not an array but an object of a class with a non-null dim value (such as a data frame), apply attempts to coerce it to an array via as.matrix if it is two-dimensional (e.g., a data frame) or via as.array.

Matrices can only be of a single type in R. When the data frame is coerced to a matrix, everything ends up as character if there is even a single character (or factor) column. That is why mean returns NA for speed and dist as well: by the time apply calls it, those columns are character vectors too.
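The coercion can be seen directly by calling as.matrix on the data frame from the question. This is only a small sketch to illustrate the point; the variable name m is chosen here for illustration and is not part of the original post.

```r
# Rebuild the data frame from the question
set.seed(42)
df <- data.frame(cars, foo = sample(LETTERS[1:5], size = nrow(cars), replace = TRUE))

# as.matrix() is the coercion apply() performs on a two-dimensional data frame
m <- as.matrix(df)

typeof(m)                  # the whole matrix shares a single type: "character"
is.numeric(m[, "speed"])   # FALSE: speed is now stored as strings

# mean() on a character vector returns NA with a warning, as in the question
speed_mean <- suppressWarnings(mean(m[, "speed"]))
speed_mean
```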

I guess I owe you a description of an alternative, so here you go. Data frames are really just lists, so if you want to apply a function to each column, use lapply or sapply instead.
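As a rough sketch of that suggestion, sapply can walk over the columns without any matrix coercion, so each column keeps its own type. The helper safe_mean below is a hypothetical name introduced for this example, not something from base R or the original answer; it returns NA for non-numeric columns, which matches the result the question asked for.

```r
# Rebuild the data frame from the question
set.seed(42)
df <- data.frame(cars, foo = sample(LETTERS[1:5], size = nrow(cars), replace = TRUE))

# Hypothetical helper: mean for numeric columns, NA for everything else
safe_mean <- function(x) if (is.numeric(x)) mean(x) else NA_real_

# sapply() iterates over the columns as list elements, with no coercion
col_means <- sapply(df, safe_mean)
col_means

# Alternative: keep only the numeric columns, then take their means
num_cols <- sapply(df, is.numeric)
numeric_means <- sapply(df[num_cols], mean)
numeric_means
```

The first form keeps one entry per column, while the second drops the non-numeric columns from the result entirely; which one is preferable depends on whether the NA placeholder is useful downstream.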
