apply() is giving NA values for every column
Problem description
I've been having this strange problem with apply lately. Consider the following example:
set.seed(42)
df <- data.frame(cars, foo = sample(LETTERS[1:5], size = nrow(cars), replace = TRUE))
head(df)
speed dist foo
1 4 2 E
2 4 10 E
3 7 4 B
4 7 22 E
5 8 16 D
6 9 10 C
I want to use apply to apply a function fun (say, mean) to each column of that data.frame. If the data.frame contains only numeric values, I do not have any problem:
apply(cars, 2, mean)
speed dist
15.40 42.98
But when I try it with my data.frame containing both numeric and character data, it seems to fail:
apply(df, 2, mean)
speed dist foo
NA NA NA
Warning messages:
1: In mean.default(newX[, i], ...) :
argument is not numeric or logical: returning NA
2: In mean.default(newX[, i], ...) :
argument is not numeric or logical: returning NA
3: In mean.default(newX[, i], ...) :
argument is not numeric or logical: returning NA
Of course, I was expecting to get NA for the character column, but I would like to get values for the numeric columns anyway.
sapply(df, class)
speed dist foo
"numeric" "numeric" "factor"
Any pointers would be appreciated, as I feel like I'm missing something very obvious here!
> sessionInfo()
R version 2.14.1 (2011-12-22)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
[5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
Recommended answer
The first sentence of the Details section of ?apply says:
If X is not an array but an object of a class with a non-null dim value (such as a data frame), apply attempts to coerce it to an array via as.matrix if it is two-dimensional (e.g., a data frame) or via as.array.
Matrices can only be of a single type in R. When the data frame is coerced to a matrix, everything ends up as character if there is even a single character (or factor) column. That is why mean returns NA for speed and dist as well: by the time apply calls it, those columns are character vectors too.
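The coercion can be checked directly. The following is a small sketch that rebuilds the df from the question and inspects what as.matrix produces; the variable name m is only illustrative.

```r
set.seed(42)
df <- data.frame(cars, foo = sample(LETTERS[1:5], size = nrow(cars), replace = TRUE))

# A purely numeric data frame becomes a numeric matrix.
is.numeric(as.matrix(cars))            # TRUE

# One non-numeric column forces the whole matrix to character.
m <- as.matrix(df)
typeof(m)                              # "character"
is.character(m[, "speed"])             # TRUE

# This is what apply(df, 2, mean) ends up computing for each column:
suppressWarnings(mean(m[, "speed"]))   # NA
```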
I guess I owe you a description of an alternative, so here you go. Data frames are really just lists, so if you want to apply a function to each column, use lapply or sapply instead.
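As a minimal sketch of that suggestion, using the df from the question: sapply works column by column without converting to a matrix, so each column keeps its own type. Selecting the numeric columns first (the num_cols helper here is just an illustrative name) avoids the warning for foo altogether.

```r
set.seed(42)
df <- data.frame(cars, foo = sample(LETTERS[1:5], size = nrow(cars), replace = TRUE))

# Column-wise mean; foo is not numeric, so it still gives NA (with a warning),
# but speed and dist are computed normally.
res <- suppressWarnings(sapply(df, mean))

# Restrict to numeric columns to skip foo entirely.
num_cols <- sapply(df, is.numeric)
sapply(df[num_cols], mean)
# speed  dist
# 15.40 42.98
```

Use lapply in the same way if you would rather get a list back than a named vector.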