apply() is giving NA values for every column
Problem description
I've been having a strange problem with apply lately. Consider the following example:
set.seed(42)
df <- data.frame(cars, foo = sample(LETTERS[1:5], size = nrow(cars), replace = TRUE))
head(df)
speed dist foo
1 4 2 E
2 4 10 E
3 7 4 B
4 7 22 E
5 8 16 D
6 9 10 C
I want to use apply to apply a function fun (say, mean) to each column of that data.frame. If the data.frame contains only numeric values, I have no problem:
apply(cars, 2, mean)
speed dist
15.40 42.98
But when I try it with my data.frame containing both numeric and character data, it seems to fail:
apply(df, 2, mean)
speed dist foo
NA NA NA
Warning messages:
1: In mean.default(newX[, i], ...) :
argument is not numeric or logical: returning NA
2: In mean.default(newX[, i], ...) :
argument is not numeric or logical: returning NA
3: In mean.default(newX[, i], ...) :
argument is not numeric or logical: returning NA
Of course, I was expecting to get NA for the character column, but I would still like to get values for the numeric columns.
sapply(df, class)
speed dist foo
"numeric" "numeric" "factor"
Any pointers would be appreciated, as I feel like I'm missing something very obvious here!
> sessionInfo()
R version 2.14.1 (2011-12-22)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
[5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
Recommended answer
The first sentence of the description in ?apply says:
If X is not an array but an object of a class with a non-null dim value (such as a data frame), apply attempts to coerce it to an array via as.matrix if it is two-dimensional (e.g., a data frame) or via as.array.
Matrices can only be of a single type in R. When the data frame is coerced to a matrix, everything ends up as character if there is even a single non-numeric column (such as foo here), and mean() of a character vector returns NA with the warning shown above.
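To see the coercion directly, here is a minimal sketch that rebuilds the df from the question and inspects what as.matrix produces. The variable name m is only for illustration.

```r
set.seed(42)
df <- data.frame(cars, foo = sample(LETTERS[1:5], size = nrow(cars), replace = TRUE))

# as.matrix() is what apply() calls internally on a data frame
m <- as.matrix(df)

typeof(m)                 # the whole matrix has one storage type
is.numeric(m[, "speed"])  # the formerly numeric column is no longer numeric
```

Because every column of m is character, mean() is handed a character vector each time, which explains the three identical warnings.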
I guess I owe you a description of an alternative, so here you go. Data frames are really just lists, so if you want to apply a function to each column, use lapply or sapply instead.
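As a rough sketch of that alternative, using the df from the question: sapply iterates over the columns as list elements, so each column keeps its own type. The helper col_mean is a hypothetical name introduced here, not part of base R; it simply returns NA for non-numeric columns so that no warning is raised.

```r
set.seed(42)
df <- data.frame(cars, foo = sample(LETTERS[1:5], size = nrow(cars), replace = TRUE))

# Hypothetical helper: mean for numeric columns, NA otherwise
col_mean <- function(x) if (is.numeric(x)) mean(x) else NA_real_

# One value per column; numeric columns keep their real means
res <- sapply(df, col_mean)
res

# Alternatively, keep only the numeric columns before averaging
num <- sapply(df, is.numeric)
res_num <- sapply(df[num], mean)
res_num
```

Calling sapply(df, mean) directly should also work for the numeric columns; it would just emit a single warning for foo.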