Losing class information when I use apply in R
Question
When I pass a row of a data frame to a function using apply, I lose the class information of the elements of that row. They all turn into 'character'. The following is a simple example. I want to add a couple of years to the ages of the 3 Stooges. When I try to add 2 to a value that had been numeric, R says "non-numeric argument to binary operator." How do I avoid this?
age = c(20, 30, 50)
who = c("Larry", "Curly", "Mo")
df = data.frame(who, age)
colnames(df) <- c( '_who_', '_age_')
dfunc <- function (er) {
  print(er['_age_'])
  print(er[2])
  print(is.numeric(er[2]))
  print(class(er[2]))
  return (er[2] + 2)
}
a <- apply(df,1, dfunc)
The output looks like this:
_age_
"20"
_age_
"20"
[1] FALSE
[1] "character"
Error in er[2] + 2 : non-numeric argument to binary operator
Answer
apply only really works on matrices (which have the same type for all elements). When you run it on a data.frame, it simply calls as.matrix first. Because the data frame here contains a character column, the resulting matrix is entirely character, which is why the ages arrive as strings.
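To see the coercion directly, here is a minimal sketch. The column names who and age are simplified stand-ins for the names used in the question, not part of the original code.

```r
# A data frame with one character column and one numeric column
df <- data.frame(who = c("Larry", "Curly", "Mo"), age = c(20, 30, 50))

# This is the conversion apply performs before it touches any row
m <- as.matrix(df)

print(typeof(df$age))   # the column is numeric inside the data frame
print(typeof(m))        # the whole matrix shares a single type
print(m[1, "age"])      # the age is now a string
```

Since a matrix can hold only one type, the presence of a single character column is enough to turn every value into a string.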
The easiest way around this is to work on the numeric columns only:
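# skips the first column
# note: with only the age column left, dfunc must refer to it as
# er[1] or er['_age_']; er[2] would now be NA
a <- apply(df[, -1, drop=FALSE], 1, dfunc)
# Or in two steps:
m <- as.matrix(df[, -1, drop=FALSE])
a <- apply(m, 1, dfunc)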
The drop=FALSE is needed to avoid getting a single column vector. -1 means all but the first column; you could instead explicitly specify the columns you want, for example df[, c('foo', 'bar')].
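The effect of drop=FALSE can be checked with a short sketch. It assumes simplified column names (who, age) and a hypothetical inline function in place of dfunc.

```r
df <- data.frame(who = c("Larry", "Curly", "Mo"), age = c(20, 30, 50))

# Without drop=FALSE a single remaining column collapses to a plain vector,
# which has no dim attribute, so apply would refuse it
dropped <- df[, -1]
print(is.data.frame(dropped))

# With drop=FALSE the result is still a one-column data frame
kept <- df[, -1, drop = FALSE]
print(is.data.frame(kept))

# apply now sees only numeric data, so arithmetic works
older <- apply(kept, 1, function(row) row["age"] + 2)
print(unname(older))
```

Indexing by name (row["age"]) rather than by position keeps the function correct even when the set of selected columns changes.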
Update
If you want your function to access one full data.frame row at a time, there are (at least) two options:
# "loop" over the index and extract a row at a time
sapply(seq_len(nrow(df)), function(i) dfunc(df[i,]))
# Use split to produce a list where each element is a row
sapply(split(df, seq_len(nrow(df))), dfunc)
The first option is probably better for large data frames, since it doesn't have to create a huge list structure up front.
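As a rough illustration of both row-wise options, the sketch below uses simplified column names (who, age) and a hypothetical helper, add_years, that reads the age with [[ ]]. Each row arrives as a one-row data frame, so the numeric column keeps its type.

```r
df <- data.frame(who = c("Larry", "Curly", "Mo"), age = c(20, 30, 50))

# Hypothetical helper: row is a one-row data frame, so row[["age"]] is numeric
add_years <- function(row) row[["age"]] + 2

# Option 1: loop over the row index and extract one row at a time
by_index <- sapply(seq_len(nrow(df)), function(i) add_years(df[i, ]))

# Option 2: split into a list of one-row data frames first
by_split <- sapply(split(df, seq_len(nrow(df))), add_years)

print(by_index)
print(unname(by_split))
```

Note that a function which returns a one-row data frame (for example row[2] + 2) would make sapply return a list, so extracting the value with [[ ]] or $ is usually more convenient here.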