Factors, levels, R syntax explanation
Question
I have a general question about how to use and manipulate factors. In my work, R often coerces something into a factor because R does not allow different modes in a matrix, but in actuality I would prefer those columns to remain numeric.
When working with such factors I noticed:
- When you have two similar factors (e.g. all values between 1 and 5) in different columns, coercing the first column's factor to a number with as.numeric() works fine. Coercing the second, third or fourth via as.numeric() always adds 1 to every "factor". Why?
There seems to be a difference between
go$V4 <- as.double(go$V4)
and
go[,4] <- as.numeric(levels(go[,4]))[go[,4]]
Assuming as.double and as.numeric are indeed largely identical, the difference is somewhere else, but I don't get it.
Any syntax experts here?
Recommended answer
The statement about coercion to factor because of matrix requirements is simply wrong. (R matrices are incapable of holding factor variables.) Perhaps you are thinking of a data.frame. As the R FAQ says, you need to use:
go$V4 <- as.numeric(as.character(go$V4))
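To see why the direct conversion misbehaves, here is a minimal sketch using a made-up data frame (the values are invented for illustration and are not the asker's data). Calling as.numeric() or as.double() on a factor returns its internal integer codes, which are positions in the sorted set of levels rather than the labels themselves. When the labels are 1 to 5 the codes happen to match the values; when the labels start at 0, every result comes out one too high.

```r
# Hypothetical data, for illustration only: two factor columns with numeric-looking labels
go <- data.frame(
  V1 = factor(c(1, 3, 5, 2, 4)),  # levels "1".."5": codes happen to equal the values
  V4 = factor(c(0, 2, 4, 1, 3))   # levels "0".."4": codes are the values plus 1
)

# as.numeric() and as.double() on a factor both return the internal integer codes
codes_v1 <- as.numeric(go$V1)   # 1 3 5 2 4 (correct only by coincidence)
codes_v4 <- as.double(go$V4)    # 1 3 5 2 4 (every value shifted up by 1)

# Going through the labels recovers the real numbers
via_character <- as.numeric(as.character(go$V4))   # 0 2 4 1 3
via_levels    <- as.numeric(levels(go$V4))[go$V4]  # 0 2 4 1 3
```

Both label-based forms give the same result. The levels() variant converts each distinct label only once and then indexes by the factor's codes, which can be a little faster on long vectors with few levels.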
If a numeric vector is concatenated (with c()) to any character vector, it is immediately coerced to "character" mode. If a column in a text field has non-numeric characters in it, the same thing happens on input.
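A short sketch of these cases follows, assuming an inline CSV snippet with a stray "n/a" entry (invented for illustration). stringsAsFactors = TRUE is passed explicitly because that was the default before R 4.0.0; newer versions leave such columns as character instead of converting them to factors.

```r
# Concatenating numeric and character values yields a character vector
mixed <- c(1, 2, "a")
class(mixed)              # "character"

# A matrix has a single mode, so mixed input becomes character, never factor
m <- cbind(1:3, c("a", "b", "c"))
mode(m)                   # "character"

# Hypothetical input: one non-numeric entry ("n/a") in an otherwise numeric column
csv_text <- "V1,V2\n1,10\n2,n/a\n3,30"
go <- read.csv(text = csv_text, stringsAsFactors = TRUE)
class(go$V1)              # "integer"
class(go$V2)              # "factor"

# Declaring the stray token as missing keeps the column numeric on input
go_clean <- read.csv(text = csv_text, na.strings = "n/a")
class(go_clean$V2)        # "integer"
```

Fixing the problem at read time, for example with na.strings or colClasses, is often simpler than converting the factor afterwards.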