Factors, levels, R syntax explanation

Question

I have a generic question of how to use and manipulate factors. In my work, R often coerces something into a factor because R does not allow different modes in a matrix, but in actuality I would prefer those columns to remain numeric.

When working with such factors I noticed:

  • When you have two similar factors (e.g. all values between 1 and 5) in different columns, coercing the first column's factor to a number with as.numeric() works fine. Coercing the second, third or fourth with as.numeric() always adds 1 to every "factor" value. Why?

There also seems to be a difference between

go$V4 <- as.double(go$V4)

and

go[,4] <- as.numeric(levels(go[,4]))[go[,4]]

Assuming as.double and as.numeric are indeed largely identical, the difference is somewhere else but I don't get it.

Any syntax experts out there?

Recommended answer

The statement about coercion to factor because of matrix requirements is simply wrong. (R matrices are incapable of holding factor variables.) Perhaps you are thinking of a data.frame. As the R FAQ says, you need to use:

go$V4 <- as.numeric(as.character(go$V4))
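A minimal sketch of why the detour through as.character() matters. The vectors f, g and h are made-up illustrations, not the asker's data. Calling as.numeric() directly on a factor returns the internal integer codes (positions in levels()), not the labels. The last two factors show one way the "adds 1" effect could arise, which is only a guess because the question does not show the actual columns.

```r
# Hypothetical factor whose labels look like numbers.
f <- factor(c("3", "5", "10", "3"))

levels(f)                     # labels sorted as strings: "10" "3" "5"
as.numeric(f)                 # internal codes: 2 3 1 2
as.numeric(as.character(f))   # label values:   3 5 10 3
as.numeric(levels(f))[f]      # same values, converting each level only once

# Codes equal the values only when the levels are exactly "1", "2", ..., "k".
g <- factor(c("1", "3", "2", "5", "4"))
as.numeric(g)                 # 1 3 2 5 4 (happens to match the labels)

# Assumed scenario: a column containing "0" shifts every code up by one.
h <- factor(c("0", "2", "1", "4", "3"))
as.numeric(h)                 # 1 3 2 5 4 (each label value plus 1)
```

Both as.numeric(as.character(f)) and as.numeric(levels(f))[f] recover the label values. The second form converts only the distinct levels and then indexes them by the factor's codes.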

If a numeric vector is concatenated (with c()) to any character vector, the result is immediately coerced to "character" mode. If a column in a text file has non-numeric characters in it, the same thing happens on input.
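Both coercions can be reproduced in a few lines. This is a sketch with made-up input (the string txt and the names d_chr and d_fac are illustrative). It assumes the file was read with read.table(), where stringsAsFactors = TRUE was the default before R 4.0.0, so a column read as character ended up as a factor.

```r
# Mixing numbers and strings in c() yields a character vector.
x <- c(1, 2, "a")
class(x)                      # "character"

# Hypothetical input: column V2 contains one non-numeric entry.
txt <- "V1 V2\n1 4\n2 x\n3 6"

d_chr <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE)
d_fac <- read.table(text = txt, header = TRUE, stringsAsFactors = TRUE)

class(d_chr$V1)               # "integer"   (clean column stays numeric)
class(d_chr$V2)               # "character" (the "x" forces text)
class(d_fac$V2)               # "factor"    (text column converted on input)

# Converting back: the non-numeric entry becomes NA (with a warning).
v2 <- suppressWarnings(as.numeric(as.character(d_fac$V2)))
v2                            # 4 NA 6
```

Running str() on the data frame right after reading it is a quick way to spot columns that did not come in as numeric.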
