How does R automatically coerce character input to numeric?
Problem description
I am training a random forest model on my data with the randomForest package. Some variables are of class character. I am pretty sure that randomForest only takes factor and numeric classes as input, so I think R automatically coerces the character variables to numeric.
To understand how this may affect my modelling results, does anyone know how R automatically coerces character to numeric (for example, the algorithm or rule it follows)? Or is there any source code I can look at?
I am using R version 4.0.1.
Thanks in advance.
Update: I checked using
getTree(mod,1,labelVar=TRUE)
I can see that if those character variables are converted to factors, the "split point" in the output is an integer, which means the variable is treated as categorical (see: https://www.rdocumentation.org/packages/randomForest/versions/4.6-14/topics/getTree). If they are not converted to factors, the "split point" in the output is not an integer.
So my guess is that R coerces the values of those character variables into numeric values. But how?
Recommended answer
I am not sure right now about random forests in R, but I am fairly convinced that randomForest only takes factors. If it does take characters as well, it will convert them to factor, not to numeric.
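To illustrate what a character-to-factor conversion looks like in base R, here is a minimal sketch. The vector `x` is made-up example data, not taken from the question, and nothing here shows what randomForest itself does internally.

```r
# Hypothetical example data (not from the question)
x <- c("low", "high", "medium", "high")

# factor() sorts the unique values to build the levels
f <- factor(x)
levels(f)      # "high" "low" "medium"

# Each value is stored as the integer index of its level
as.integer(f)  # 2 1 3 1
```

The integer codes depend only on the sorted order of the levels, so they carry no numeric meaning of their own.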
And there is no clear conversion from character to numeric in R.
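A small sketch of what base R does when asked to turn character data into numbers may help here. The data frame `df` is hypothetical, and whether randomForest goes through `data.matrix()` internally is an assumption that this thread does not confirm.

```r
# as.numeric() only parses strings that look like numbers
as.numeric(c("1.5", "42"))           # 1.5 42.0
suppressWarnings(as.numeric("abc"))  # NA (normally with a coercion warning)

# Hypothetical data frame with a character column
df <- data.frame(colour = c("red", "green", "blue"),
                 stringsAsFactors = FALSE)

# data.matrix() is documented to convert character columns
# to factors first and then to their integer codes
m <- data.matrix(df)
m[, "colour"]                        # 3 2 1
```

If numeric-looking split points show up for a character variable, integer codes of this kind are one plausible source, which would make the splits depend on the alphabetical order of the values. Converting such columns with `factor()` before fitting avoids the ambiguity.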