Unexpected conversion to chars instead of factors in data frames and matrices


Problem description


I am not a novice user of R, but the following is most confusing.

I have a data frame (although the problem is equally present for matrices) of categorical variables taking the values +1/-1, which I'd like to convert into factors.

mat <- matrix(sample(c(-1, +1), 16, replace = TRUE), nrow = 4)
mat <- data.frame(mat)

However, using

mat <- apply(mat, 2, factor)

turns integers into characters instead of factors:

> mat 
     [,1] [,2] [,3] [,4]
[1,] "-1" "1"  "-1" "1" 
[2,] "-1" "-1" "-1" "-1"
[3,] "-1" "1"  "1"  "1" 
[4,] "-1" "-1" "1"  "1" 

Perhaps in the same vein (and I had a problem of this sort with some of my other data) trying to convert character names in matrices and data frames into factors results in more confusing behaviour:

mat2 <- matrix(sample(letters, 16, replace = TRUE), nrow = 4)
> mat2
     [,1] [,2] [,3] [,4]
[1,] "x"  "m"  "r"  "e" 
[2,] "u"  "r"  "b"  "p" 
[3,] "j"  "p"  "h"  "j" 
[4,] "k"  "s"  "e"  "x" 

mat2[,1] <- factor(mat2[,1])
> mat2
     [,1] [,2] [,3] [,4]
[1,] "4"  "m"  "r"  "e" 
[2,] "3"  "r"  "b"  "p" 
[3,] "1"  "p"  "h"  "j" 
[4,] "2"  "s"  "e"  "x" 

Any help or clarification would be appreciated.

Solution

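Always remember that data frames are lists, so operating on columns is just like iterating over the elements of a list. apply() converts the data frame to a matrix first, and a matrix cannot hold factors, so the result falls back to character. I think maybe you intended to do something more like this: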

mat[] <- lapply(mat,factor)

or this:

as.data.frame(lapply(mat,factor))

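Although even here, note that the levels of each factor are not necessarily the same: a column that happens to contain only one of the two values ends up with a single level.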
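
As a rough illustration, the sketch below contrasts apply() with lapply() on data shaped like the question's, and also looks at the matrix case from the second half of the question. The set.seed() call, the names via_apply, fixed, m and df2, the explicit levels = c(-1, 1) argument, and the fixed letters used for mat2 are assumptions added for this example and do not come from the original post.

```r
# Reproduce data shaped like the question's; set.seed() is only here so the
# sketch is repeatable.
set.seed(1)
mat <- data.frame(matrix(sample(c(-1, 1), 16, replace = TRUE), nrow = 4))

# apply() works on a matrix and returns a matrix, so the factors produced
# by FUN are flattened to character strings.
via_apply <- apply(mat, 2, factor)
typeof(via_apply)  # "character"

# lapply() treats each column as a list element, so the factors survive.
# Passing levels explicitly gives every column the same level set, even a
# column that happens to contain only one of the two values.
fixed <- mat
fixed[] <- lapply(fixed, factor, levels = c(-1, 1))
sapply(fixed, is.factor)
lapply(fixed, levels)

# Matrix case: assigning a factor into a character matrix stores the
# factor's integer codes as strings, not its labels.
mat2 <- matrix(c("x", "u", "j", "k", "m", "r", "p", "s"), nrow = 4)
m <- mat2
m[, 1] <- factor(m[, 1])
m[, 1]  # "4" "3" "1" "2"

# A data frame can hold a factor column alongside character columns.
df2 <- as.data.frame(mat2, stringsAsFactors = FALSE)
df2[[1]] <- factor(df2[[1]])
str(df2)
```

If the factors are needed downstream, keeping the data in a data frame is probably the simplest route, since a matrix can only store one basic type.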
