Convert data.frame columns from factors to characters


Problem description

I have a data frame. Let's call him bob:

> head(bob)
                 phenotype                         exclusion
GSM399350 3- 4- 8- 25- 44+ 11b- 11c- 19- NK1.1- Gr1- TER119-
GSM399351 3- 4- 8- 25- 44+ 11b- 11c- 19- NK1.1- Gr1- TER119-
GSM399352 3- 4- 8- 25- 44+ 11b- 11c- 19- NK1.1- Gr1- TER119-
GSM399353 3- 4- 8- 25+ 44+ 11b- 11c- 19- NK1.1- Gr1- TER119-
GSM399354 3- 4- 8- 25+ 44+ 11b- 11c- 19- NK1.1- Gr1- TER119-
GSM399355 3- 4- 8- 25+ 44+ 11b- 11c- 19- NK1.1- Gr1- TER119-

I'd like to concatenate the rows of this data frame (this will be another question). But look:

> class(bob$phenotype)
[1] "factor"

Bob's columns are factors. So, for example:

> as.character(head(bob))
[1] "c(3, 3, 3, 6, 6, 6)"       "c(3, 3, 3, 3, 3, 3)"      
[3] "c(29, 29, 29, 30, 30, 30)"

I don't begin to understand this, but I guess these are indices into the levels of the factors of the columns (of the court of king caractacus) of bob? Not what I need.
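The guess about indices can be checked on a small example. The sketch below uses a made-up `toy` data frame (hypothetical values, not the original data) to show that a factor stores integer codes plus a vector of levels, and that `as.character()` applied to a single factor column looks each code up in the levels. That lookup is why the column-by-column conversion gives the expected strings.

```r
# Hypothetical toy data; factor() is called explicitly because
# data.frame() no longer creates factors by default in R >= 4.0.
toy <- data.frame(
  phenotype = factor(c("3- 4- 8- 25- 44+", "3- 4- 8- 25- 44+", "3- 4- 8- 25+ 44+")),
  exclusion = factor(c("11b- 11c-", "11b- 11c-", "19- NK1.1-"))
)

# A factor is an integer vector of codes plus a "levels" attribute.
codes  <- as.integer(toy$phenotype)  # positions in the levels vector
labels <- levels(toy$phenotype)      # the distinct strings

# as.character() on one factor maps each code back to its label.
manual <- labels[codes]
direct <- as.character(toy$phenotype)

print(codes)
print(identical(manual, direct))
```

The integer codes are what leak out when the whole data frame is passed to `as.character()` at once, because the data frame is then treated as a list of columns rather than as individual factors.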

Strangely I can go through the columns of bob by hand, and do

bob$phenotype <- as.character(bob$phenotype)

which works fine. And, after some typing, I can get a data.frame whose columns are characters rather than factors. So my question is: how can I do this automatically? How do I convert a data.frame with factor columns into a data.frame with character columns without having to manually go through each column?

Bonus question: why does the manual approach work?

Solution

Just following on from Matt and Dirk: if you want to recreate your existing data frame without changing the global option, you can rebuild it with an lapply call:

bob <- data.frame(lapply(bob, as.character), stringsAsFactors=FALSE)

This converts every variable to class "character". If you want to convert only the factor columns, see Marek's solution.
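Marek's answer is not reproduced here, but one common way to convert only the factor columns is to pick them out with `is.factor` first. A minimal sketch, assuming a hypothetical `toy` data frame with one factor column and one numeric column:

```r
# Hypothetical data: one factor column, one numeric column.
toy <- data.frame(
  phenotype = factor(c("a", "b", "a")),
  count     = c(1.5, 2.5, 3.5)
)

# Logical vector marking which columns are factors.
is_fac <- vapply(toy, is.factor, logical(1))

# Replace only those columns; the numeric column is left untouched.
toy[is_fac] <- lapply(toy[is_fac], as.character)

print(vapply(toy, class, character(1)))
```

Selecting the columns first avoids turning numeric or logical columns into strings, which the blanket `lapply(bob, as.character)` would do.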

As @hadley points out, the following is more concise.

bob[] <- lapply(bob, as.character)

In both cases, lapply returns a list. In the second case, however, assigning through [] replaces the columns inside the existing bob object instead of replacing the object itself, so bob keeps its data.frame class and there is no need to convert the result back with data.frame and the argument stringsAsFactors = FALSE.
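The difference between the two assignments can be checked directly. A small sketch on a hypothetical `toy` data frame (the object names here are illustrative only):

```r
# Hypothetical data with two factor columns.
toy <- data.frame(
  phenotype = factor(c("a", "b")),
  exclusion = factor(c("x", "y"))
)

# lapply() on its own returns a plain named list, not a data frame.
as_list <- lapply(toy, as.character)
print(class(as_list))

# Assigning through [] replaces the columns and keeps the data frame.
converted <- toy
converted[] <- lapply(converted, as.character)
print(class(converted))
print(class(converted$phenotype))
```

Writing `converted <- lapply(converted, as.character)` without the `[]` would rebind the name to the list, which is why the first form in the answer wraps the result in `data.frame()`.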
