Convert data.frame columns from factors to characters


Question

I have a data frame. Let's call him bob:

> head(bob)
                 phenotype                         exclusion
GSM399350 3- 4- 8- 25- 44+ 11b- 11c- 19- NK1.1- Gr1- TER119-
GSM399351 3- 4- 8- 25- 44+ 11b- 11c- 19- NK1.1- Gr1- TER119-
GSM399352 3- 4- 8- 25- 44+ 11b- 11c- 19- NK1.1- Gr1- TER119-
GSM399353 3- 4- 8- 25+ 44+ 11b- 11c- 19- NK1.1- Gr1- TER119-
GSM399354 3- 4- 8- 25+ 44+ 11b- 11c- 19- NK1.1- Gr1- TER119-
GSM399355 3- 4- 8- 25+ 44+ 11b- 11c- 19- NK1.1- Gr1- TER119-

I'd like to concatenate the rows of this data frame (this will be another question). But look:

> class(bob$phenotype)
[1] "factor"

Bob's columns are factors. So, for example:

> as.character(head(bob))
[1] "c(3, 3, 3, 6, 6, 6)"       "c(3, 3, 3, 3, 3, 3)"      
[3] "c(29, 29, 29, 30, 30, 30)"

I don't begin to understand this, but I guess these are indices into the levels of the factors of the columns (of the court of King Caractacus) of bob? Not what I need.

Strangely, I can go through the columns of bob by hand, and do

bob$phenotype <- as.character(bob$phenotype)

which works fine. And, after some typing, I can get a data.frame whose columns are characters rather than factors. So my question is: how can I do this automatically? How do I convert a data.frame with factor columns into a data.frame with character columns without having to go through each column manually?

Bonus question: why does the manual approach work?
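
On the bonus question, a small sketch may help. A factor stores integer codes plus a vector of level labels, and as.character applied to a single factor looks the labels up, which is why the column-by-column approach works. The factor f and its values below are made up for illustration and are not taken from bob:

```r
# Hypothetical toy factor standing in for one column of bob
f <- factor(c("neg", "neg", "pos"))

codes  <- as.integer(f)    # the internal integer codes
labels <- levels(f)        # the labels those codes index into
chars  <- as.character(f)  # the labels, looked up element by element

print(codes)
print(labels)
print(chars)
```

Calling as.character on the whole data frame instead treats it as a list of columns and deparses each column into one string, which is where the odd output in the question comes from.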

Recommended answer

Just following on from Matt and Dirk: if you want to recreate your existing data frame without changing the global stringsAsFactors option, you can recreate it with an lapply call:

bob <- data.frame(lapply(bob, as.character), stringsAsFactors=FALSE)

This will convert all variables to class "character". If you want to convert only the factors, see Marek's solution below.
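
One common way to convert only the factor columns is sketched below. This is not necessarily Marek's exact code; the toy data frame and the name is_fac are assumptions made for illustration. stringsAsFactors = TRUE is passed explicitly because R 4.0.0 and later no longer create factors by default.

```r
# Hypothetical data frame with one factor column and one integer column
bob <- data.frame(phenotype = c("a", "b"),
                  count = c(1L, 2L),
                  stringsAsFactors = TRUE)

# Logical vector marking which columns are factors
is_fac <- vapply(bob, is.factor, logical(1))

# Convert just those columns; the others are left untouched
bob[is_fac] <- lapply(bob[is_fac], as.character)

str(bob)
```

Using vapply rather than sapply guarantees a logical vector even for a data frame with zero columns.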

As @hadley points out, the following is more concise:

bob[] <- lapply(bob, as.character)

In both cases, lapply outputs a list. In the second case, however, assigning into bob[] replaces the contents of the existing columns rather than the object itself, so bob keeps its data.frame class and there is no need to convert the list back with data.frame and the argument stringsAsFactors = FALSE.
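
The difference between the two forms can be checked on a small example. The data frame below is a made-up stand-in for bob (the row names and values are hypothetical). It also shows a side effect worth knowing about: rebuilding with data.frame() resets the row names, while the bob[] assignment keeps them.

```r
# Hypothetical stand-in for bob, with explicit row names
bob <- data.frame(phenotype = c("x", "y"),
                  exclusion = c("p", "q"),
                  row.names = c("GSM1", "GSM2"),
                  stringsAsFactors = TRUE)

# lapply on its own returns a plain list, not a data frame
as_list <- lapply(bob, as.character)
print(class(as_list))

# First form: rebuild the data frame from the list (row names are reset)
rebuilt <- data.frame(lapply(bob, as.character), stringsAsFactors = FALSE)
print(rownames(rebuilt))

# Second form: assign into the existing columns (class and row names kept)
bob[] <- lapply(bob, as.character)
print(class(bob))
print(rownames(bob))
```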
