Convert column classes in data.table

Question

I have a problem using data.table: how do I convert column classes? Here is a simple example. With a data.frame I have no problem converting them, but with a data.table I don't know how:

df <- data.frame(ID=c(rep("A", 5), rep("B",5)), Quarter=c(1:5, 1:5), value=rnorm(10))
#One way: http://stackoverflow.com/questions/2851015/r-convert-data-frame-columns-from-factors-to-characters
df <- data.frame(lapply(df, as.character), stringsAsFactors=FALSE)
#Another way
df[, "value"] <- as.numeric(df[, "value"])

library(data.table)
dt <- data.table(ID=c(rep("A", 5), rep("B",5)), Quarter=c(1:5, 1:5), value=rnorm(10))
dt <- data.table(lapply(dt, as.character), stringsAsFactors=FALSE) 
#Error in rep("", ncol(xi)) : invalid 'times' argument
#Produces error, does data.table not have the option stringsAsFactors?
dt[, "ID", with=FALSE] <- as.character(dt[, "ID", with=FALSE]) 
#Produces error: Error in `[<-.data.table`(`*tmp*`, , "ID", with = FALSE, value = "c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2)") : 
#unused argument(s) (with = FALSE)

Am I missing something obvious here?

Update due to Matthew's post: I used an older version before, but even after updating to 1.6.6 (the version I use now) I still get an error.

Update 2: Let's say I want to convert every column of class "factor" to a "character" column, but don't know in advance which column is of which class. With a data.frame, I can do the following:

classes <- as.character(sapply(df, class))
colClasses <- which(classes=="factor")
df[, colClasses] <- sapply(df[, colClasses], as.character)

Can I do something similar with data.table?

Update 3:

sessionInfo()
R version 2.13.1 (2011-07-08)
Platform: x86_64-pc-mingw32/x64 (64-bit)

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.6.6

loaded via a namespace (and not attached):
[1] tools_2.13.1

Solution

For a single column:

dtnew <- dt[, Quarter:=as.character(Quarter)]
str(dtnew)

Classes ‘data.table’ and 'data.frame':  10 obs. of  3 variables:
 $ ID     : Factor w/ 2 levels "A","B": 1 1 1 1 1 2 2 2 2 2
 $ Quarter: chr  "1" "2" "3" "4" ...
 $ value  : num  -0.838 0.146 -1.059 -1.197 0.282 ...
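
Note that `:=` updates `dt` by reference, so `dtnew` above refers to the same table as `dt` rather than to a converted copy. A minimal sketch of that behaviour, assuming a reasonably recent data.table release (`copy()` and the name `dt_orig` are used here for illustration and are not part of the original answer):

```r
library(data.table)

# Small example table mirroring the one in the question.
dt <- data.table(ID = rep(c("A", "B"), each = 5),
                 Quarter = c(1:5, 1:5),
                 value = rnorm(10))

# Take an independent copy first if the original types are still needed.
dt_orig <- copy(dt)

# := converts the column in place; no assignment is required.
dt[, Quarter := as.character(Quarter)]

class(dt$Quarter)       # "character"
class(dt_orig$Quarter)  # "integer"
```

If the unconverted table is not needed afterwards, the `copy()` step can be dropped.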


Using lapply and as.character:

dtnew <- dt[, lapply(.SD, as.character), by=ID]
str(dtnew)

Classes ‘data.table’ and 'data.frame':  10 obs. of  3 variables:
 $ ID     : Factor w/ 2 levels "A","B": 1 1 1 1 1 2 2 2 2 2
 $ Quarter: chr  "1" "2" "3" "4" ...
 $ value  : chr  "1.487145280568" "-0.827845218358881" "0.028977182770002" "1.35392750102305" ...
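
For Update 2 in the question (converting every factor column without knowing the columns in advance), one possible approach is sketched below. It assumes a data.table version much newer than the 1.6.6 used in the question, one that supports `.SDcols` and the `(cols) :=` form. The name `factor_cols` is chosen here for illustration only.

```r
library(data.table)

# ID is built as a factor explicitly, since recent data.table versions
# keep strings as character by default.
dt <- data.table(ID = factor(rep(c("A", "B"), each = 5)),
                 Quarter = c(1:5, 1:5),
                 value = rnorm(10))

# Find the factor columns without knowing them in advance.
factor_cols <- names(dt)[sapply(dt, is.factor)]

# Convert only those columns, in place; skip the update if there are none.
if (length(factor_cols) > 0) {
  dt[, (factor_cols) := lapply(.SD, as.character), .SDcols = factor_cols]
}

# Inspect the resulting column classes.
sapply(dt, class)
```

Unlike the `by=ID` version above, this leaves the non-factor columns (`Quarter`, `value`) with their original types and converts `ID` itself.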
