Use data.table set() to convert all columns from integer to numeric

Question

I am working with a data.table that has 1900 columns and roughly 280,000 rows.

Currently, the data is entirely "integer", but I want the columns to be explicitly "numeric" so I can pass the table to the bigcor() function later. Apparently, bigcor() can only handle "numeric" and not "integer".

I tried:

full.bind <- full.bind[,sapply(full.bind, as.numeric), with=FALSE]

Unfortunately, I get the error:

Error in `[.data.table`(full.bind, , sapply(full.bind, as.numeric), with = FALSE) : 
  j out of bounds

So, I tried using the data.table set() function, but I get the error:

Error in set(full.bind, value = as.numeric(full.bind)) : 
  (list) object cannot be coerced to type 'double'

I have created a simple reproducible example. Keep in mind that the actual columns are NOT "a", "b", or "c"; the real column names are extremely complicated, so referencing columns individually is not an option.

dt <- data.table(a=1:10, b=1:10, c=1:10)

So, my final questions are:

1) Why does my sapply technique not work? (What is the "j out of bounds" error?)
2) Why does the set() technique not work? (Why can't the data.table be coerced to numeric?)
3) Does the bigcor() function require a numeric object, or is there another problem?

Recommended answer

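sapply(full.bind, as.numeric) returns a numeric matrix of the values, and with=FALSE makes data.table treat those values as column positions, which is why you get "j out of bounds". Use .SD and assignment by reference instead: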

library(data.table)
dt <- data.table(a=1:10, b=1:10, c=1:10)
sapply(dt, class)
#        a         b         c 
#"integer" "integer" "integer"

dt[, names(dt) := lapply(.SD, as.numeric)]
sapply(dt, class)
#        a         b         c 
#"numeric" "numeric" "numeric"
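If a table mixes integer columns with other types, the same assignment can probably be limited to the integer columns with .SDcols, without typing any column name by hand. This is only a sketch: the table dt2 and its id column are made up for illustration and are not part of the original example.

```r
library(data.table)

# Hypothetical table: two integer columns plus a character column to leave alone.
dt2 <- data.table(id = letters[1:5], a = 1:5, b = 6:10)

# Collect the names of the integer columns programmatically.
int_cols <- names(dt2)[vapply(dt2, is.integer, logical(1))]

# Replace only those columns by reference. Wrapping int_cols in parentheses
# tells := to use the contents of the vector as the target column names.
dt2[, (int_cols) := lapply(.SD, as.numeric), .SDcols = int_cols]

sapply(dt2, class)
```

The id column keeps its type because it is never passed to as.numeric().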

set() handles one column per call here (note that the documentation doesn't say that j is optional), because each replacement column has to be generated separately. Your call failed because as.numeric() cannot coerce a whole data.table, which is a list of columns. To use set(), loop over the columns, e.g. with a for loop. That might be preferable because it needs less memory: the additional memory corresponds to one column, whereas the first approach needs additional memory for the whole data.table.

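# dt is already numeric after the step above, so as.character() is used here
# to make the change visible; use as.numeric() for the original task.
for (k in seq_along(dt)) set(dt, j = k, value = as.character(dt[[k]]))
sapply(dt, class)
#         a           b           c 
#"character" "character" "character"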
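For the conversion actually asked about, the loop might look like the sketch below. It starts from a fresh integer table (named dt3 here purely for illustration) and skips any column that is not integer.

```r
library(data.table)

# Fresh copy of the example data, all columns integer.
dt3 <- data.table(a = 1:10, b = 1:10, c = 1:10)

# Convert one column at a time by reference, addressing columns by position
# so no column name has to be written out.
for (k in seq_along(dt3)) {
  if (is.integer(dt3[[k]])) {
    set(dt3, j = k, value = as.numeric(dt3[[k]]))
  }
}

sapply(dt3, class)
```

Because j is a column position, this works regardless of how complicated the column names are.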

However, bigcor (from package propagate) requires a matrix as input, and a data.table isn't a matrix. So your problem is not the column type; you need to pass as.matrix(dt) instead.
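A minimal sketch of the conversion follows, using the example table rather than the real data. bigcor() itself is not called, since its arguments are not shown in this thread. The storage.mode() step is an assumption: it is only needed if the receiving function turns out to require doubles rather than integers.

```r
library(data.table)

dt4 <- data.table(a = 1:10, b = 1:10, c = 1:10)

# Convert the data.table to a plain matrix; column names are kept.
m <- as.matrix(dt4)
is.matrix(m)
typeof(m)

# Optional: force double storage without converting the columns beforehand.
storage.mode(m) <- "double"
typeof(m)
```

The resulting m is what would be handed to the matrix-based function in place of the data.table.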
