Fastest way to convert a list of character vectors to numeric in R
Question
In R, what is the fastest way to convert a list containing sequences of numbers stored as text (as character vectors) into numeric?
With the following dummy data:
set.seed(2)
N = 1e7
ncol = 10
myT = formatC(matrix(runif(N), ncol = ncol)) # A matrix converted to characters
# Each row is collapsed into a single suite of characters:
myT = apply(myT, 1, function(x) paste(x, collapse=' ') )
head(myT)
which produces:
[1] "0.1849 0.855 0.8272 0.5403 0.3891 0.5184 0.7776 0.5533 0.1566 0.01591"
[2] "0.7024 0.1008 0.9442 0.8582 0.3184 0.9289 0.9957 0.1311 0.2131 0.07355"
[3] "0.5733 0.5493 0.3915 0.4423 0.8522 0.6042 0.9265 0.006878 0.7052 0.71"
[... etc ...]
I can do:
library(stringi)
# In the actual dataset, the number of spaces between numbers may vary, hence "\\s+"
system.time(newT <- lapply(stri_split_regex(myT, "\\s+", omit_empty = TRUE), as.numeric))
newT <- unlist(newT) # Final goal is to have a single vector of numbers
On my Intel Core i7 2.10GHz, 64-bit system with 16GB of RAM (under Ubuntu), this takes:
   user  system elapsed
  3.748   0.008   3.757
With the real dataset (ncol=150 and N~1e9), this is way too long.
Any better option?
Recommended answer
This is twice as fast on my system:
x <- paste(myT, collapse = "\n")
library(data.table)
DT <- fread(x)
newT2 <- c(t(DT))
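The transpose in `c(t(DT))` matters because flattening a table goes column by column, while the question wants the numbers in their original row-by-row order. The sketch below illustrates this on a tiny input. It is not part of the original answer: `small` is a made-up stand-in for `myT`, and base R's `read.table()` is used in place of `fread()` as an assumption, so the example runs without extra packages.

```r
# Hypothetical two-row stand-in for myT, with a varying number of spaces
small <- c("0.1849 0.855   0.8272", "0.7024  0.1008 0.9442")

# Reference result: split each string on runs of whitespace, then convert
ref <- as.numeric(unlist(strsplit(small, "\\s+")))

# Parse all rows in one pass as a whitespace-separated table
# (read.table() stands in for fread() here; this is an assumption)
df <- read.table(text = paste(small, collapse = "\n"))

# Transposing first makes the flattening follow the original row order
flat <- c(t(df))

# Without the transpose, values come out column by column instead
no_transpose <- unlist(df, use.names = FALSE)

stopifnot(isTRUE(all.equal(ref, flat)))
```

Running the same kind of comparison on a small slice of the real data is a cheap way to confirm that a faster parser still returns the values in the expected order.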