R Datatable, apply a function to a subset of columns
Problem description
I'm trying to apply a function to a group of columns in a large data.table without referring to each one individually.
library(data.table)
a <- data.table(
  a = as.character(rnorm(5)),
  b = as.character(rnorm(5)),
  c = as.character(rnorm(5)),
  d = as.character(rnorm(5))
)
b <- c('a','b','c','d')
With the MWE above, this:
a[,b=as.numeric(b),with=F]
works, but this:
a[,b[2:3]:=data.table(as.numeric(b[2:3])),with=F]
doesn't work. What is the correct way to apply the as.numeric function to just columns 2 and 3 of a without referring to them individually? (In the actual data set there are tens of columns, so naming each one would be impractical.)
Thanks
Solution
The idiomatic approach is to use .SD and .SDcols.
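You can force the LHS of := to be evaluated in the parent frame by wrapping it in (), so that the value of b (a vector of column names) is used rather than a column literally named b: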
a[, (b) := lapply(.SD, as.numeric), .SDcols = b]
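The effect of the parentheses can be seen in a small sketch. The table dt and the vector cols are hypothetical names made up for this illustration, not part of the question's data; copy() is used only so that each call starts from the same unmodified table.

```r
library(data.table)

# Hypothetical toy data, used only for this illustration
dt <- data.table(x = c("1", "2"), y = c("3", "4"))
cols <- c("x", "y")

# Without parentheses, the symbol on the left of := is taken literally,
# so this adds a new column that is itself named "cols".
dt1 <- copy(dt)[, cols := "new"]
names(dt1)

# With parentheses, cols is evaluated first, and its value
# ("x" and "y") names the columns that receive the converted data.
dt2 <- copy(dt)[, (cols) := lapply(.SD, as.numeric), .SDcols = cols]
sapply(dt2, class)
```

.SDcols restricts .SD to the listed columns, so lapply returns one converted vector per column named on the left.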
For columns 2 and 3, selected by position:
a[, 2:3 := lapply(.SD, as.numeric), .SDcols = 2:3]
or
mysubset <- 2:3
a[, (mysubset) := lapply(.SD, as.numeric), .SDcols = mysubset]
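As a rough end-to-end check, the question's table can be rebuilt and the column classes inspected after the update. The set.seed call is an assumption added here only for reproducibility, and the sapply(a, class) check is not part of the original answer.

```r
library(data.table)

set.seed(1)  # arbitrary seed, added only so the run is reproducible

# Rebuild the question's example: four character columns
a <- data.table(
  a = as.character(rnorm(5)),
  b = as.character(rnorm(5)),
  c = as.character(rnorm(5)),
  d = as.character(rnorm(5))
)

# Convert columns 2 and 3 in place, selecting them by position
mysubset <- 2:3
a[, (mysubset) := lapply(.SD, as.numeric), .SDcols = mysubset]

# Columns b and c should now be numeric; a and d remain character
sapply(a, class)
```

Because := modifies the table by reference, no reassignment of a is needed after the call.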