Dynamic column names in data.table
Question
I am trying to add columns to my data.table, where the names are dynamic. In addition, I need to use the by argument when adding these columns. For example:
test_dtb <- data.table(a = sample(1:100, 100), b = sample(1:100, 100), id = rep(1:10,10))
cn <- parse(text = "blah")
test_dtb[ , eval(cn) := mean(a), by = id]
# Error in `[.data.table`(test_dtb, , `:=`(eval(cn), mean(a)), by = id) :
# LHS of := must be a single column name when with=TRUE. When with=FALSE the LHS may be a vector of column names or positions.
Another attempt:
cn <- "blah"
test_dtb[ , cn := mean(a), by = id, with = FALSE]
# Error in `[.data.table`(test_dtb, , `:=`(cn, mean(a)), by = id, with = FALSE) : 'with' must be TRUE when 'by' or 'keyby' is provided
Update from Matthew:
This now works in v1.8.3 on R-Forge. Thanks for highlighting!
See this similar question for new examples:
Recommended answer
From data.table 1.9.4, you can just do this:
## A parenthesized symbol, `(cn)`, gets evaluated to "blah" before `:=` is carried out
test_dtb[, (cn) := mean(a), by = id]
head(test_dtb, 4)
# a b id blah
# 1: 41 19 1 54.2
# 2: 4 99 2 50.0
# 3: 49 85 3 46.7
# 4: 61 4 4 57.1
See ?`:=`:
DT[i, (colvector) := val]
[...] NOW PREFERRED [...] syntax. The parens are enough to stop the LHS being a symbol; same as c(colvector)
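Because the parenthesized left-hand side is just a character vector, the same syntax should extend to several dynamically named columns at once. The following is a minimal sketch under that assumption; the data and the names dt, src_cols and new_cols are hypothetical and do not come from the original question.

```r
library(data.table)

# Hypothetical example data (not from the original question)
dt <- data.table(a = 1:6, b = c(10, 20, 30, 40, 50, 60), id = rep(1:2, 3))

src_cols <- c("a", "b")                # columns to summarise
new_cols <- paste0("mean_", src_cols)  # target names built at run time

# (new_cols) is evaluated to c("mean_a", "mean_b") before `:=` is carried out;
# .SDcols restricts .SD to the source columns, and lapply returns one value per column
dt[, (new_cols) := lapply(.SD, mean), by = id, .SDcols = src_cols]

names(dt)
```

Each new column holds the per-id mean of its source column, repeated on every row of that group.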
Original answer:
You were on exactly the right track: constructing an expression to be evaluated within the call to [.data.table is the data.table way to do this sort of thing. Going just a bit further, why not construct an expression that evaluates to the entire j argument (rather than just its left-hand side)?
Something like this should do the trick:
## Your code so far
library(data.table)
test_dtb <- data.table(a=sample(1:100, 100),b=sample(1:100, 100),id=rep(1:10,10))
cn <- "blah"
## One solution
expr <- parse(text = paste0(cn, ":=mean(a)"))
test_dtb[,eval(expr), by=id]
## Checking the result
head(test_dtb, 4)
# a b id blah
# 1: 30 26 1 38.4
# 2: 83 82 2 47.4
# 3: 47 66 3 39.5
# 4: 87 23 4 65.2
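The same idea can be expressed without pasting and parsing text, by building the call as a language object with base R's bquote(). This is only a sketch of an alternative, not part of the original answer; the small table dt is hypothetical. One possible advantage is that as.name() also copes with column names that are not syntactically valid R identifiers.

```r
library(data.table)

# Hypothetical example data (not from the original question)
dt <- data.table(a = c(1, 3, 5, 7), id = c(1, 1, 2, 2))
cn <- "blah"

# Build the call `blah := mean(a)` directly; .(…) splices in the column name as a symbol
expr <- bquote(.(as.name(cn)) := mean(a))

# As with the parse() version, the whole j argument is supplied via eval()
dt[, eval(expr), by = id]

dt
```

The variable holding the expression (here expr) should not share its name with a column of the table, otherwise it may be looked up as a column rather than in the calling scope.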