Dynamic column names in data.table, R
Problem description
I am trying to add columns to my data.table, where the names are dynamic. In addition, I need to use the by argument when adding these columns. For example:

    test_dtb <- data.table(a=sample(1:100, 100), b=sample(1:100, 100), id=rep(1:10,10))
    cn <- parse(text="blah")
    test_dtb[,eval(cn):=mean(a), by=id]

    Error in `[.data.table`(test_dtb, , `:=`(eval(cn), mean(a)), by = id) :
      LHS of := must be a single column name when with=TRUE. When with=FALSE the LHS may be a vector of column names or positions.
Another attempt:

    cn <- "blah"
    test_dtb[,cn:=mean(a), by=id, with=FALSE]

    Error in `[.data.table`(test_dtb, , `:=`(cn, mean(a)), by = id, with = FALSE) :
      'with' must be TRUE when 'by' or 'keyby' is provided
Update from Matthew:
This now works in v1.8.3 on R-Forge. Thanks for highlighting!
See this similar question for new examples: Assign multiple columns using data.table, by group
Solution
Update of 2016-11-29:
Nowadays, you can just do this:

    ## `(cn)` (or `eval(cn)`) gets evaluated to "blah" before `:=` is carried out
    test_dtb[, (cn):=mean(a), by=id]

    head(test_dtb, 4)
    #     a  b id blah
    # 1: 41 19  1 54.2
    # 2:  4 99  2 50.0
    # 3: 49 85  3 46.7
    # 4: 61  4  4 57.1
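The same parenthesised form also accepts a character vector, so several dynamically named columns can be added in one grouped call. The following is only a minimal sketch on a small deterministic table; the table `dt` and the names in `cols` are made up for illustration and are not part of the original question.

```r
library(data.table)

# Small deterministic table so the group means are easy to check by hand
dt <- data.table(a = 1:6, id = rep(1:2, 3))

# One dynamic name: the parentheses make data.table use the value of cn
cn <- "blah"
dt[, (cn) := mean(a), by = id]

# Several dynamic names: character vector on the left, list on the right
cols <- c("a_mean", "a_max")  # hypothetical names, for illustration only
dt[, (cols) := list(mean(a), max(a)), by = id]

print(dt)
```

Without the parentheses, `cn := mean(a)` would create a column literally called `cn`, which is why the wrapping matters here.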
Original answer:
You were on exactly the right track: constructing an expression to be evaluated within the call to [.data.table is the data.table way to do this sort of thing. Going just a bit further, why not construct an expression that evaluates to the entire j argument (rather than just its left hand side)?

Something like this should do the trick:

    ## Your code so far
    library(data.table)
    test_dtb <- data.table(a=sample(1:100, 100), b=sample(1:100, 100), id=rep(1:10,10))
    cn <- "blah"

    ## One solution
    expr <- parse(text = paste0(cn, ":=mean(a)"))
    test_dtb[,eval(expr), by=id]

    ## Checking the result
    head(test_dtb, 4)
    #     a  b id blah
    # 1: 30 26  1 38.4
    # 2: 83 82  2 47.4
    # 3: 47 66  3 39.5
    # 4: 87 23  4 65.2
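As a variation on the same idea, the whole j expression can be assembled from language objects with base R's `bquote()` and `as.name()` instead of pasting and parsing a string. This is a sketch rather than the answer's own method; it assumes, as in the solution above, that `[.data.table` evaluates `eval(expr)` in j before handling `:=`, and the small table `dt` is made up for illustration.

```r
library(data.table)

# Small deterministic table: group 1 holds a = 1, 3, 5; group 2 holds a = 2, 4, 6
dt <- data.table(a = 1:6, id = rep(1:2, 3))
cn <- "blah"

# Build the unevaluated call `blah := mean(a)`;
# .(as.name(cn)) splices the column name in as a symbol
expr <- bquote(.(as.name(cn)) := mean(a))

# Evaluate the complete j expression within each group
dt[, eval(expr), by = id]

print(dt)
```

Building the call from symbols avoids string pasting, which can help when a column name is not syntactically valid R (for example, one containing spaces).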