R bootstrap statistics by group for big data
Problem description
I want to bootstrap a data set that has groups in it. A simple scenario would be bootstrapping simple means:
library(data.table)
library(boot)

data <- as.data.table(list(x1 = runif(200), x2 = runif(200), group = runif(200) > 0.5))
stat <- function(x, i) {
  x[i, c(m1 = mean(x1), m2 = mean(x2)), by = "group"]
}
boot(data, stat, R = 10)
This gives me the error "incorrect number of subscripts on matrix" because of the by = "group" part. I managed to solve it using subsetting, but I don't like this solution. Is there a simpler way to make this kind of task work?
In particular, I'd like to introduce an additional argument in the statistic function, like stat(x, i, groupvar), and pass it to the boot function, like boot(data, stat(groupvar = group), R = 100).
Solution
This should do it:
data[, list(list(boot(.SD, stat, R = 10))), by = group]$V1
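For the line above to run, stat most likely has to be defined without the by = "group" part, because .SD does not contain the grouping column by default and boot() needs the statistic to return a plain numeric vector. The following is a minimal sketch under that assumption, not a definitive implementation. The second half addresses the extra-argument part of the question: boot() forwards additional named arguments to the statistic through ..., and its strata argument resamples within each group. The name stat_by and the fixed seed are hypothetical, introduced only for illustration.

```r
library(data.table)
library(boot)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
data <- data.table(x1 = runif(200), x2 = runif(200), group = runif(200) > 0.5)

# Statistic without `by`: returns a named numeric vector, as boot() expects.
stat <- function(x, i) {
  x[i, c(m1 = mean(x1), m2 = mean(x2))]
}

# One boot object per group; .SD holds the rows of the current group only.
res <- data[, list(list(boot(.SD, stat, R = 10))), by = group]$V1

# Hypothetical alternative: pass the grouping column name as an extra argument.
# The per-group means are flattened into a single numeric vector.
stat_by <- function(x, i, groupvar) {
  d <- x[i]
  m <- d[, .(m1 = mean(x1), m2 = mean(x2)), keyby = groupvar]
  unlist(m[, .(m1, m2)])
}

# `groupvar` is forwarded to stat_by via `...`; `strata` makes boot() resample
# within each group, so every group is present in every replicate.
b2 <- boot(data, stat_by, R = 10, strata = factor(data$group), groupvar = "group")
```

Here res is a list with one boot object per group, while b2 is a single boot object whose t0 holds the means for all groups. Which form is more convenient depends on how the results are used afterwards.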