R bootstrap statistics by group for big data

Problem description

I want to bootstrap a data set that has groups in it. A simple scenario would be bootstrapping simple means:

library(data.table)
library(boot)

data <- as.data.table(list(x1 = runif(200), x2 = runif(200), group = runif(200) > 0.5))
stat <- function(x, i) {x[i, c(m1 = mean(x1), m2 = mean(x2)), by = "group"]}
boot(data, stat, R = 10)

This gives me the error "incorrect number of subscripts on matrix" because of the by = "group" part. I managed to work around it by subsetting, but I don't like that solution. Is there a simpler way to make this kind of task work?

In particular, I'd like to introduce an additional argument in the statistic function, like stat(x, i, groupvar), and pass it to the boot function, like boot(data, stat(groupvar = group), R = 100).

Solution

This should do it, with stat defined without the by = "group" part (.SD does not contain the grouping column):

data[, list(list(boot(.SD, stat, R = 10))), by = group]$V1
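For reference, here is a minimal runnable sketch of the one-liner above, followed by one possible way to handle the extra-argument idea from the question. Several things here are assumptions rather than part of the original answer: the fixed seed, the copy(.SD) call (a defensive choice so each boot object keeps its own data), the hypothetical helper stat_by, and the use of strata so that every resample keeps both groups. The second part relies on boot() forwarding additional named arguments to the statistic function.

```r
library(data.table)
library(boot)

set.seed(1)  # assumption: fixed seed, only to make the sketch reproducible
data <- data.table(x1 = runif(200), x2 = runif(200), group = runif(200) > 0.5)

# Statistic without `by`: boot() supplies the data and a vector of resampled row indices.
stat <- function(x, i) {
  x[i, c(m1 = mean(x1), m2 = mean(x2))]
}

# One boot object per group; .SD holds the rows of the current group.
fits <- data[, list(list(boot(copy(.SD), stat, R = 10))), by = group]$V1
fits[[1]]$t0      # observed means for the first group
dim(fits[[1]]$t)  # R rows, one column per statistic

# Hypothetical alternative: a single boot() call with the grouping column
# passed as an extra named argument (forwarded to the statistic via `...`).
stat_by <- function(x, i, groupvar) {
  # keyby sorts the groups so the output order is the same in every resample
  means <- x[i, lapply(.SD, mean), keyby = groupvar, .SDcols = c("x1", "x2")]
  # boot() needs a plain numeric vector, so flatten the per-group means
  c(m1 = means$x1, m2 = means$x2)
}

# strata keeps the number of rows per group fixed in each resample
b <- boot(data, stat_by, R = 10, strata = factor(data$group), groupvar = "group")
dim(b$t)  # R rows, one column per group/statistic combination
```

The per-group version resamples within each group independently, while the single-call version resamples all groups together in each replicate, so the two are not interchangeable if the joint distribution across groups matters.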
