Randomize observations by groups (blocks)


Question

I have a data frame with I observations, and each observation belongs to one of g categories.

set.seed(9782)
I <- 500   # number of observations
g <- 10    # number of groups (blocks)
library(dplyr)

# Generate n random alphanumeric ids, each `length` characters long
anon_id <- function(n = 1, length = 12) {
  randomString <- character(n)
  for (i in seq_len(n)) {
    randomString[i] <- paste(sample(c(0:9, letters, LETTERS),
                                    length, replace = TRUE),
                             collapse = "")
  }
  return(randomString)
}

df <- data.frame(id = anon_id(n = I, length = 16),
                 group = sample(1:g, I, replace = TRUE))

I want to randomly assign each observation to one of J "urns", given some vector of probabilities p. That is, the probability of being assigned to urn 1 is p[1], to urn 2 is p[2], and so on. The added complexity is that I want to do this block by block.

If I ignore the blocks, I can do this easily:

J <- 3
p <- c(0.25, 0.5, 0.25)
df1 <- df %>% mutate(urn = sample(x = 1:J, size = I, replace = TRUE, prob = p))
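
One way to sanity-check this kind of unblocked assignment is to compare the empirical urn shares with p. The following is only a sketch: it redraws the urns directly rather than reusing df, and the name shares is introduced here purely for illustration.

```r
# Sketch: compare empirical urn shares with the target probabilities.
set.seed(9782)
I <- 500                      # number of observations, as in the question
J <- 3                        # number of urns
p <- c(0.25, 0.5, 0.25)       # target assignment probabilities

# One independent draw per observation, ignoring blocks
urn <- sample(seq_len(J), size = I, replace = TRUE, prob = p)

# factor() with explicit levels keeps urns with zero draws in the table
shares <- prop.table(table(factor(urn, levels = seq_len(J))))

print(round(shares, 3))       # empirical shares
print(p)                      # target probabilities
```

With I = 500 the shares should land near p but will not match it exactly, since each observation is drawn independently.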

I thought about this method to do it block by block:

# Block randomization: draw urns for the rows of a single block
randomize_block <- function(block) {
  df_block <- df %>% filter(group == block)
  df_block %>% mutate(urn = sample(x = 1:J,
                                   size = nrow(df_block),
                                   replace = TRUE,
                                   prob = p))
}

# Randomize each block separately, then stack the results
df2 <- lapply(1:g, randomize_block)
df2 <- data.table::rbindlist(df2)

Is there a better way?

Recommended answer

Not sure if this is better, but here is a base R technique. It assumes a data.frame df with a grouping column named "group", urns numbered 1:J, and assignment probabilities in a vector p of length J.

# get urn assignment: one vector of draws per group
urnAssignment  <- lapply(unique(df$group),
                    function(i) sample(1:J, nrow(df[df$group == i, ]), replace = TRUE, prob = p))

# get a list that collects position of observations
obsOrder  <- lapply(unique(df$group),
                    function(i) which(df$group == i))

# write each group's draws back to the rows of that same group
df$urnAssignment <- NA_integer_
df$urnAssignment[unlist(obsOrder)] <- unlist(urnAssignment)
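
Since the question already loads dplyr, a grouped mutate may be a more compact way to get the same block-by-block draw. This is a minimal sketch, not part of the original answer: the data frame is rebuilt with simplified ids (row numbers) so the block runs on its own, and df_blocked is a name made up for this example.

```r
library(dplyr)

set.seed(9782)
I <- 500                      # number of observations
g <- 10                       # number of groups (blocks)
J <- 3                        # number of urns
p <- c(0.25, 0.5, 0.25)       # assignment probabilities

# Toy data standing in for the question's df (ids simplified to row numbers)
df <- data.frame(id = seq_len(I),
                 group = sample(seq_len(g), I, replace = TRUE))

# One sample() call per block; n() is the number of rows in the current group
df_blocked <- df %>%
  group_by(group) %>%
  mutate(urn = sample(seq_len(J), size = n(), replace = TRUE, prob = p)) %>%
  ungroup()

# Cross-tabulate blocks against urns to inspect the allocation per block
print(table(df_blocked$group, df_blocked$urn))
```

The grouped mutate keeps the original row order, so no separate bookkeeping of row positions is needed.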
