Randomize observations by groups (blocks)
Problem description
I have a data frame with I observations, and each observation belongs to one of g categories.
set.seed(9782)
I <- 500
g <- 10
library(dplyr)
# Generate n random alphanumeric IDs, each `length` characters long
anon_id <- function(n = 1, length = 12) {
  randomString <- character(n)
  for (i in seq_len(n))
  {
    randomString[i] <- paste(sample(c(0:9, letters, LETTERS),
                                    length, replace = TRUE),
                             collapse = "")
  }
  return(randomString)
}
df <- data.frame(id = anon_id(n = I, length = 16),
                 group = sample(1:g, I, replace = TRUE))
I want to randomly assign each observation to one of J "urns", given a vector of probabilities p. That is, the probability of being assigned to urn 1 is p[1], and so on. The added complexity is that I want to do this block by block.
If I ignore the blocks, I can do this easily:
J <- 3
p <- c(0.25, 0.5, 0.25)
df1 <- df %>% mutate(urn = sample(x = 1:J, size = I, replace = TRUE, prob = p))
I came up with this method to do it block by block:
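# Block randomization
# Note: the argument g is a single block label and shadows the global g inside the function
randomize_block <- function(g) {
  df1 <- df %>% filter(group == g)
  size <- nrow(df1)
  df1 <- df1 %>% mutate(urn = sample(x = 1:J,
                                     size = size,
                                     replace = TRUE,
                                     prob = p))
  return(df1)
}
# Randomize each block separately, then stack the results
df2 <- lapply(1:g, randomize_block)
df2 <- data.table::rbindlist(df2)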
Is there a better way?
Recommended answer
Not sure if this is better, but here is a base R technique. It assumes the data.frame df has a group column named "group", urns numbered 1:J, and assignment probabilities in a vector p of length J.
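# get urn assignment, one vector of draws per group
# (df$group is needed here; a bare `group` is not visible outside the data frame)
urnAssignment <- lapply(unique(df$group),
                        function(i) sample(1:J, sum(df$group == i), replace = TRUE, prob = p))
# get a list that collects the row positions of each group's observations
obsOrder <- lapply(unique(df$group),
                   function(i) which(df$group == i))
# write each group's draws back to that group's rows
df$urnAssignment <- NA_integer_
df$urnAssignment[unlist(obsOrder)] <- unlist(urnAssignment)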
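As a possible shortcut, the same per-block draw can be sketched in base R with ave(), which splits a vector by group, applies a function to each piece, and returns the results in the original row order. This is only a sketch under assumptions: the toy data below uses simple placeholder ids rather than the question's anon_id(), and the column name urn is chosen for illustration.

```r
set.seed(9782)

# Toy data standing in for the question's df (placeholder ids, an assumption)
I <- 500
g <- 10
df <- data.frame(id = sprintf("id%03d", seq_len(I)),
                 group = sample(1:g, I, replace = TRUE))

J <- 3
p <- c(0.25, 0.5, 0.25)

# ave() splits the row indices by group, runs the function on each block,
# and puts the draws back in the original row order.
df$urn <- ave(seq_len(nrow(df)), df$group,
              FUN = function(idx) sample(J, length(idx), replace = TRUE, prob = p))

# Urn counts within each block
table(df$group, df$urn)
```

With dplyr already loaded, the same idea should be expressible as group_by(group) followed by mutate(urn = sample(J, n(), replace = TRUE, prob = p)) and ungroup(). Note that because every draw is independent and uses the same p, sampling block by block gives the same distribution as sampling all rows at once; blocking only changes the outcome when the draw inside a block depends on the block, for example sampling without replacement or using block-specific probabilities.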