Stratified sampling with group size below sample size in R

Views: 55 Published: 2021/7/14 20:02:18 Tags: r, sampling

Problem description

I have response data by market in the following format:

head(df)
    ID  market  q1  q2
    470 France  1   3
    625 Germany 0   2
    155 Italy   1   6
    648 Spain   0   5
    862 France  1   7
    699 Germany 0   8
    460 Italy   1   6
    333 Spain   1   5
    776 Spain   1   4

with the following frequencies:

 table(df$market)
    France  140
    Germany 300
    Italy   50
    Spain   75

I need to create a data frame containing a sample of 100 responses per market, drawn without replacement. For any market with fewer than 100 responses, all of its responses should be kept.

So the result should look like this:

table(df_new$market)
        France  100
        Germany 100
        Italy   50
        Spain   75

Thanks in advance!

Recommended answer

The following seems to work. It uses a toy data set and a cap of 5 rows per group in place of 100:

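set.seed(10)
# Toy data: 25 rows in four groups of unequal size
DF = data.frame(c1 = sample(LETTERS[1:4], 25, TRUE), c2 = runif(25))
# Rows per group, and the number to draw from each group (capped at 5)
freqs = as.data.frame(table(DF$c1))
freqs$ss = ifelse(freqs$Freq >= 5, 5, freqs$Freq)
#> freqs
#  Var1 Freq ss
#1    A    4  4
#2    B   11  5
#3    C    7  5
#4    D    3  3
# Draw ss rows without replacement from each group, then stack the pieces.
# Caveat: sample() treats a single number n as 1:n, so one-row groups need care.
res = mapply(function(x, y) DF[sample(which(DF$c1 %in% x), y), ], 
             x = freqs$Var1, y = freqs$ss, SIMPLIFY = FALSE)
do.call(rbind, res)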
#   c1        c2
#5   A 0.3558977
#17  A 0.2289039
#6   A 0.5355970
#13  A 0.9546536
#3   B 0.2395891
#25  B 0.8015470
#10  B 0.4226376
#15  B 0.5005032
#19  B 0.7289646
#11  C 0.7477465
#9   C 0.8998325
#12  C 0.8226526
#1   C 0.7066469
#4   C 0.7707715
#23  D 0.4861003
#20  D 0.2498805
#21  D 0.1611833
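
As a rough sketch of how the same idea might be applied to the market data from the question, the block below caps each group at 100 rows. The helper name `sample_per_group` and the simulated `df` are assumptions made for illustration only; the real data would take the place of the simulated frame. It indexes with `sample.int()` rather than calling `sample()` on the row numbers directly, which sidesteps the one-row-group caveat.

```r
# Hypothetical helper: draw up to `size` rows per group, without replacement.
sample_per_group <- function(data, group_col, size) {
  # Row indices of `data`, split by the values of the grouping column
  idx_by_group <- split(seq_len(nrow(data)), data[[group_col]])
  picked <- lapply(idx_by_group, function(idx) {
    n <- min(size, length(idx))  # small groups contribute all their rows
    # Pick positions within idx, so a single row number is never read as 1:n
    idx[sample.int(length(idx), n)]
  })
  data[unlist(picked, use.names = FALSE), , drop = FALSE]
}

# Simulated stand-in with the group sizes from the question (values are made up)
set.seed(1)
sizes <- c(France = 140, Germany = 300, Italy = 50, Spain = 75)
df <- data.frame(
  ID     = seq_len(sum(sizes)),
  market = rep(names(sizes), times = sizes),
  q1     = rbinom(sum(sizes), 1, 0.5),
  q2     = sample(1:8, sum(sizes), replace = TRUE)
)

df_new <- sample_per_group(df, "market", 100)
table(df_new$market)
```

With these group sizes the table should show 100 rows each for France and Germany, 50 for Italy and 75 for Spain, whichever rows happen to be drawn.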
