Draw a random sample without replacement based on a strict range in R

Question

I'm trying to draw a random sample of rows without replacement from a dataset such that the sum of one column in the sample falls strictly within a given range. For the example dataset mtcars, the sum of mpg in the sample should be strictly between 90 and 100.

Reproducible example:

library(dplyr)  # provides %>%, sample_n(), summarise(), pull()
data("mtcars")

random_sample <- function(dataset){
  final_mpg = 0
  while (final_mpg < 100) {
    basic_dat <- dataset %>%
      sample_n(1) %>%
      ungroup()
    total_mpg <- basic_dat %>%
      summarise(mpg = sum(mpg)) %>%
      pull(mpg)
    final_mpg <- final_mpg + total_mpg
    if (final_mpg > 90 & final_mpg < 100){
      break()
    }
    final_dat <- rbind(get0("final_dat"), get0("basic_dat"))
  }
  return(final_dat)
}

chosen_sample <- random_sample(mtcars)

But this function outputs samples with sum(mpg) > 100. How do I ensure that every sample it generates is strictly within that range? Any help is much appreciated.

Recommended answer

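This works. Rows are added one at a time without replacement, and a row that would push the total past 100 is discarded, so the sum does not go above 100. The lower bound is not checked explicitly: whether the total ends up above 90 depends on the mpg values that happen to be drawn.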

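ransmpl <- function(df) {
  # Start with one randomly chosen row.
  s1 <- df[sample(rownames(df), 1), ]
  s11 <- sum(s1$mpg)
  while (s11 < 100) {
    # Candidate rows are those not yet in the sample (no replacement).
    rn2 <- rownames(df[!(rownames(df) %in% rownames(s1)), ])
    nr <- df[sample(rn2, 1), ]
    s11 <- sum(rbind(s1, nr)$mpg)
    # Stop without adding the row if it would push the total past 100.
    if (s11 > 100) {
      break
    }
    s1 <- rbind(s1, nr)
  }
  return(s1)
}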


chosen_sample <- ransmpl(mtcars)
chosen_sample

Output

> chosen_sample
                   mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Merc 280C         17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
Hornet Sportabout 18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
Merc 230          22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
Chrysler Imperial 14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4

> sum(chosen_sample$mpg)
[1] 95.1
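
If the lower bound has to hold on every call, one option is to reject and redraw any attempt whose running total skips over the range. The following is a minimal sketch of that idea, not part of the original answer; the names sample_in_range, col, lower, upper, and max_tries are illustrative assumptions. It shuffles the rows, accumulates the column, and keeps the first prefix whose total lies strictly between the bounds.

```r
# Sketch (assumed helper, not from the original answer): redraw until the
# column total lies strictly inside (lower, upper).
sample_in_range <- function(df, col = "mpg", lower = 90, upper = 100,
                            max_tries = 1000) {
  for (attempt in seq_len(max_tries)) {
    # A random permutation of row indices gives draws without replacement.
    idx <- sample(nrow(df))
    totals <- cumsum(df[[col]][idx])
    # Prefixes of the permutation whose running total is strictly in range.
    hits <- which(totals > lower & totals < upper)
    if (length(hits) > 0) {
      # Keep the shortest qualifying prefix of this permutation.
      return(df[idx[seq_len(hits[1])], , drop = FALSE])
    }
    # Otherwise the running total jumped over the range; try a new shuffle.
  }
  stop("no sample found within max_tries attempts")
}

chosen_sample <- sample_in_range(mtcars)
sum(chosen_sample$mpg)
```

Because out-of-range attempts are thrown away, every returned sample satisfies both bounds, but the valid subsets are not necessarily drawn with equal probability. The max_tries cap is there only so the function fails loudly on data where no prefix can land in the range.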
