Sample using start and end values within a loop in R

Question

I am trying to sample between a range of values as part of a larger loop in R. As the loop progresses to each row j, I want to sample a number between the value given in the start column and the value given in the end column, placing that value in the sampled column for that row.

The result should look like this:

ID  start  end  sampled
a   25     67   44
b   36     97   67
c   23     85   77
d   15     67   52
e   21     52   41
f   43     72   66
g   39     55   49
h   27     62   35
i   11     99   17
j   21     89   66
k   28     65   48
l   44     58   48
m   16     77   22
n   25     88   65

I started with mapply, but as written it samples for every row of df, and the assignment then tries to fit all of those sampled values (one per row, 14 in the example data) into the single cell df[j,4].

df[j, 4] <- mapply(function(x, y) sample(seq(x, y), 1), df$start, df$end)

I thought maybe something using seq might work, but this results in errors saying that from must be of length 1.

df[j, 4] <- sample(seq(df$start, df$end), 1, replace = TRUE)
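
To make the two failures concrete, here is a minimal sketch. It assumes a small plain data.frame as a stand-in for the real data (the actual df is a tibble with 14 rows), and the names vals and err are only illustrative.

```r
# Small stand-in data frame (an assumption; not the asker's real data)
df <- data.frame(
  ID = c("a", "b", "c"),
  start = c(25, 36, 23),
  end = c(67, 97, 85),
  sampled = NA_real_
)

# Attempt 1: mapply walks over every start/end pair, so it returns
# one sampled value per row rather than a single value for row j.
vals <- mapply(function(x, y) sample(seq(x, y), 1), df$start, df$end)
print(length(vals) == nrow(df))

# Attempt 2: seq() expects scalar from/to, so passing whole columns errors.
# tryCatch() captures the error message instead of stopping the script.
err <- tryCatch(
  seq(df$start, df$end),
  error = function(e) conditionMessage(e)
)
print(err)
```

In both cases the underlying issue is the same: whole columns are being passed where a single row's start and end values are needed.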

The outer looping structure is pretty complicated so I haven't included it here, but the df[j,4] part of the code is necessary because it is part of a larger loop. There are situations where rows have to be resampled based on additional dependencies in the actual dataset. For example, the sampled value of a might need to be larger than b. The rest of the code updates the sampled column, checks for dependencies, and will rerun the sample if the dependencies aren't met. If I can get this sampling section to work, I should be able to plug it in without too much trouble (I hope).
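
Since the outer loop is not shown, the following is only a rough sketch of what a "resample until the dependency holds" step might look like. Everything here is an assumption for illustration: the two-row stand-in data, the helper dependency_met, the rule that a's value must exceed b's, and the max_tries cap.

```r
set.seed(42)  # only so the sketch is reproducible

# Two-row stand-in for the real data (an assumption)
df <- data.frame(
  ID = c("a", "b"),
  start = c(25, 36),
  end = c(67, 97),
  sampled = NA_real_
)

# Hypothetical dependency: the sampled value of "a" must exceed that of "b"
dependency_met <- function(d) {
  d$sampled[d$ID == "a"] > d$sampled[d$ID == "b"]
}

max_tries <- 1000  # guard against looping forever if the rule cannot be met
for (attempt in seq_len(max_tries)) {
  for (j in seq_len(nrow(df))) {
    lo <- df[j, "start"]
    hi <- df[j, "end"]
    # Draw one whole number between lo and hi (inclusive) for row j
    df[j, "sampled"] <- lo + sample.int(hi - lo + 1, 1) - 1
  }
  if (dependency_met(df)) break
}

print(df)
```

The cap on attempts is a design choice in this sketch, not something taken from the original code; without it, an unsatisfiable dependency would loop indefinitely.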

Here is an example dataset.

df <- structure(list(ID = c("a", "b", "c", "d", "e", "f", "g", "h", 
"i", "j", "k", "l", "m", "n"), start = c(25, 36, 23, 15, 21, 
43, 39, 27, 11, 21, 28, 44, 16, 25), end = c(67, 97, 85, 67, 
52, 72, 55, 62, 99, 89, 65, 58, 77, 88), sampled = c(NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), class = c("spec_tbl_df", 
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -14L), spec = structure(list(
    cols = list(ID = structure(list(), class = c("collector_character", 
    "collector")), start = structure(list(), class = c("collector_double", 
    "collector")), end = structure(list(), class = c("collector_double", 
    "collector")), sampled = structure(list(), class = c("collector_logical", 
    "collector"))), default = structure(list(), class = c("collector_guess", 
    "collector")), skip = 1), class = "col_spec"))

Recommended answer

Figured it out:

df[j, 4] <- mapply(function(x, y) sample(seq(x, y), 1), df[j, "start"], df[j, "end"])

I just needed to be specific as to which row of the sampled values I wanted to enter into df[j,4]. Specifying row j for columns start and end did the trick.
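
For readers who want to try the row-by-row idea end to end, here is a minimal self-contained sketch. It assumes a plain data.frame with the first four rows of the example data and a simple for loop standing in for the real outer loop. The helper sample_between is hypothetical (not part of the original answer) and refers to the column by name rather than by position 4.

```r
set.seed(1)  # only so the sketch is reproducible

# First four rows of the example data, as a plain data.frame (an assumption)
df <- data.frame(
  ID = c("a", "b", "c", "d"),
  start = c(25, 36, 23, 15),
  end = c(67, 97, 85, 67),
  sampled = NA_real_
)

# Hypothetical helper: draw one value from the sequence lo, lo + 1, ..., hi.
# Indexing with sample.int() sidesteps sample()'s special case in which a
# single number n is treated as 1:n, which would matter if lo == hi.
sample_between <- function(lo, hi) {
  candidates <- seq(lo, hi)
  candidates[sample.int(length(candidates), 1)]
}

for (j in seq_len(nrow(df))) {
  # Use row j's own start and end, as in the answer above
  df[j, "sampled"] <- sample_between(df[j, "start"], df[j, "end"])
}

print(df)
```

Because only scalars are passed for row j, mapply is not strictly needed here; the one-liner in the answer works as well, with the caveat about sample() noted in the comments if start and end can ever be equal.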
