Sample using start and end values within a loop in R
Question
I am trying to sample from a range of values as part of a larger loop in R. As the loop progresses to each row j, I want to sample a number between the value given in the start column and the value given in the end column, and place that value in the sampled column for that row.
The result should look like this:
ID start end sampled
a 25 67 44
b 36 97 67
c 23 85 77
d 15 67 52
e 21 52 41
f 43 72 66
g 39 55 49
h 27 62 35
i 11 99 17
j 21 89 66
k 28 65 48
l 44 58 48
m 16 77 22
n 25 88 65
I started with mapply, which samples across the whole df, but that leaves me trying to fit all 15 sampled values into a single row.
df[j, 4] <- mapply(function(x, y) sample(seq(x, y), 1), df$start, df$end)
I thought something using seq directly might work, but this results in an error saying that the from argument must be of length 1.
df[j, 4] <- sample(seq(df$start, df$end), 1, replace = TRUE)
The outer looping structure is pretty complicated, so I haven't included it here, but the df[j, 4] part of the code is necessary because it sits inside that larger loop. There are situations where rows have to be resampled based on additional dependencies in the actual dataset. For example, the sampled value of a might need to be larger than that of b. The rest of the code updates the sampled column, checks the dependencies, and reruns the sample if they aren't met. If I can get this sampling section to work, I should be able to plug it in without too much trouble (I hope).
Here is a sample dataset.
structure(list(ID = c("a", "b", "c", "d", "e", "f", "g", "h",
"i", "j", "k", "l", "m", "n"), start = c(25, 36, 23, 15, 21,
43, 39, 27, 11, 21, 28, 44, 16, 25), end = c(67, 97, 85, 67,
52, 72, 55, 62, 99, 89, 65, 58, 77, 88), sampled = c(NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), class = c("spec_tbl_df",
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -14L), spec = structure(list(
cols = list(ID = structure(list(), class = c("collector_character",
"collector")), start = structure(list(), class = c("collector_double",
"collector")), end = structure(list(), class = c("collector_double",
"collector")), sampled = structure(list(), class = c("collector_logical",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), skip = 1), class = "col_spec"))
Recommended answer
Figured it out.
df[j, 4] <- mapply(function(x, y) sample(seq(x, y), 1), df[j, "start"], df[j, "end"])
I just needed to be specific about which row's sampled value I wanted to enter into df[j, 4]. Passing the whole start and end columns made mapply return one value per row, which cannot fit in a single cell. Specifying row j for the start and end columns did the trick.
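For readers who want to try the per-row approach in isolation, here is a minimal sketch. It makes a few assumptions that are not in the original post: the data is rebuilt as a plain data.frame rather than the tibble shown in the question, the outer loop is reduced to a simple for loop over rows, and sample_between is a hypothetical helper name. The helper indexes with sample.int() because sample(x, 1) treats a single number x >= 1 as the range 1:x, so sample(seq(x, y), 1) would misbehave on a row where start equals end.

```r
# Sample data: the start/end columns from the question, as a plain data.frame
# (assumption: the original post uses a tibble read with readr).
df <- data.frame(
  ID = letters[1:14],
  start = c(25, 36, 23, 15, 21, 43, 39, 27, 11, 21, 28, 44, 16, 25),
  end = c(67, 97, 85, 67, 52, 72, 55, 62, 99, 89, 65, 58, 77, 88),
  sampled = NA_real_,
  stringsAsFactors = FALSE
)

# Hypothetical helper (not from the original post): draw one value from
# seq(lo, hi). Picking an index with sample.int() avoids sample()'s special
# case where a length-one numeric x is treated as 1:x.
sample_between <- function(lo, hi) {
  candidates <- seq(lo, hi)
  candidates[sample.int(length(candidates), 1)]
}

set.seed(1)  # only to make the sketch reproducible

# Simplified stand-in for the larger loop: one scalar per row j.
for (j in seq_len(nrow(df))) {
  df[j, "sampled"] <- sample_between(df$start[j], df$end[j])
}

print(df)
```

Using df$start[j] and df$end[j] passes plain scalars, so no mapply call is needed inside the loop. If no per-row logic were required, the whole column could instead be filled in one step with mapply(sample_between, df$start, df$end).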