R: make 2 subset vectors so that values are different index-wise, and also different across each vector


Problem description

Following up on this question, I want to do something similar, but this time I have one more requirement.

I want to make 2 vectors subsetting from the same data.

I need replace set to FALSE because all values within a must be different, and all values within b must be different.

Apart from that, a and b cannot hold the same value at the same index position.

Note that the sampling vector v is always fixed, as is the sample length l.

Doing the following, I fulfil only one criterion (values within a are all different, as are values within b, but a and b can still hold the same value at the same index):

> set.seed(1)
> v <- 1:15
> l <- 10
> a <- sample(v, l, replace = FALSE)
> b <- sample(v, l, replace = FALSE)
> a
 [1]  4  6  8 11  3  9 10 14  5  1
> b
 [1]  4  3  9  5 13 12  7  8 14 10
> a==b
 [1]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

Doing the following (answer from the previous question), I fulfil only the other criterion (values at the same index are never identical, but there can be repeated values within a or within b):

> ab <- split(replicate(10, sample(15,2)), seq(2))
> a <- ab[[1]]
> b <- ab[[2]]
> a
 [1] 14  7 10  8  2  6  6  8 13 12
> b
 [1]  5  5  4 11 13 12  5 13  6 14
> duplicated(a)
 [1] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE
> duplicated(b)
 [1] FALSE  TRUE FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE
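
The two criteria can be checked together with a small helper. This is only a sketch: the name valid_pair is hypothetical and not part of the original post. It returns TRUE only when a and b each contain no repeated value and never agree at the same index.

```r
# Hypothetical helper (not from the original post): TRUE only when
# both criteria hold for two vectors of equal length.
valid_pair <- function(a, b) {
  stopifnot(length(a) == length(b))
  no_dups_a <- anyDuplicated(a) == 0   # anyDuplicated() returns 0 when nothing repeats
  no_dups_b <- anyDuplicated(b) == 0
  no_clash  <- all(a != b)             # no identical value at the same index
  no_dups_a && no_dups_b && no_clash
}

# The two attempts shown above, copied from their printed output
a1 <- c(4, 6, 8, 11, 3, 9, 10, 14, 5, 1)
b1 <- c(4, 3, 9, 5, 13, 12, 7, 8, 14, 10)
a2 <- c(14, 7, 10, 8, 2, 6, 6, 8, 13, 12)
b2 <- c(5, 5, 4, 11, 13, 12, 5, 13, 6, 14)

valid_pair(a1, b1)   # fails: a and b agree at index 1
valid_pair(a2, b2)   # fails: repeated values inside a and inside b
```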

Any help combining both approaches? Thanks!

Solution

Edited to use a while loop that resamples until the result is correct (no duplicates).

Got it... What I did was check which values in b are equal to a at the same index, and which are not.

For those that are equal, I resample from a vector equal to v but without the already "accepted" values in b (the ones not equal to a at the same index) and without the clashing value itself.

Like this:

> set.seed(6)
> v <- 1:15
> l <- 10
> a <- sample(v, l, replace = FALSE)
> b <- sample(v, l, replace = FALSE)
> a
 [1] 10 14  4  5  9 15 11  7 13  1
> b
 [1] 10 13  2  4  9  3  5  6 14 11
> a==b
 [1]  TRUE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE


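> if (any(a == b)) {
+   # values of b that clash with a at the same index
+   b0 <- b[which(a == b)]
+   # values of b already accepted
+   b2 <- b[which(a != b)]
+   # candidates: values of v not already used by the accepted part of b
+   vnew <- v[which(!(v %in% b2))]
+   # redraw each clashing value, excluding the value it clashed on
+   b0 <- sapply(b0, function(x) sample(vnew[vnew != x], 1))
+   # redraw again while the replacements repeat each other
+   while (any(duplicated(b0))) {
+     b0 <- sapply(b0, function(x) sample(vnew[vnew != x], 1))
+   }
+   b[which(a == b)] <- b0
+ }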


> a
 [1] 10 14  4  5  9 15 11  7 13  1
> b
 [1] 15 13  2  4 12  3  5  6 14 11
> a==b
 [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> duplicated(b)
 [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
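
For repeated use, the same goal can be wrapped in a function. The sketch below is not the repair step shown above; it swaps in plain rejection sampling, redrawing the whole of b until it differs from a at every index. The name sample_pair and the max_tries cap are assumptions made for illustration, and v is assumed to hold distinct values.

```r
# Hypothetical helper (not from the original answer): draw a and b from v,
# each without replacement, redrawing b until it differs from a everywhere.
sample_pair <- function(v, l, max_tries = 1000) {
  n <- length(v)
  # Assumptions: v holds distinct values, and at least two of them,
  # otherwise no valid pair exists.
  stopifnot(anyDuplicated(v) == 0, n >= 2, l >= 1, l <= n)
  # Index through sample.int() so a length-one numeric vector is never
  # interpreted as 1:v, which is what sample() would do.
  a <- v[sample.int(n, l)]
  for (i in seq_len(max_tries)) {
    b <- v[sample.int(n, l)]
    if (all(a != b)) {
      return(list(a = a, b = b))
    }
  }
  stop("no valid b found within max_tries draws")
}

set.seed(6)
res <- sample_pair(1:15, 10)
res2 <- sample_pair(c(3, 7), 2)   # smallest feasible case: b must be a reversed
```

Redrawing the whole vector keeps the code short and avoids the case analysis of the repair step, at the cost of a few wasted draws. The cap on attempts is only a safeguard against an unexpectedly long loop.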
