Randomly selecting values from an existing matrix after adding a vector (in R)

Question

Thank you so much for your help in advance!

I am trying to modify an existing matrix such that, when a new line is added to the matrix, it removes values from the preexisting matrix.

For example, I have the matrix:

[,1] [,2] [,3] [,4]
   1    1    0    0
   0    1    0    0
   1    0    1    0
   0    0    1    1

I want to add another vector, I.vec, which has two values (I.vec=c(0,1,1,0)). This is easy enough to do. I just rbind it to the matrix. Now, for every column where I.vec is equal to 1, I want to randomly select a value from the other rows and make it zero. Ideally, this would end up with a matrix like:

[,1] [,2] [,3] [,4]
   1    0    0    0
   0    1    0    0
   1    0    0    0
   0    0    1    1
   0    1    1    0

But each time I run the iteration, I want it to randomly sample again.

This is what I have tried:

mat1<-matrix(c(1,1,0,0,0,1,0,0,1,0,1,0,0,0,1,1),byrow=T, nrow=4)
I.vec<-c(0,1,1,0)
mat.I<-rbind(mat1,I.vec)
mat.I.r<-mat.I
d1<-mat.I[,which(mat.I[5,]==1)]
mat.I.r[sample(which(d1[1:4]==1),1),which(mat.I[5,]==1)]<-0

But this only deletes one of the two values I would like to delete. I have also tried variations on subsetting the matrix, but I have not been successful.

Thank you again!

Recommended answer

There is a little bit of ambiguity in the description from the OP, so two solutions are suggested:

I'll just alter the original function (see below). The change is to the line defining rows. I now have (there was a bug in the original; the version below is revised to deal with it):

rows <- sapply(seq_along(cols),
               function(x, mat, cols) {
                   ones <- which(mat[,cols[x]] == 1L)
                   out <- if(length(ones) == 1L) {
                              ones
                          } else {
                              sample(ones, 1)
                          }
                   out
               }, mat = mat, cols = cols)

Basically, for each column in which we need to swap a 1 for a 0, we work out which rows of that column contain 1s and sample one of them.

Edit: We have to handle the case where there is only a single 1 in a column. If we just sample from a length 1 vector, R's sample() will treat it as if we wanted to sample from the set seq_len(n) not from the length 1 set n. We handle this now with an if, else statement.
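The length-one behaviour of sample() described above can be seen directly. The following is only a small sketch; the helper name resample is borrowed from the Examples section of ?sample and is not part of the answer's code.

```r
set.seed(1)

## With a single number x >= 1, sample() draws from 1:x, not from x itself
x <- 4L
draws <- replicate(200, sample(x, 1))
range(draws)    # values fall between 1 and 4, not always 4

## Index-based helper that always samples from the elements of x
## (same idea as the `resample` example in ?sample)
resample <- function(x, ...) x[sample.int(length(x), ...)]

safe <- replicate(200, resample(x, 1))
unique(safe)    # 4
```

Using such a helper would be an alternative to the if/else branch; both avoid handing a bare length-one vector to sample().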

We have to do this individually for each column so we get the correct rows. I suppose we could do some nice manipulation to avoid repeated calls to which() and sample(), but how escapes me at the moment, because we do have to handle the case where there is only one 1 in the column. Here's the finished function (updated to handle the length 1 sample bug in the original):

foo <- function(mat, vec) {
    nr <- nrow(mat)
    nc <- ncol(mat)

    cols <- which(vec == 1L)
    rows <- sapply(seq_along(cols), 
                   function(x, mat, cols) {
                       ones <- which(mat[,cols[x]] == 1L)
                       out <- if(length(ones) == 1L) {
                                  ones
                              } else {
                                  sample(ones, 1)
                              }
                       out
                   }, mat = mat, cols = cols)

    ind <- (nr*(cols-1)) + rows
    mat[ind] <- 0

    mat <- rbind(mat, vec)
    rownames(mat) <- NULL

    mat
}

and here it is in action:

> set.seed(2)
> foo(mat1, ivec)
     [,1] [,2] [,3] [,4]
[1,]    1    0    0    0
[2,]    0    1    0    0
[3,]    1    0    1    0
[4,]    0    0    0    1
[5,]    0    1    1    0

and it works when there is only one 1 in a column we want to do a swap in:

> foo(mat1, c(0,0,1,1))
     [,1] [,2] [,3] [,4]
[1,]    1    1    0    0
[2,]    0    1    0    0
[3,]    1    0    1    0
[4,]    0    0    0    1
[5,]    0    0    1    1
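On the question of avoiding repeated calls to which() and sample(): one possible approach, offered only as a sketch and not as the answer's method, is to give every 1 in the selected columns a random key and take the position of the largest key per column with max.col(). The names pick_rows and foo2 are hypothetical, and the sketch assumes the matrix contains only 0s and 1s.

```r
## Hypothetical helper: pick one random row holding a 1 in each of `cols`
pick_rows <- function(mat, cols) {
    sub  <- mat[, cols, drop = FALSE] == 1
    ## random key in (0, 1) where the entry is 1, and 0 elsewhere
    keys <- matrix(runif(length(sub)), nrow = nrow(sub)) * sub
    ## row position of the largest key in each selected column
    max.col(t(keys), ties.method = "first")
}

## Hypothetical variant of foo() built on pick_rows()
foo2 <- function(mat, vec) {
    cols <- which(vec == 1L)
    rows <- pick_rows(mat, cols)
    mat[cbind(rows, cols)] <- 0     # (row, col) matrix indexing
    mat <- rbind(mat, vec)
    rownames(mat) <- NULL
    mat
}

mat1 <- matrix(c(1,1,0,0,0,1,0,0,1,0,1,0,0,0,1,1), byrow = TRUE, nrow = 4)
res  <- foo2(mat1, c(0,1,1,0))
res
```

Because the keys are independent uniform draws, each 1 in a column is equally likely to hold the largest key, so a column with a single 1 needs no special case. A selected column with no 1s would yield row 1, which is already 0 in a 0/1 matrix, so nothing changes there.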

Original Answer: Assuming any value in a relevant column can be set to zero

Here is a vectorised answer, where we treat the matrix as a vector when doing the replacement. Using the example data:

mat1 <- matrix(c(1,1,0,0,0,1,0,0,1,0,1,0,0,0,1,1), byrow = TRUE, nrow = 4)
ivec <- c(0,1,1,0)

## Set a seed to make reproducible
set.seed(2)

## number of rows and columns of our matrix
nr <- nrow(mat1)
nc <- ncol(mat1)

## which of ivec are 1L
cols <- which(ivec == 1L)

## sample length(cols) row indices, with replacement
## so same row can be drawn more than once
rows <- sample(seq_len(nr), length(cols), replace = TRUE)

## Compute the index of each rows cols combination
## if we treated mat1 as a vector
ind <- (nr*(cols-1)) + rows
## ind should be of length length(cols)

## copy for illustration
mat2 <- mat1

## replace the indices we want with 0, note sub-setting as a vector
mat2[ind] <- 0

## bind on ivec
mat2 <- rbind(mat2, ivec)

This gives us:

> mat2
     [,1] [,2] [,3] [,4]
        1    0    0    0
        0    1    0    0
        1    0    0    0
        0    0    1    1
ivec    0    1    1    0
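The index arithmetic above relies on R storing matrices column by column, so element (row, col) sits at position nr*(col-1) + row of the underlying vector. As a small check, the sketch below compares that linear index with R's two-column matrix indexing; the row choices are fixed by hand for illustration rather than sampled.

```r
mat1 <- matrix(c(1,1,0,0,0,1,0,0,1,0,1,0,0,0,1,1), byrow = TRUE, nrow = 4)
nr   <- nrow(mat1)
rows <- c(1, 3)   # illustrative row choices, not sampled
cols <- c(2, 3)

## Linear (column-major) index, as computed in the answer
ind <- (nr * (cols - 1)) + rows
ind                      # 5 11

## The same cells addressed with a two-column (row, col) index matrix
a <- mat1; a[ind] <- 0
b <- mat1; b[cbind(rows, cols)] <- 0

identical(a, b)          # TRUE
```

Either form works; the cbind() form avoids the arithmetic, while the linear form makes the vector treatment explicit.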

If I were doing this more than once or twice, I'd wrap this in a function:

foo <- function(mat, vec) {
    nr <- nrow(mat)
    nc <- ncol(mat)

    cols <- which(vec == 1L)
    rows <- sample(seq_len(nr), length(cols), replace = TRUE)

    ind <- (nr*(cols-1)) + rows
    mat[ind] <- 0

    mat <- rbind(mat, vec)
    rownames(mat) <- NULL

    mat
}

which gives:

> foo(mat1, ivec)
     [,1] [,2] [,3] [,4]
[1,]    1    1    0    0
[2,]    0    1    0    0
[3,]    1    0    1    0
[4,]    0    0    0    1
[5,]    0    1    1    0

If you wanted to do this for multiple ivecs, growing mat1 each time, then you probably don't want to do that in a loop as growing objects is slow (it involves copies etc). But you could just modify the definition of ind to include the extra n rows you bind on for the n ivecs.
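One way this might look, as a rough sketch only: bind all the ivecs on in a single rbind() and compute every replacement index at once. The function name foo_many and the layout of vecs (a matrix with one ivec per row) are assumptions made for illustration. The sketch also assumes that ivec i may zero any cell in the rows that exist before it is added, including earlier ivecs, which is one reading of the "any value can be set to zero" interpretation.

```r
## Hypothetical helper: `vecs` is a matrix with one ivec per row
foo_many <- function(mat, vecs) {
    nr   <- nrow(mat)
    out  <- rbind(mat, vecs)                    # bind everything once
    hits <- which(vecs == 1, arr.ind = TRUE)    # one (ivec, column) pair per 1
    ## ivec i may only touch rows 1 .. nr + i - 1
    upper <- nr + hits[, "row"] - 1
    ## vectorised draw of one row per pair, uniform on 1 .. upper
    rows  <- floor(runif(nrow(hits)) * upper) + 1
    out[cbind(rows, hits[, "col"])] <- 0
    rownames(out) <- NULL
    out
}

set.seed(42)
mat1 <- matrix(c(1,1,0,0,0,1,0,0,1,0,1,0,0,0,1,1), byrow = TRUE, nrow = 4)
vecs <- rbind(c(0,1,1,0), c(1,0,0,1))
res  <- foo_many(mat1, vecs)
res
```

Since setting a cell to zero is idempotent and each ivec only reaches rows above its own, applying all replacements in one step should give the same kind of result as adding the ivecs one at a time, without growing the matrix repeatedly.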
