Retrieve index of newly added row - for loop in R


Problem description

I am trying to retrieve the index of a newly-added row, added via a for loop.

Starting from the beginning, I have a list of matrices of p-values, each with a variable number of rows and columns. This is because not all groups have an adequate number of treated individuals to run t-tests. The following is what prints to the console when I access this sample list:

$Group1
                              Normal  Treatment 1  Treatment 2  
Treatment 1                        1           NA           NA
Treatment 2                        1            1           NA
Treatment 3                        1            1            1

$Group2
                              Normal  Treatment 2   
Treatment 2                        1           NA      
Treatment 4                        1            1     

I would like every group to have the same number of rows and columns, in the correct order, with the missing values just filled in with NAs. This is a sample of what I would like:

$Group1
                              Normal  Treatment 1  Treatment 2  Treatment 3 
Treatment 1                        1           NA           NA           NA
Treatment 2                        1            1           NA           NA
Treatment 3                        1            1            1           NA
Treatment 4                       NA           NA           NA           NA

$Group2
                              Normal  Treatment 1  Treatment 2  Treatment 3  
Treatment 1                       NA           NA           NA           NA
Treatment 2                        1           NA           NA           NA
Treatment 3                       NA           NA           NA           NA
Treatment 4                        1            1           NA           NA

Here is the code I have so far:

fix.results.row <- function(x, factors) {
  results.matrix <- x
  num <- 1
  for (i in factors){
    if (!i %in% rownames(results.matrix)) {
      results.matrix <- rbind(results.matrix, NA)
      rownames(results.matrix)[num] <- i
     } 
    num <- num + 1
  }
  rownames(results.matrix) <- results.matrix[rownames(factors),,drop=FALSE]
  return(results.matrix)
}

In the function above, x would be my list of matrices, and factors would be a list of all the factors in the order I want them. I have a similar function for adding columns.

My problem, as I see it, is in Group 2. If it sees that I'm missing Treatment 1, it will replace the rowname Treatment 2 with the rowname Treatment 1, so the data for Treatment 2 is now mislabeled Treatment 1. Then it reorders the variables the way I want them, but the data are already mislabeled!

If I could access the index of the newly-added row, which changes from group to group, then I could just change that specific row name. Any suggestions? Please let me know if there's any more information I need to provide. I tried to cover everything but I'm not sure if there's anything else you all need.
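On the literal question: `rbind()` appends at the bottom, so right after the call the new row's index is `nrow(results.matrix)`. Below is a minimal sketch of the row-filling loop rewritten on that basis. The names `fix_rows_sketch`, `new.index`, and the sample matrix `m` are hypothetical and used only for illustration, and `factors` is assumed to be a character vector of row names in the desired order.

```r
# Sketch: label each appended row by its actual index, then reorder by name.
fix_rows_sketch <- function(x, factors) {
  results.matrix <- x
  for (i in factors) {
    if (!i %in% rownames(results.matrix)) {
      results.matrix <- rbind(results.matrix, NA)
      # rbind() appends at the bottom, so the new row is the last one
      new.index <- nrow(results.matrix)
      rownames(results.matrix)[new.index] <- i
    }
  }
  # Reorder rows by name; drop = FALSE keeps a matrix even for a single row
  results.matrix[factors, , drop = FALSE]
}

# Hypothetical sample input shaped like Group2
m <- matrix(c(1, 1, NA, 1), 2, 2,
            dimnames = list(c("Treatment 2", "Treatment 4"),
                            c("Normal", "Treatment 2")))
fixed <- fix_rows_sketch(m, paste("Treatment", 1:4))
```

Because only the appended row is renamed, the existing rows keep their original labels and the final reordering by name cannot mislabel them.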

Solution

This isn't very elegant, but it might work better than using two functions to fill in the rows and columns separately.

Here, x is a list of all your matrices; factors is an optional list of the desired row and column names.

fix_rc <- function(x, factors) {
  f <- function(x) factor(ul <- unique(unlist(x)), levels = sort(ul))
  if (missing(factors))
    factors <- list(f(sapply(x, rownames)),
                    f(sapply(x, colnames)))

  template <- matrix(NA, length(factors[[1]]), length(factors[[2]]),
                     dimnames = factors)

  lapply(x, function(xx) {
    ## original
    # xx <- rbind(xx, template[, colnames(xx)])
    # xx <- cbind(xx, template[rownames(xx), ])
    # xx[rownames(template), colnames(template)]
    ## better  http://stackoverflow.com/questions/31050787/r-how-to-match-join-2-matrices-of-different-dimensions-nrow-ncol/31051218#31051218
    xx <- as.data.frame.table(xx)
    template[as.matrix(xx[, 1:2])] <- xx$Freq
    template
  })
}
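For comparison, the same template idea can be sketched with plain name-based indexing instead of `as.data.frame.table()`. This is an illustrative variant, not the code from the answer: `fill_template`, `row_names`, `col_names`, and the sample list `groups` are hypothetical names, and the sketch assumes every matrix has unique row and column names that all appear in the supplied vectors.

```r
# Sketch: copy each matrix into an all-NA template by row/column name.
fill_template <- function(x, row_names, col_names) {
  template <- matrix(NA_real_, length(row_names), length(col_names),
                     dimnames = list(row_names, col_names))
  lapply(x, function(xx) {
    out <- template
    # Name-based indexing places each value in its matching cell
    out[rownames(xx), colnames(xx)] <- xx
    out
  })
}

# Hypothetical sample input
groups <- list(
  A = matrix(1:4, 2, 2, dimnames = list(c("r1", "r2"), c("c1", "c2"))),
  B = matrix(5, 1, 1, dimnames = list("r3", "c2"))
)
filled <- fill_template(groups, c("r1", "r2", "r3"), c("c1", "c2"))
```

Unlike `fix_rc`, this variant does not derive the names when they are missing; it only shows the fill step.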

Here is the data I am using:

l <- list(Group1 = matrix(c(1,1,1,NA,1,1,NA,NA,1), 3, 3,
                          dimnames = list(paste('Treatment', 1:3),
                                          c('Normal', paste('Treatment', 1:2)))),
          Group2 = matrix(c(1,1,NA,1), 2, 2,
                          dimnames = list(paste('Treatment', c(2,4)),
                                          c('Normal','Treatment 2'))))

# $Group1
#             Normal Treatment 1 Treatment 2
# Treatment 1      1          NA          NA
# Treatment 2      1           1          NA
# Treatment 3      1           1           1
# 
# $Group2
#             Normal Treatment 2
# Treatment 2      1          NA
# Treatment 4      1           1

You can use it like this. Note that when you don't supply factors, the function takes all the row and column names from your list of matrices.

fix_rc(l)

# $Group1
#             Normal Treatment 1 Treatment 2
# Treatment 1      1          NA          NA
# Treatment 2      1           1          NA
# Treatment 3      1           1           1
# Treatment 4     NA          NA          NA
# 
# $Group2
#             Normal Treatment 1 Treatment 2
# Treatment 1     NA          NA          NA
# Treatment 2      1          NA          NA
# Treatment 3     NA          NA          NA
# Treatment 4      1          NA           1

I'm not sure where the Treatment 3 column in your desired output came from, but if you want it, you can get it by supplying factors explicitly:

fix_rc(l, factors = list(paste('Treatment', 1:6),
                         c('Normal', paste('Treatment', 1:3))))

# $Group1
#             Normal Treatment 1 Treatment 2 Treatment 3
# Treatment 1      1          NA          NA          NA
# Treatment 2      1           1          NA          NA
# Treatment 3      1           1           1          NA
# Treatment 4     NA          NA          NA          NA
# Treatment 5     NA          NA          NA          NA
# Treatment 6     NA          NA          NA          NA
# 
# $Group2
#             Normal Treatment 1 Treatment 2 Treatment 3
# Treatment 1     NA          NA          NA          NA
# Treatment 2      1          NA          NA          NA
# Treatment 3     NA          NA          NA          NA
# Treatment 4      1          NA           1          NA
# Treatment 5     NA          NA          NA          NA
# Treatment 6     NA          NA          NA          NA
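If it helps, one way to confirm that every matrix in the result now has the same shape and labels is to compare each element's dimnames with the first. A small sketch follows; `padded` is hypothetical stand-in data, not the output printed above, and `same_names` is an illustrative name.

```r
# Hypothetical stand-in for a padded result list
dn <- list(c("Treatment 1", "Treatment 2"), c("Normal", "Treatment 1"))
padded <- list(
  Group1 = matrix(c(1, 1, NA, 1), 2, 2, dimnames = dn),
  Group2 = matrix(c(NA, 1, NA, NA), 2, 2, dimnames = dn)
)

# TRUE only if every matrix shares the first matrix's row and column names
same_names <- all(vapply(
  padded,
  function(m) identical(dimnames(m), dimnames(padded[[1]])),
  logical(1)
))
```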
