How to make R foreach loops efficient


Question

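I am trying to compute a 300,000 x 300,000 matrix in R. My code works, but it has been running for days. How can I make it more efficient?

Below is a subset of the data I am working with together with the code; in the full data set the ID column extends to 300,000. Ideally the code would finish in minutes rather than days.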

fam <- structure(list(ID = c(1L, 2L, 3L, 4L, 6L, 5L, 7L), dad = c(0L, 
0L, 1L, 1L, 1L, 3L, 5L), mum = c(0L, 0L, 0L, 2L, 4L, 4L, 6L), 
    GEN = c(1L, 1L, 2L, 2L, 3L, 3L, 4L)), class = "data.frame", row.names = c(NA, 
-7L))


hom <- function(fam) {
  library(Matrix)
  library(foreach)
  n <- max(as.numeric(fam[, "ID"]))
  t <- min(as.numeric(fam[, "ID"]))
  A <- Matrix(0, nrow = n, ncol = n, sparse = TRUE)

  while (t <= n) {
    s <- max(fam[t, "dad"], fam[t, "mum"])
    d <- min(fam[t, "dad"], fam[t, "mum"])

    # Both parents known
    if (s > 0 & d > 0) {
      if (fam[t, "GEN"] == 999 & s != d) {
        warning("both dad and mum should be the same, different for at least one individual")
      }

      A[t, t] <- 2 - 0.5^(fam[t, "GEN"] - 1) + 0.5^(fam[t, "GEN"]) * A[fam[t, "dad"], fam[t, "mum"]]
      foreach(j = 1:(t - 1), .verbose = TRUE, .combine = 'c', .packages = c("Matrix", "foreach")) %do% {
        A[t, j] <- 0.5 * (A[j, fam[t, "dad"]] + A[j, fam[t, "mum"]])
        A[j, t] <- A[t, j]
      }
    }

    # Only one parent known
    if (s > 0 & d == 0) {
      if (fam[t, "GEN"] == 999) {
        warning("both dad and mum should be the same, one parent equal to zero for at least individual")
      }
      A[t, t] <- 2 - 0.5^(fam[t, "GEN"] - 1)
      foreach(j = 1:(t - 1), .verbose = TRUE, .combine = 'c', .packages = c("Matrix", "foreach")) %do% {
        A[t, j] <- 0.5 * A[j, s]
        A[j, t] <- A[t, j]
      }
    }

    # No parent known
    if (s == 0) {
      A[t, t] <- 2 - 0.5^(fam[t, "GEN"] - 1)
    }

    cat(" MatbyGEN: ", t, "\n")
    t <- t + 1
  }

  A
}


Output of the above example
%%MatrixMarket matrix coordinate real symmetric
7 7 26
1 1 1
3 1 .5
4 1 .5
5 1 .75
6 1 .5
7 1 .625
2 2 1
4 2 .5
5 2 .25
6 2 .25
7 2 .25
3 3 1.5
4 3 .25
5 3 .375
6 3 .875
7 3 .625
4 4 1.5
5 4 1
6 4 .875
7 4 .9375
5 5 1.8125
6 5 .6875
7 5 1.25
6 6 1.78125
7 6 1.234375
7 7 1.91796875

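The issue is getting this to run faster for a 300k x 300k matrix. At the current speed it would take days or weeks, and it has already been running for a while. What can I do to make it run faster?

N.B.: save the example as "anything.txt", then read the file in as "fam <- read.delim(, header = TRUE, sep="")"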

Solution

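The problem you have is that this is recursive. Each pass through the outer loop depends on the results of the previous passes, so the computation as a whole cannot simply be vectorized.

If you want to use R for this, your best bet is to look into Rcpp. I'm not that good with Rcpp, but I do have some suggestions.

The easiest thing to do is to get rid of the foreach loop and replace it with a regular for loop. foreach adds noticeable overhead on every call, and with %do% the iterations run sequentially anyway, so nothing is gained in return. Even with a parallel backend, each row depends on the rows before it, so it is hard for the workers to do much on their own.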

# Before

foreach(j = 1:(t-1),  .combine='c', .packages=c("Matrix", "foreach")) %do%
{ ... }

# After
for (j in 1:(t-1)) {
...
}

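The next thing to do is to consider whether you really need a sparse matrix. If you are not having memory problems, you might as well use a regular matrix. In the benchmark under Performance, this change alone takes the median from about 50 ms to about 1.5 ms on the 7-row example.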

A<-Matrix(0,nrow=n,ncol=n, sparse=TRUE)
# to
A<-matrix(0,nrow=n,ncol=n)
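
Whether memory is a problem can be estimated before running anything. As a rough sketch, assuming 8 bytes per double-precision entry and ignoring R's small per-object overhead, a dense matrix at the full size in the question would need about 720 GB, so the dense version is only realistic for subsets or for machines with that much RAM. The helper name `dense_matrix_gb` is made up for this illustration.

```r
# Hypothetical helper: approximate size of a dense n x n numeric matrix
# in decimal gigabytes (8 bytes per double, object overhead ignored).
dense_matrix_gb <- function(n) {
  n^2 * 8 / 1e9
}

small_gb <- dense_matrix_gb(7)       # the 7-row example: negligible
full_gb  <- dense_matrix_gb(300000)  # the full problem
full_gb                              # 720
```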

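The last thing to do is to rethink how everything is initialized. Parts of the code are repeated several times, such as the assignment to the diagonal. Since we are summing separate terms, we can initialize the diagonal with the part common to all three branches, 2 - 0.5^(fam[t, 'GEN'] - 1):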

A<-matrix(0,nrow=n,ncol=n)
diag(A) <- 2-0.5^(fam[["GEN"]]-1)

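This is important because it lets us skip ahead. Your original data had something like 1,000 rows with 0 for both 'mum' and 'dad'. With this initialization, we can skip straight to the first row with a non-zero 'mum' or 'dad':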

  t_start <- min(which.max(fam$dad > 0), which.max(fam$mum > 0))
  t_end <- max(fam[['ID']])

  for (t in t_start:t_end) {
...
}
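
The skip-ahead relies on `which.max()` returning the position of the first `TRUE` in a logical vector. A small check on the example data, with the `dad` and `mum` columns copied from `fam` above. One caveat worth knowing: when there is no `TRUE` at all, `which.max()` still returns 1 rather than signalling anything, so an all-zero parent column would quietly start the loop at row 1.

```r
dad <- c(0L, 0L, 1L, 1L, 1L, 3L, 5L)
mum <- c(0L, 0L, 0L, 2L, 4L, 4L, 6L)

first_dad <- which.max(dad > 0)  # 3: first row with a known dad
first_mum <- which.max(mum > 0)  # 4: first row with a known mum
t_start <- min(first_dad, first_mum)
t_start                          # 3

# Caveat: with no TRUE anywhere, the result is still 1
no_parent <- which.max(c(FALSE, FALSE))
no_parent                        # 1
```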

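To avoid the if statements, I used sum(c(..., ...)) to add everything up. That way, when a parent is 0 and the subset comes back empty (a zero-length vector), the sum still works. Altogether: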

  t_start <- min(which.max(fam$dad > 0), which.max(fam$mum > 0))
  t_end <- max(fam[['ID']])

  A<-matrix(0,nrow=t_end,ncol=t_end)
  diag(A) <- 2-0.5^(fam[["GEN"]]-1)

  for (t in t_start:t_end) {
    A[t,t]<- sum(c(A[t,t], 0.5^(fam[t,"GEN"])*A[fam[t,"dad"],fam[t,"mum"]]))

    for(j in 1:(t-1))  {
      A[t,j]<- 0.5 * sum(c(A[j,fam[t,"dad"]],A[j,fam[t,"mum"]]))
      A[j,t]<- A[t,j]
    }
  }
  A
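
The reason `sum(c(...))` can stand in for the `if` branches is that indexing a matrix with 0 selects nothing: the result is a zero-length vector, which `c()` drops and which therefore adds nothing to the sum. A minimal illustration with a made-up 2 x 2 matrix `m`:

```r
# Made-up matrix for illustration only
m <- matrix(c(1, 0.5, 0.5, 1.5), nrow = 2)

missing_parent <- m[1, 0]   # index 0 selects nothing
length(missing_parent)      # 0

# With one parent coded as 0, only the known parent's value is summed
half_known <- 0.5 * sum(c(m[1, 2], m[1, 0]))
half_known                  # 0.25
```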

Performance

Unit: microseconds
                expr       min         lq      mean    median        uq     max neval
            original 85759.901 86650.7515 88776.695 88740.050 90529.750 97433.2   100
         non_foreach 47912.601 48528.5010 50699.867 50220.901 51782.651 88355.1   100
 non_sparse_non_each  1423.701  1454.3015  1531.833  1471.451  1496.401  4126.3   100
        final_change   953.102   981.8015  1212.264  1010.500  1026.052 21350.1   100
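
Although the outer loop over `t` is recursive, the inner loop over `j` only reads columns that are already filled in, so it could probably be replaced by one vectorized assignment per row. The following is a sketch of that idea and not part of the original answer. It assumes that every parent appears in an earlier row than its offspring and that rows are indexed the same way as in the code above; the function name `relationship_vectorised` is hypothetical.

```r
fam <- data.frame(
  ID  = c(1L, 2L, 3L, 4L, 6L, 5L, 7L),
  dad = c(0L, 0L, 1L, 1L, 1L, 3L, 5L),
  mum = c(0L, 0L, 0L, 2L, 4L, 4L, 6L),
  GEN = c(1L, 1L, 2L, 2L, 3L, 3L, 4L)
)

# Hypothetical variant: same recursion over rows, vectorized over columns.
# Assumes parents always sit in earlier rows than their offspring.
relationship_vectorised <- function(fam) {
  n   <- nrow(fam)
  dad <- fam$dad
  mum <- fam$mum
  gen <- fam$GEN

  A <- matrix(0, nrow = n, ncol = n)
  diag(A) <- 2 - 0.5^(gen - 1)

  for (t in seq_len(n)) {
    parents <- c(dad[t], mum[t])
    parents <- parents[parents > 0]          # drop unknown parents (coded 0)
    if (length(parents) == 0L || t == 1L) next

    # Extra diagonal term only when both parents are known
    if (length(parents) == 2L) {
      A[t, t] <- A[t, t] + 0.5^gen[t] * A[dad[t], mum[t]]
    }

    # Whole row at once: half the sum of the known parents' columns
    j <- seq_len(t - 1L)
    row_t <- 0.5 * rowSums(A[j, parents, drop = FALSE])
    A[t, j] <- row_t
    A[j, t] <- row_t
  }
  A
}

A_vec <- relationship_vectorised(fam)
A_vec[7, 7]  # 1.91796875, as in the output listed in the question
```

No timing is claimed for this variant; it would need to be benchmarked alongside the other versions and checked against them on real data before use.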

All code

fam <- structure(list(ID = c(1L, 2L, 3L, 4L, 6L, 5L, 7L),
                      dad = c(0L, 0L, 1L, 1L, 1L, 3L, 5L),
                      mum = c(0L, 0L, 0L, 2L, 4L, 4L, 6L),
                      GEN = c(1L, 1L, 2L, 2L, 3L, 3L, 4L)),
                 class = "data.frame", row.names = c(NA, -7L))
A<-matrix(0,nrow=7,ncol=7)
diag(A) <- 2-0.5^(fam[["GEN"]]-1)

t_start <- min(which.max(fam$dad > 0), which.max(fam$mum > 0))
t_end <- max(fam[['ID']])

for (t in t_start:t_end) {
  A[t,t]<- sum(c(A[t,t], 0.5^(fam[t,"GEN"])*A[fam[t,"dad"],fam[t,"mum"]]))

  for(j in 1:(t-1))  {
    A[t,j]<- 0.5 * sum(c(A[j,fam[t,"dad"]],A[j,fam[t,"mum"]]))
    A[j,t]<- A[t,j]
  }
}
A

hom<-function(fam) { 
  library(Matrix)
  library(foreach)
  n<-max(as.numeric(fam[,"ID"])) 
  t<-min(as.numeric(fam[,"ID"])) 
  A<-Matrix(0,nrow=n,ncol=n, sparse=TRUE)

  while(t <=n) {

    s<-max(fam[t,"dad"],fam[t,"mum"]) 
    d<-min(fam[t,"dad"],fam[t,"mum"])
    if (s>0 & d>0 ) 
    { 
      if (fam[t,"GEN"]==999 & s!=d) 
      { warning("both dad and mum should be the same, different for at least       one individual")
        NULL    
      }

      A[t,t]<- 2-0.5^(fam[t,"GEN"]-1)+0.5^(fam[t,"GEN"])*A[fam[t,"dad"],fam[t,"mum"]]
      foreach(j = 1:(t-1),  .combine='c', .packages=c("Matrix", "foreach")) %do%

      { 
        A[t,j]<- 0.5*(A[j,fam[t,"dad"]]+A[j,fam[t,"mum"]])
        A[j,t]<- A[t,j] 
      } 
    } 
    if (s>0 & d==0 )
    { 
      if ( fam[t,"GEN"]==999) 
      { warning("both dad and mum should be the same, one parent equal to zero for at least individual")
        NULL }
      A[t,t]<- 2-0.5^(fam[t,"GEN"]-1) 
      foreach(j = 1:(t-1),  .combine='c', .packages=c("Matrix", "foreach")) %do%  
      { 
        A[t,j]<-0.5*A[j,s]
        A[j,t]<-A[t,j] 
      } 
    } 
    if (s==0 )
    {
      A[t,t]<- 2-0.5^(fam[t,"GEN"]-1)
    }

    # cat(" MatbyGEN: ", t ,"\n") 
    t <- t+1


  } 

  A

}

hom2<-function(fam) { 
  library(Matrix)
  n<-max(as.numeric(fam[,"ID"])) 
  t<-min(as.numeric(fam[,"ID"])) 
  A<-Matrix(0,nrow=n,ncol=n, sparse = T)

  while(t <=n) {

    s<-max(fam[t,"dad"],fam[t,"mum"]) 
    d<-min(fam[t,"dad"],fam[t,"mum"])
    if (s>0 & d>0 ) 
    { 
      if (fam[t,"GEN"]==999 & s!=d) 
      { warning("both dad and mum should be the same, different for at least       one individual")
        NULL    
      }

      A[t,t]<- 2-0.5^(fam[t,"GEN"]-1)+0.5^(fam[t,"GEN"])*A[fam[t,"dad"],fam[t,"mum"]]
      for (j in 1:(t-1)) { 
        A[t,j]<- 0.5*(A[j,fam[t,"dad"]]+A[j,fam[t,"mum"]])
        A[j,t]<- A[t,j] 
      } 
    } 
    if (s>0 & d==0 )
    { 
      if ( fam[t,"GEN"]==999) 
      { warning("both dad and mum should be the same, one parent equal to zero for at least individual")
        NULL }
      A[t,t]<- 2-0.5^(fam[t,"GEN"]-1) 
      for (j in 1:(t-1)) { 
        A[t,j]<-0.5*A[j,s]
        A[j,t]<-A[t,j] 
      } 
    } 
    if (s==0 )
    {
      A[t,t]<- 2-0.5^(fam[t,"GEN"]-1)
    }

    # cat(" MatbyGEN: ", t ,"\n") 
    t <- t+1


  } 

  A

}

hom3<-function(fam) { 
  n<-max(as.numeric(fam[,"ID"])) 
  t<-min(as.numeric(fam[,"ID"])) 
  A<-matrix(0,nrow=n,ncol=n)

  while(t <=n) {

    s<-max(fam[t,"dad"],fam[t,"mum"]) 
    d<-min(fam[t,"dad"],fam[t,"mum"])
    if (s>0 & d>0 ) 
    { 
      if (fam[t,"GEN"]==999 & s!=d) 
      { warning("both dad and mum should be the same, different for at least       one individual")
        NULL    
      }

      A[t,t]<- 2-0.5^(fam[t,"GEN"]-1)+0.5^(fam[t,"GEN"])*A[fam[t,"dad"],fam[t,"mum"]]
      for (j in 1:(t-1)) { 
        A[t,j]<- 0.5*(A[j,fam[t,"dad"]]+A[j,fam[t,"mum"]])
        A[j,t]<- A[t,j] 
      } 
    } 
    if (s>0 & d==0 )
    { 
      if ( fam[t,"GEN"]==999) 
      { warning("both dad and mum should be the same, one parent equal to zero for at least individual")
        NULL }
      A[t,t]<- 2-0.5^(fam[t,"GEN"]-1) 
      for (j in 1:(t-1)) { 
        A[t,j]<-0.5*A[j,s]
        A[j,t]<-A[t,j] 
      } 
    } 
    if (s==0 )
    {
      A[t,t]<- 2-0.5^(fam[t,"GEN"]-1)
    }

    # cat(" MatbyGEN: ", t ,"\n") 
    t <- t+1


  } 

  A

}

library(microbenchmark)
f_changed = function(fam) {
  t_start <- min(which.max(fam$dad > 0), which.max(fam$mum > 0))
  t_end <- max(fam[['ID']])

  A<-matrix(0,nrow=t_end,ncol=t_end)
  diag(A) <- 2-0.5^(fam[["GEN"]]-1)

  for (t in t_start:t_end) {
    A[t,t]<- sum(c(A[t,t], 0.5^(fam[t,"GEN"])*A[fam[t,"dad"],fam[t,"mum"]]))

    for(j in 1:(t-1))  {
      A[t,j]<- 0.5 * sum(c(A[j,fam[t,"dad"]],A[j,fam[t,"mum"]]))
      A[j,t]<- A[t,j]
    }
  }
  A
}
microbenchmark(
  original = {
    hom(fam)
  }
  , non_foreach = {
    hom2(fam)
  }
  , non_sparse_non_each = {
    hom3(fam)
  }
  , final_change = {
  f_changed(fam)
  }
,times = 100
)

