将嵌套的循环减少为R中的单循环 [英] Reducing nested for loop to single loop in R

查看:82
本文介绍了将嵌套的循环减少为R中的单循环的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

此嵌套的for循环可能需要花费一些时间才能运行,具体取决于规格,烫发和K的输入."pop"只是存储所有值的数组.烫发的价值很大,例如10,000.

This nested for loop can take quite some time to run depending on inputs to specs, perms and K. 'pop' is just an array to store all values. Perms is a large value, say 10,000.

K <- 1 

N <- 100 

Hstar <- 10 

perms <- 10000 

specs <- 1:N 

pop <- array(dim = c(c(perms, N), K))

haps <- as.character(1:Hstar)

probs <- rep(1/Hstar, Hstar) 

for(j in 1:perms){
    for(i in 1:K){ 
        if(i == 1){
            pop[j, specs, i] <- sample(haps, size = N, replace = TRUE, prob = probs)
    }
        else{
            pop[j ,, 1] <- sample(haps[s1], size = N, replace = TRUE, prob = probs[s1])
            pop[j ,, 2] <- sample(haps[s2], size = N, replace = TRUE, prob = probs[s2])

        }
    }
}

HAC.mat <- array(dim = c(c(perms, N), K))

for(k in specs){
    for(j in 1:perms){
        for(i in 1:K){ 
            ind.index <- sample(specs, size = k, replace = FALSE) 
            hap.plot <- pop[sample(1:nrow(pop), size = 1, replace = TRUE), ind.index, sample(1:K, size = 1, replace = TRUE)] 
            HAC.mat[j, k, i] <- length(unique(hap.plot))  
       }
   }
}


means <- apply(HAC.mat, MARGIN = 2, mean)
lower <- apply(HAC.mat, MARGIN = 2, function(x) quantile(x, 0.025))
upper <- apply(HAC.mat, MARGIN = 2, function(x) quantile(x, 0.975))

par(mfrow = c(1, 2))

plot(specs, means, type = "n", xlab = "Specimens sampled", ylab = "Unique haplotypes", ylim = c(1, Hstar))
polygon(x = c(specs, rev(specs)), y = c(lower, rev(upper)), col = "gray")
lines(specs, means, lwd = 2)
HAC.bar <- barplot(N*probs, xlab = "Unique haplotypes", ylab = "Specimens sampled", names.arg = 1:Hstar)

为使循环运行更快,我正在考虑将上述循环压缩为一个循环,并从1:(specs * perms)运行一个索引(i),并使用具有floor和ceiling函数的模块化算法来获得完成的工作.我不太确定如何最好地实现这一目标.

To make the loop run faster, I am thinking to condense the above loop into a single loop and having a single index (i) run from 1:(specs*perms) and using modular arithmetic with floor and ceiling functions to get the job done. I am not quite certain how best to implement this.

推荐答案

让我们使用RcppArmadillo. 但是首先,我需要对您的代码进行2件事更改:

Let's use RcppArmadillo. But first, I need to change 2 things to your code:

  • 使用pop作为整数而不是字符数组更容易(并且更快).使用uniquematch来创建对应表很容易.
  • 我需要对pop的前两个维度进行置换,以使访问更加连续.
  • It is easier (and faster) to work with pop as an array of integers rather than characters. It is easy to make a correspondence table using unique and match.
  • I need to permute the first two dimensions of pop so that the accesses are more contiguous.

用于生成pop的新代码:

K <- 1 
N <- 100 
Hstar <- 10 
perms <- 10000
specs <- 1:N 
pop <- array(dim = c(N, perms, K))
haps <- 1:Hstar
probs <- rep(1/Hstar, Hstar) 

for(j in 1:perms){
  for(i in 1:K){ 
    if(i == 1){
      pop[, j, i] <- sample(haps, size = N, replace = TRUE, prob = probs)
    }
    else{
      pop[, j, 1] <- sample(haps[s1], size = N, replace = TRUE, prob = probs[s1])
      pop[, j, 2] <- sample(haps[s2], size = N, replace = TRUE, prob = probs[s2])
    }
  }
}

RcppArmadillo代码生成HAC.mat:

RcppArmadillo code to generate HAC.mat:

// [[Rcpp::depends(RcppArmadillo)]]
#define ARMA_DONT_PRINT_OPENMP_WARNING
#include <RcppArmadillo.h>
#include <RcppArmadilloExtensions/sample.h>
#include <set>
using namespace Rcpp;


int sample_one(int n) {
  return n * unif_rand();
} 

int sample_n_distinct(const IntegerVector& x, 
                      int k,
                      const int * pop_ptr) {

  IntegerVector ind_index = RcppArmadillo::sample(x, k, false); 
  std::set<int> distinct_container;

  for (int i = 0; i < k; i++) {
    distinct_container.insert(pop_ptr[ind_index[i]]);
  }

  return distinct_container.size();
}

// [[Rcpp::export]]
arma::Cube<int> fillCube(const arma::Cube<int>& pop,
                         const IntegerVector& specs,
                         int perms,
                         int K) {

  int N = specs.size();
  arma::Cube<int> res(perms, N, K);

  IntegerVector specs_C = specs - 1;
  const int * pop_ptr;
  int i, j, k;

  for (i = 0; i < K; i++) {
    for (k = 0; k < N; k++) {
      for (j = 0; j < perms; j++) {
        pop_ptr = &(pop(0, sample_one(perms), sample_one(K)));
        res(j, k, i) = sample_n_distinct(specs_C, k + 1, pop_ptr);
      }
    }
  }

  return res;
}

在R中:

Rcpp::sourceCpp('cube-sample.cpp')
HAC.mat <- fillCube(pop, specs, perms, K)

这是您计算机上版本的10倍.

This is 10 times as fast as your version on my computer.

这篇关于将嵌套的循环减少为R中的单循环的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆