加速循环和条件与R [英] Speed up loops and condition with R

查看:89
本文介绍了加速循环和条件与R的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想在R中加快这个代码。



输入是一个数组3x3x3包含整数和基于邻居,如果他们是零,



输出是数组mask_roi,其中包含新值。

  ######从这里开始

list_neig = array(0,dim = c(3,3,3))

mask_roi = array(sample(c(0,1,2),27,replace = T),dim = c(3,3,3))

values_mask = array(1:27,dim = c(3,3,3))

values_mask_melted = melt(values_mask,varnames = c(x,y,z))

## #在数据表中转换3D矩阵4列位置和值
image_melted< - melt(mask_roi,varnames = c(x,y,z))#4 columns:x ,y,z,value

image_melted $ box = rownames(image_melted)

image_melted_non_zeros< -image_melted [!(image_melted $ value == 0),]

box_neigbors = vector(list,nrow(image_melted))

for(i in 1:(nrow(image_melted_non_zeros))){
cat(i,\ n)
x = image_melted_non_zeros [i,1]
y = image_melted_non_zeros [i,2]
z = image_melted_non_zeros [i,3]

box_neigbors [[image_melted_non_zeros [i ,5]]]< - list(nearestNeighbors(values_mask,elem = c(x,y,z),dist = 1,dim = c(3,3,3)))

}

我有box_neighbors向量完成,只是包含在这里展示如何获得它,需要从这里做更快到结束。想法是,检查所有体素不同的零,并检查所有的邻居。如果他的邻居是零,他将具有相同的值,如果不是零,保持原来的。

  for :(nrow(image_melted_non_zeros))){
cat(i,\\\

x = image_melted_non_zeros [i,1]
y = image_melted_non_zeros [i,2]
z = image_melted_non_zeros [i,3]

number_of_nei = length(box_neigbors [[image_melted_non_zeros [i,5]]] [[1]])
value_vozel = mask_roi [x,y,z]它会给这个新值

for(j in 1:number_of_nei){
nei_number = box_neigbors [[image_melted_non_zeros [i,5]]] [[1]] [j]

xx = image_melted [nei_number,1]
yy = image_melted [nei_number,2]
zz = image_melted [nei_number,3]

value_nei = mask_roi [ xx,yy,zz]

if(value_nei == 0){
mask_roi [xx,yy,zz] = value_vozel
}
}
}

我需要为256x256x256数组而不是3x3x3执行此操作。



非常感谢!

  nearestNeighbors<  -  function(ary,elem,dist,dims){
usedims< - mapply (el,d){
seq(max(1,el-dist),min(d,el + dist))
},elem,dims,SIMPLIFY = FALSE)
df ; - as.matrix(do.call('expand.grid',usedims))
ndist <-sqrt(apply(df,1,function(x)sum((x- elem)^ 2) )
ret< - df [which(ndist> 0& ndist< = dist),, drop = FALSE]

return(ary [ret])

}


解决方案

使用Kd树。它可以处理一个256x256x256阵列在〜13秒运行在MacBookPro与16GB RAM和2.3 GHz i.7处理器。你没有给出任何具体的基准,但我认为13s是合理的,足以发布答案。我已经概述了我的步骤如下。请让我知道,如果我误解了部分问题。



设置



我们有一个边长为n 。
框中的一个点由坐标i,j,k确定,
的范围为1到n。总共,该框包含n ^ 3个唯一点。
每个点都有一个关联的整数值0,1或2.



问题:

对于n = 256的盒子。
对于具有0值的每个点P,找到它的k最近的
非零值邻居并且用邻居的值更新P.
更新后,框中的每个点都应为非零。



解决方案:



我们的盒子有16,777,216(256 ^ 3)点,所以暴力的方法出来了。
幸运的是,这正是Kd树对
有用的 https:// en .wikipedia.org / wiki / K-d_tree
有几个关注度量数据结构的R库。
我使用FNN这个例子,因为我认为它有更强大的API
比替代 https://cran.r-project.org/web/packages/FNN/index.html



代码



框被表示为带有列名(i,j,k,value )。
每行代表框中的单个点。

  set.seed(256)
库FNN)
len = 256
values = c(0,1,2)
createBox = function(n,vals){
index = 1:len ^ 3
value = sample(vals,length(index),replace = T)
box = as.matrix(cbind(index,index,index,value))
dimnames c(i,j,k,value))
box
}
box = createBox(len,values)

knnx.index函数接受盒矩阵和查询矩阵(盒矩阵的子集)
作为参数,并返回最近邻

  updateZeroValuedPoints = function(box,kval){
zeroPointIndx = which [,value] == 0)
nonZeroPoints = box [-1 * zeroPointIndx,]
zeroPoints = box [zeroPointIndx,]
nnIdx = knnx.index(nonZeroPoints,zeroPoints,k = kval,algorithm =kd_tree)
zeroPoints [,value] = nonZeroPoints [nnIdx [,ncol(nnIdx)],value]
zeroPoints
}

一旦你有邻居索引,它是一个简单的交换更新值,不需要循环。

  system.time(updateZeroValuedPoints(box,1))
# system.time(updateZeroValuedPoints(box,1))
#用户系统已过
#13.517 1.162 14.676


b $ b

希望这是有用的,并且接近你的表现期望。


I would like to speed up this code in R.

The input is an array 3x3x3 containing integer number and based on the neighbors, if they are zero, replace them for the respective number.

The output is the array "mask_roi" with the new values.

###### Start here

list_neig = array(0, dim = c(3,3,3))

mask_roi = array(sample(c(0,1,2),27,replace=T), dim = c(3,3,3))

values_mask = array(1:27, dim = c(3,3,3))

values_mask_melted = melt(values_mask, varnames=c("x","y","z"))

### Tranform the 3D Matrix in a data.table wit 4 columns position and value
image_melted  <- melt(mask_roi, varnames=c("x","y","z"))  # 4 columns: x, y, z, value

image_melted$box = rownames(image_melted)

image_melted_non_zeros<-image_melted[!(image_melted$value==0),]

box_neigbors = vector("list", nrow(image_melted))

for (i in 1:(nrow(image_melted_non_zeros))){
  cat(i,"\n")
  x = image_melted_non_zeros[i,1]
  y = image_melted_non_zeros[i,2]
  z = image_melted_non_zeros[i,3] 

  box_neigbors[[image_melted_non_zeros[i,5]]] <- list(nearestNeighbors(values_mask, elem = c(x,y,z), dist = 1,dim = c(3,3,3)))

}

I have the "box_neighbors" vector done, just included it here to show how to get it, we need to make faster from here to the end. The idea is, check all voxel different of zero and check all his neighbors. If his neighbor is zero, he will have the same value, if not zero keep it the original.

for (i in 1:(nrow(image_melted_non_zeros))){
  cat(i,"\n")  
  x = image_melted_non_zeros[i,1]
  y = image_melted_non_zeros[i,2]
  z = image_melted_non_zeros[i,3] 

  number_of_nei = length(box_neigbors[[image_melted_non_zeros[i,5]]][[1]] )
  value_vozel = mask_roi[x,y,z]  # it will give this new value 

  for (j in 1:number_of_nei){
    nei_number = box_neigbors[[image_melted_non_zeros[i,5]]][[1]][j]

    xx = image_melted[nei_number,1]
    yy = image_melted[nei_number,2]
    zz = image_melted[nei_number,3]     

    value_nei = mask_roi[xx,yy,zz]

    if(value_nei == 0){      
      mask_roi[xx,yy,zz] = value_vozel
    }  
  }  
}

I need do this for 256x256x256 array not 3x3x3.

Thanks a lot!

nearestNeighbors <- function(ary, elem, dist, dims){
  usedims <- mapply(function(el, d) {
    seq(max(1, el - dist), min(d, el + dist))
  }, elem, dims, SIMPLIFY=FALSE)
  df <- as.matrix(do.call('expand.grid', usedims))
  ndist <- sqrt(apply(df, 1, function(x) sum((x - elem)^2)))
  ret <- df[which(ndist > 0 & ndist <= dist),,drop = FALSE]

  return(ary[ret])

}

解决方案

I put together an implementation that uses K-d Trees. It can process a 256x256x256 array in ~13 seconds running on a MacBookPro with 16GB RAM and 2.3 GHz i.7 processor. You didn't give any specific benchmark, but I think 13s is reasonable enough to post an answer. I've outlined my steps below. Please let me know if I have misunderstood part of the question.

Setup:

We have a box of side length n filled with points. A point in the box is determined by coordinates i,j,k which can range from 1 to n. In total, the box contains n^3 unique points. Each point has an associated integer value 0, 1 or 2.

The Problem:

With box having n = 256. For each point P having a 0 value, find its k nearest NON-ZERO-VALUED neighbor and update P with that neighbor's value. After the update every point in the box should be non-zero.

Solution:

Our box has 16,777,216 (256^3) points so brute force methods are out. Luckily this is exactly what K-d Trees are useful for https://en.wikipedia.org/wiki/K-d_tree. There are a few R libraries focused on metric data structures. I am using FNN for this example since I think it has a more robust API than alternatives https://cran.r-project.org/web/packages/FNN/index.html.

The Code:

The box is represented as a matrix with column names (i, j, k, value). Each row represents a single point in the box.

set.seed(256)
library(FNN)
len = 256
values = c(0, 1, 2)
createBox = function(n, vals) {
    index = 1:len^3
    value = sample(vals, length(index), replace = T)
    box = as.matrix(cbind(index, index, index, value))
    dimnames(box) = list(NULL, c("i", "j", "k", "value"))
    box
}
box= createBox(len, values)

The knnx.index function accepts the box matrix and a query matrix (subset of box matrix) as arguments and returns the nearest neighbor indices for each point in the query.

updateZeroValuedPoints = function(box, kval) {
  zeroPointIndx = which(box[ , "value"] == 0)
  nonZeroPoints = box[-1 * zeroPointIndx, ]
  zeroPoints = box[zeroPointIndx, ]
  nnIdx = knnx.index(nonZeroPoints, zeroPoints, k = kval, algorithm = "kd_tree")
  zeroPoints[, "value"] = nonZeroPoints[nnIdx[ , ncol(nnIdx)], "value"]
  zeroPoints
}

Once you have the neighbor indices it is a straightforward swap to update the values, no for loops required.

system.time(updateZeroValuedPoints(box, 1))
# > system.time(updateZeroValuedPoints(box, 1))
# user  system elapsed
# 13.517   1.162  14.676

Hopefully this is useful and somewhere near your performance expectations.

这篇关于加速循环和条件与R的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆