Vectorizing a loop


Problem description

I am creating some artificial data. I need to create a household ID (H_ID) and a personal ID (P_ID, numbered within each household).

I found a way to create H_ID in a vectorized way.

N <- 50

### Household ID
# loop-for
set.seed(20110224)
H_ID <- vector("integer", N)
H_ID[1] <- 1
for (i in 2:N) if (runif(1) < .5) H_ID[i] <- H_ID[i-1]+1 else H_ID[i] <- H_ID[i-1]
print(H_ID)

# vectorised form
set.seed(20110224)
r <- c(0, runif(N-1))
H_ID <- cumsum(r < .5)
print(H_ID)

But I cannot figure out how to create P_ID in a vectorized way.

### Person ID
# loop-for
P_ID <- vector("integer", N)
P_ID[1] <- 1
for (i in 2:N) if (H_ID[i] > H_ID[i-1]) P_ID[i] <- 1 else P_ID[i] <- P_ID[i-1]+1
print(cbind(H_ID, P_ID))

# vectorised form
# ???

Solution

Inspired by Martin Morgan's solution to a closely related question, here's a truly vectorized way to generate the P_ID using the cummax function. It becomes clear once you note that P_ID is closely related to the cumsum of !(r < 0.5):

set.seed(1)
N <- 10
r <- c(0, runif(N-1))
H_ID <- cumsum(r < .5)
r_ <- r >= .5 # flip the coins that generated H_ID.
z <- cumsum(r_)  # this is almost P_ID; just need to subtract the right amount...
# ... and the right amount to subtract is obtained via cummax
P_ID <- 1 + z - cummax( z * (!r_) )
> cbind(H_ID, P_ID)
      H_ID P_ID
 [1,]    1    1
 [2,]    1    2
 [3,]    2    1
 [4,]    3    1
 [5,]    3    2
 [6,]    3    3
 [7,]    3    4
 [8,]    4    1
 [9,]    5    1
[10,]    5    2
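
To see why the subtraction works, it can help to print the intermediate vectors. The sketch below is an illustration, not part of the original answer. It hard-codes the H_ID column from the output above instead of drawing random numbers, and recovers the flipped coins from it with diff(), which assumes H_ID never decreases. The names new_household and offset are made up for this example.

```r
# H_ID copied from the output above, so no random numbers are needed.
H_ID <- c(1, 1, 2, 3, 3, 3, 3, 4, 5, 5)

# TRUE where a new household starts (the first row always starts one).
new_household <- c(TRUE, diff(H_ID) > 0)

# Same role as r_ in the answer: TRUE for rows that continue a household.
r_ <- !new_household

# Running count of "continuation" rows over the whole vector.
z <- cumsum(r_)

# z * (!r_) keeps z only on rows that start a household (0 elsewhere);
# cummax then carries that value forward to the later rows of the household.
offset <- cummax(z * (!r_))

P_ID <- 1 + z - offset
print(cbind(H_ID, r_, z, offset, P_ID))
```

Because z never decreases, the largest value of z * (!r_) seen so far is always the one from the most recent household start, which is why cummax picks out the right amount to subtract.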

I haven't done detailed timing tests, but it's probably wicked fast, since these are all internal, vectorized functions.
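
For anyone who wants to check this, here is a minimal sketch, not from the original answer, that compares the cummax version with the loop from the question and with sequence(rle(H_ID)$lengths), a different base-R idiom that numbers the rows within each run of equal values. The function names p_id_loop and p_id_cummax and the size N = 1e5 are arbitrary choices for illustration. Timings vary by machine, so none are quoted here.

```r
# Loop version, following the question.
p_id_loop <- function(H_ID) {
  N <- length(H_ID)
  P_ID <- vector("integer", N)
  P_ID[1] <- 1L
  for (i in seq_len(N)[-1]) {
    P_ID[i] <- if (H_ID[i] > H_ID[i - 1]) 1L else P_ID[i - 1] + 1L
  }
  P_ID
}

# cummax version, following the answer; r is the vector of uniform draws.
p_id_cummax <- function(r) {
  r_ <- r >= .5
  z <- cumsum(r_)
  1 + z - cummax(z * (!r_))
}

set.seed(20110224)
N <- 1e5
r <- c(0, runif(N - 1))
H_ID <- cumsum(r < .5)

P_loop <- p_id_loop(H_ID)
P_vec  <- p_id_cummax(r)
P_rle  <- sequence(rle(H_ID)$lengths)  # alternative idiom, not the answer's method

# All three should agree element by element.
stopifnot(all(P_loop == P_vec), all(P_loop == P_rle))

# Rough timings; results depend on the machine and R version.
print(system.time(p_id_loop(H_ID)))
print(system.time(p_id_cummax(r)))
```

The rle variant only needs H_ID, so it may be handy when the random draws r are no longer available.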
