Create counter within consecutive runs of certain values

Problem description

I have an hourly value. I want to count how many consecutive hours the value has been zero since the last time it was not zero. This is an easy job for a spreadsheet or for loop, but I am hoping for a snappy vectorized one-liner to accomplish the task.

x <- c(1, 0, 1, 0, 0, 0, 1, 1, 0, 0)
df <- data.frame(x, zcount = NA)

df$zcount[1] <- ifelse(df$x[1] == 0, 1, 0)
for(i in 2:nrow(df)) 
  df$zcount[i] <- ifelse(df$x[i] == 0, df$zcount[i - 1] + 1, 0)

Desired output:

R> df
   x zcount
1  1      0
2  0      1
3  1      0
4  0      1
5  0      2
6  0      3
7  1      0
8  1      0
9  0      1
10 0      2

Recommended answer

Here's a way, building on Joshua's rle approach: (EDITED to use seq_len and lapply as per Marek's suggestion)

> (!x) * unlist(lapply(rle(x)$lengths, seq_len))
 [1] 0 1 0 1 2 3 0 0 1 2
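
To see why the one-liner works, it can help to split it into its intermediate steps. The following is only a sketch of that breakdown; the variable names `runs`, `pos_in_run`, `zcount` and `zcount2` are made up for illustration and do not appear in the answer. The last line uses base R's `sequence()`, which should build the same within-run positions as the `lapply`/`seq_len` combination.

```r
x <- c(1, 0, 1, 0, 0, 0, 1, 1, 0, 0)

# Step 1: run-length encode x; lengths holds the size of each run.
runs <- rle(x)
runs$lengths   # 1 1 1 3 2 2

# Step 2: number the positions inside every run, starting from 1.
pos_in_run <- unlist(lapply(runs$lengths, seq_len))
pos_in_run     # 1 1 1 1 2 3 1 2 1 2

# Step 3: !x is TRUE only where x is zero, so multiplying by it
# zeroes out the counts that belong to runs of non-zero values.
zcount <- (!x) * pos_in_run
zcount         # 0 1 0 1 2 3 0 0 1 2

# sequence() builds the same within-run positions in one call.
zcount2 <- (!x) * sequence(runs$lengths)
```

Note that `!x` treats every non-zero value as "not zero", so this works for any numeric `x` without `NA`s, not just 0/1 data.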

UPDATE. Just for kicks, here's another way to do it, around 5 times faster:

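cumul_zeros <- function(x)  {
  x <- !x                 # TRUE where the input is zero
  rl <- rle(x)
  len <- rl$lengths
  v <- rl$values
  cumLen <- cumsum(len)   # index of the last element of each run
  z <- x
  # Below, a "1-block" is a run of TRUE in the negated x (zeros in the input)
  # and a "zero-block" is a run of FALSE (non-zeros in the input).
  # Replace the 0 at the end of each zero-block in z by the
  # negative of the length of the preceding 1-block....
  iDrops <- c(0, diff(v)) < 0
  z[ cumLen[ iDrops ] ] <- -len[ c(iDrops[-1],FALSE) ]
  # ... to ensure that the cumsum below restarts from 0 at the next 1-block.
  # We zap the cumsum with x so only the cumsums for the 1-blocks survive:
  x*cumsum(z)
}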

Try an example:

> cumul_zeros(c(1,1,1,0,0,0,0,0,1,1,1,0,0,1,1))
 [1] 0 0 0 1 2 3 4 5 0 0 0 1 2 0 0
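
The reset trick inside `cumul_zeros` is easier to follow on a short vector. The trace below replays the same steps outside the function; it is a sketch for illustration only, and the input `x0` and the names `nz`, `end`, `drops`, `running` and `result` are assumptions, not part of the answer.

```r
x0 <- c(0, 0, 1, 1, 1, 0)      # illustrative input, not from the answer

nz  <- !x0                      # TRUE where x0 is zero
rl  <- rle(nz)
len <- rl$lengths               # 2 3 1
end <- cumsum(len)              # last index of each run: 2 5 6

# Runs of FALSE that directly follow a run of TRUE.
drops <- c(0, diff(rl$values)) < 0     # FALSE TRUE FALSE

# Start from 1 at the zeros of x0 and 0 elsewhere, then write the negated
# length of the preceding TRUE run at the last position of each such FALSE run.
z <- as.numeric(nz)
z[end[drops]] <- -len[c(drops[-1], FALSE)]
z                               # 1 1 0 0 -2 1

# The cumulative sum keeps its old value through the FALSE run and only
# drops back to 0 at the run's last element, ready for the next TRUE run.
running <- cumsum(z)            # 1 2 2 2 0 1

# Multiplying by nz hides the stale values carried through the FALSE run.
result <- nz * running          # 1 2 0 0 0 1
```

Positions 3 and 4 show why the final multiplication is needed: the running sum is still 2 there, even though the input is non-zero.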

Now compare times on a million-length vector:

> x <- sample(0:1, 1000000, TRUE)
> system.time( z <- cumul_zeros(x))
   user  system elapsed 
   0.15    0.00    0.14 
> system.time( z <- (!x) * unlist( lapply( rle(x)$lengths, seq_len)))
   user  system elapsed 
   0.75    0.00    0.75 
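
For comparison, here is a different vectorized formulation that avoids `rle` altogether. It is not from the answer above: it swaps in a `cummax` trick, where each position's count is its index minus the index of the most recent non-zero element. The helper names `zero_run_count` and `loop_count` are hypothetical, and the sketch assumes `x` contains no `NA`s. No timings are claimed here; `system.time` on your own data is the way to compare.

```r
# Hypothetical helper; not part of the original answer.
zero_run_count <- function(x) {
  i <- seq_along(x)
  # Index of the most recent non-zero element seen so far (0 if none yet).
  last_nonzero <- cummax(i * (x != 0))
  i - last_nonzero
}

# Reference implementation: the explicit loop idea from the question.
loop_count <- function(x) {
  out <- numeric(length(x))
  run <- 0
  for (k in seq_along(x)) {
    run <- if (x[k] == 0) run + 1 else 0
    out[k] <- run
  }
  out
}

# Cross-check the two on random 0/1 data.
set.seed(1)
x <- sample(0:1, 1000, TRUE)
stopifnot(all(zero_run_count(x) == loop_count(x)))

zero_run_count(c(1, 0, 1, 0, 0, 0, 1, 1, 0, 0))
# 0 1 0 1 2 3 0 0 1 2
```

The cross-check against the loop is a cheap way to validate any of the vectorized versions before relying on them.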

Moral of the story: one-liners are nicer and easier to understand, but not always the fastest!
