How to Detect and Mark Change within a Column in Another Column


Question

I'm trying to mark when a process starts and ends. The code needs to detect the row where the process begins and the row where it ends, and flag each of them in another column.

Sample data:

date  process 
2007     0            
2008     1
2009     1
2010     1
2011     1
2012     1
2013     0
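
For reproducibility, the sample data can be built as a data frame. This is a minimal sketch; the name df1 is an assumption taken from the code in the answer, since the question itself does not name the object.

```r
# Rebuild the sample data shown above as a data frame.
# The object name df1 is assumed (it matches the answer's code).
df1 <- data.frame(
  date    = 2007:2013,
  process = c(0, 1, 1, 1, 1, 1, 0)
)
print(df1)
```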

Goal:

date  process        Status
2007     0             NA
2008     1        Process_START
2009     1             NA
2010     1             NA
2011     1             NA
2012     1        Process_END
2013     0             NA


Recommended answer

One option is to compute the differences with diff() and combine them shifted in both directions (the change on entering a row and the change on leaving it):

dif <- diff(df1$process)
df1$Status <- factor(c(NA, dif) - 2 * c(dif, NA), levels = -3:3)
levels(df1$Status) <- c(rep(NA, 4), "Start", "End", "Start&End")
#   date process Status
# 1 2007       0   <NA>
# 2 2008       1  Start
# 3 2009       1   <NA>
# 4 2010       1   <NA>
# 5 2011       1   <NA>
# 6 2012       1    End
# 7 2013       0   <NA>
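
To see why the codes 1, 2 and 3 map to "Start", "End" and "Start&End", it may help to look at the intermediate vectors. The sketch below assumes process contains only 0 and 1; the names lagged, lead and code are illustrative and not part of the original answer.

```r
process <- c(0, 1, 1, 1, 1, 1, 0)
dif <- diff(process)

lagged <- c(NA, dif)  # change on entering each row:  1 means 0 -> 1
lead   <- c(dif, NA)  # change on leaving each row:  -1 means 1 -> 0

# Start:     lagged =  1, lead =  0  ->  1 - 2 *  0 = 1
# End:       lagged =  0, lead = -1  ->  0 - 2 * -1 = 2
# Start&End: lagged =  1, lead = -1  ->  1 - 2 * -1 = 3
code <- lagged - 2 * lead
print(code)
# [1] NA  1  0  0  0  2 NA
```

Every other code (-3 to 0) corresponds to a row that is neither a start nor an end, which is why the first four levels are set to NA.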



Update

Version without factors:

dif <- diff(df1$process)
df1$Status <- c(NA, dif) - 2 * c(dif, NA)
df1$Status <- c(rep(NA,4), "Start", "End", "Start&End")[df1$Status + 4]

Note that a process lasting a single year starts and ends in the same row, so that row is labeled "Start&End".

If the series starts (or ends) with process = 1, the expected output for that row might be Start (or End) rather than NA:

dif <- diff(df1$process)
df1$Status <- c(df1$process[1], dif) - 2 * c(dif, -tail(df1$process,1))
df1$Status <- c(rep(NA,4), "Start", "End", "Start&End")[df1$Status + 4]
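
The edge-aware version can be wrapped in a small function to try it on boundary cases. This is only a sketch: the function name mark_status is hypothetical, and it assumes process is a numeric vector of 0s and 1s with no missing values.

```r
# Hypothetical helper wrapping the edge-aware version above.
# Assumes `process` holds only 0 and 1 and has no NA values.
mark_status <- function(process) {
  dif <- diff(process)
  # Treat the series as if it were padded with 0 on both sides.
  code <- c(process[1], dif) - 2 * c(dif, -tail(process, 1))
  c(rep(NA, 4), "Start", "End", "Start&End")[code + 4]
}

mark_status(c(0, 1, 1, 1, 1, 1, 0))
# [1] NA      "Start" NA      NA      NA      "End"   NA

mark_status(c(1, 1, 0, 1))
# [1] "Start"     "End"       NA          "Start&End"
```

In the second call the series begins and ends with process = 1, so the first row is marked Start and the last row, a one-year process at the boundary, is marked Start&End.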

A more complex example:

set.seed(4)
df1 <- data.frame(date = 2007:(2007+24), process = sample(c(0,1, 1), 25, TRUE))

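The last version produces the following (the sampled values may differ across R versions, since the default sampling algorithm changed in R 3.6.0):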

#    date process    Status
# 1  2007       1 Start&End
# 2  2008       0      <NA>
# 3  2009       0      <NA>
# 4  2010       0      <NA>
# 5  2011       1 Start&End
# 6  2012       0      <NA>
# 7  2013       1     Start
# 8  2014       1      <NA>
# 9  2015       1       End
# 10 2016       0      <NA>
# 11 2017       1 Start&End
# 12 2018       0      <NA>
# 13 2019       0      <NA>
# 14 2020       1     Start
# 15 2021       1      <NA>
# 16 2022       1      <NA>
# 17 2023       1      <NA>
# 18 2024       1      <NA>
# 19 2025       1      <NA>
# 20 2026       1      <NA>
# 21 2027       1      <NA>
# 22 2028       1      <NA>
# 23 2029       1      <NA>
# 24 2030       1      <NA>
# 25 2031       1       End
