Carry value forward in dplyr chain
Question
Suppose I have the following column:
**CurrentStatus**
Current
NoChange
NoChange
NoChange
NoChange
Late
I want to mutate it so that if the value is "NoChange" it uses the prior value.
I tried:
myDF %>% mutate(CurrentStatus = ifelse(CurrentStatus == "NoChange", lag(CurrentStatus), CurrentStatus))
That doesn't seem to work. I think it's because the calculation is vectorized, so every lag is taken from the original column at once rather than from the already-updated values. I need it to "roll forward". What is the most efficient way to do this without a for loop? I specifically want to avoid a for loop because there are grouping variables, not shown here, that I need to respect.
Thanks!
Recommended answer
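To see why the lag() attempt in the question stops after one step, here is a minimal sketch using the question's column as a plain character vector. The variable names `status`, `shifted`, and `attempt` are made up for illustration.

```r
library(dplyr)

# The column from the question, as a plain character vector
status <- c("Current", "NoChange", "NoChange", "NoChange", "NoChange", "Late")

# lag() shifts the ORIGINAL vector by one position; it never sees updated values
shifted <- lag(status)

# So only a "NoChange" that directly follows a real value gets repaired
attempt <- ifelse(status == "NoChange", shifted, status)
print(attempt)
```

Only the second element becomes "Current". Elements three to five still read "NoChange", because their lagged neighbour in the original vector is itself "NoChange".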
We can replace 'NoChange' with NA and then use fill:
library(tidyverse)
myDF %>%
    mutate(CurrentStatus = replace(CurrentStatus, CurrentStatus == "NoChange", NA)) %>%
    fill(CurrentStatus)
# CurrentStatus
#1 Current
#2 Current
#3 Current
#4 Current
#5 Current
#6 Late
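The question mentions grouping variables that are not shown. A rough sketch of how the same idea might look per group follows. The `Account` column and the sample rows are assumptions made up for illustration. Since fill() works within groups on a grouped data frame, a value should not leak from one group into the next.

```r
library(dplyr)
library(tidyr)

# Hypothetical data: `Account` is an assumed grouping column, not from the question
myDF <- data.frame(
  Account = c("A", "A", "A", "B", "B", "B"),
  CurrentStatus = c("Current", "NoChange", "NoChange", "Late", "NoChange", "Current"),
  stringsAsFactors = FALSE
)

result <- myDF %>%
  group_by(Account) %>%
  # na_if() turns the "NoChange" placeholder into NA
  mutate(CurrentStatus = na_if(CurrentStatus, "NoChange")) %>%
  # fill() carries the last non-NA value down, separately within each Account
  fill(CurrentStatus) %>%
  ungroup()

print(result)
```

If a group starts with "NoChange", there is nothing to carry forward, so that row stays NA.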
Another option is na.locf from zoo:
library(zoo)
# na.rm = FALSE keeps a leading NA, so the result has the same length as the column
myDF$CurrentStatus <- with(myDF, na.locf(replace(CurrentStatus,
    CurrentStatus == "NoChange", NA), na.rm = FALSE))
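If neither tidyr nor zoo is available, the same carry-forward can be sketched in base R with a cumsum() index. This is a different technique from the two above, and the helper name `carry_forward` is hypothetical.

```r
# Hypothetical helper: carry the last real value forward over a placeholder
carry_forward <- function(x, placeholder = "NoChange") {
  is_value <- !is.na(x) & x != placeholder
  # idx counts how many real values have been seen so far (0 before the first)
  idx <- cumsum(is_value)
  # Prepend NA so that idx == 0 (nothing seen yet) maps to NA
  c(NA, x[is_value])[idx + 1]
}

status <- c("Current", "NoChange", "NoChange", "NoChange", "NoChange", "Late")
filled <- carry_forward(status)
print(filled)
```

Because it is an ordinary vectorized function, it could also be called inside mutate() after group_by(), in which case it would run once per group.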