Replace 0's with previous non-zero value per ID (lag)
Question
How can I replace all 0's with the last non-zero value per ID in R?
Example:
Input:
df <- data.frame(ID = c(1,1,1,1,1,1,1,2,2,2,2),
Var1 = c(0,10, 30, 0, 0,50,80,0, 0, 57, 0))
Desired output:
df <- data.frame(ID = c(1,1,1,1,1,1,1,2,2,2,2),
Var1 = c(0,10, 30, 0, 0,50,80,0, 0, 57, 0),
res = c(0,10,30,30,30,50,80,0,0,57,57))
Is there an easy way to do this with the lag function?
Recommended answer
Here is one simple way:
library(tidyverse)
df %>%
group_by(ID) %>%
mutate(x = replace(Var1, cumsum(Var1 !=0) > 0 & Var1 == 0, NA)) %>%
fill(x)
# # A tibble: 11 x 4
# # Groups: ID [2]
# ID Var1 res x
# <dbl> <dbl> <dbl> <dbl>
# 1 1. 0. 0. 0.
# 2 1. 10. 10. 10.
# 3 1. 30. 30. 30.
# 4 1. 0. 30. 30.
# 5 1. 0. 30. 30.
# 6 1. 50. 50. 50.
# 7 1. 80. 80. 80.
# 8 2. 0. 0. 0.
# 9 2. 0. 0. 0.
# 10 2. 57. 57. 57.
# 11 2. 0. 57. 57.
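In the mutate step, we replace 0's with NA, except for those at the beginning of each ID run: no earlier non-zero value exists for them, so fill() would have nothing to carry forward and they would remain NA. The condition cumsum(Var1 != 0) > 0 is TRUE from the first non-zero value of each group onward, so only the later zeros become NA. fill(x) then copies the last non-NA value downward within each ID.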
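To make the masking step concrete, here is a minimal base R sketch that applies the same condition to a single group's values (the Var1 values for ID 1 from the question). The names v, seen_nonzero and to_fill are illustrative only and do not appear in the original answer.

```r
# Values of Var1 for ID 1, copied from the question's example
v <- c(0, 10, 30, 0, 0, 50, 80)

# TRUE from the first non-zero value onward, FALSE for leading zeros
seen_nonzero <- cumsum(v != 0) > 0

# Zeros that have an earlier non-zero value to inherit
to_fill <- seen_nonzero & v == 0

# Same replace() call as in the mutate step: only those zeros become NA
x <- replace(v, to_fill, NA)

print(seen_nonzero)
print(x)
```

The leading 0 stays as it is, while the two zeros after 30 become NA and are therefore the only positions a later fill would overwrite.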
If you have multiple columns to adjust, you can use:
df %>%
  group_by(ID) %>%
  # across() (dplyr >= 1.0.0) replaces mutate_at()/funs(); funs() is deprecated since dplyr 0.8.0
  mutate(across(starts_with("Var"),
                ~ replace(.x, cumsum(.x != 0) > 0 & .x == 0, NA))) %>%
  fill(starts_with("Var"))
where df could be:
df <- data.frame(ID = c(1,1,1,1,1,1,1,2,2,2,2),
Var1 = c(0,10, 30, 0, 0,50,80,0, 0, 57, 0),
Var2 = c(4,0, 30, 0, 0,50,0,16, 0, 57, 0))
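For comparison, the same result can probably be reached without any packages. The sketch below is an alternative technique, not the method from the answer: fill_zeros is a hypothetical helper that looks up the index of the most recent non-zero element, and ave() applies it per ID. It assumes the columns contain no NA values.

```r
# Hypothetical helper: carry the last non-zero value forward within one vector.
# Assumes v contains no NA values.
fill_zeros <- function(v) {
  # Index of the most recent non-zero element at each position (0 if none yet)
  last_nz <- cummax(seq_along(v) * (v != 0))
  out <- v
  has_prev <- last_nz > 0
  # Leading zeros have no earlier non-zero value, so they are left unchanged
  out[has_prev] <- v[last_nz[has_prev]]
  out
}

df <- data.frame(ID   = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2),
                 Var1 = c(0, 10, 30, 0, 0, 50, 80, 0, 0, 57, 0),
                 Var2 = c(4, 0, 30, 0, 0, 50, 0, 16, 0, 57, 0))

# Apply the helper to every column whose name starts with "Var", separately per ID
var_cols <- grep("^Var", names(df), value = TRUE)
df[var_cols] <- lapply(df[var_cols],
                       function(col) ave(col, df$ID, FUN = fill_zeros))

print(df)
```

Unlike the tidyverse version, this overwrites the Var columns in place rather than going through an intermediate NA step.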