Replace 0's with previous non-zero value per ID (lag)

Question

How can I replace every 0 with the last non-zero value within each ID in R?

Example:

Input:

df <- data.frame(ID = c(1,1,1,1,1,1,1,2,2,2,2),
         Var1 = c(0,10, 30, 0, 0,50,80,0, 0, 57, 0)) 

Desired output:

df <- data.frame(ID = c(1,1,1,1,1,1,1,2,2,2,2),
         Var1 = c(0,10, 30, 0, 0,50,80,0, 0, 57, 0),
         res = c(0,10,30,30,30,50,80,0,0,57,57))

Is there an easy way to do this with the lag function?

Recommended answer

Here is a simple way to do it:

library(tidyverse)
df %>% 
  group_by(ID) %>% 
  mutate(x = replace(Var1, cumsum(Var1 !=0) > 0 & Var1 == 0, NA)) %>% 
  fill(x)
# # A tibble: 11 x 4
# # Groups:   ID [2]
# ID  Var1   res     x
# <dbl> <dbl> <dbl> <dbl>
# 1    1.    0.    0.    0.
# 2    1.   10.   10.   10.
# 3    1.   30.   30.   30.
# 4    1.    0.   30.   30.
# 5    1.    0.   30.   30.
# 6    1.   50.   50.   50.
# 7    1.   80.   80.   80.
# 8    2.    0.    0.    0.
# 9    2.    0.    0.    0.
# 10    2.   57.   57.   57.
# 11    2.    0.   57.   57.

In the mutate step, we replace 0's with NA, except for the 0's at the beginning of each ID group (before the first non-zero value), because there is no earlier value to fill them with. fill() then carries the last non-NA value forward within each group, so the leading 0's stay 0 and every later 0 takes the preceding non-zero value.
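To make the masking condition concrete, here is a minimal base R sketch that applies it to a single vector (the Var1 values for ID 1 from the example). The variable names `v`, `seen_nonzero`, and `masked` are illustrative only and are not part of the original answer.

```r
# Var1 values for ID 1, taken from the example data
v <- c(0, 10, 30, 0, 0, 50, 80)

# TRUE from the first non-zero value onward, FALSE for the leading zeros
seen_nonzero <- cumsum(v != 0) > 0

# Only zeros that follow a non-zero value are turned into NA;
# the leading zero is kept because nothing precedes it
masked <- replace(v, seen_nonzero & v == 0, NA)
masked
# 0 10 30 NA NA 50 80
```

These NA positions are the ones fill() later overwrites with the preceding value.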

If you have multiple columns to adjust, you can use:

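# mutate_at() is superseded and funs() is deprecated in current dplyr;
# across() (dplyr >= 1.0.0) applies the same replacement to every Var* column
df %>% 
  group_by(ID) %>% 
  mutate(across(starts_with("Var"), 
                ~ replace(.x, cumsum(.x != 0) > 0 & .x == 0, NA))) %>% 
  fill(starts_with("Var"))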

where df could be:

df <- data.frame(ID = c(1,1,1,1,1,1,1,2,2,2,2),
                 Var1 = c(0,10, 30, 0, 0,50,80,0, 0, 57, 0),
                 Var2 = c(4,0, 30, 0, 0,50,0,16, 0, 57, 0)) 
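If loading the tidyverse is not an option, the same result can probably be reached in base R. The sketch below is an alternative technique, not the method from the answer above: it tracks the index of the last non-zero value with cummax() and applies that per ID with ave(). The helper name `fill_zeros` is hypothetical, and the code assumes the column contains no NA values.

```r
# Hypothetical helper: carry the last non-zero value forward within one vector.
# Assumes v contains no NA values.
fill_zeros <- function(v) {
  # Position of the most recent non-zero element (0 if none seen yet)
  last_nonzero <- cummax(seq_along(v) * (v != 0))
  # Prepend a 0 so that index 0 ("nothing seen yet") maps to 0
  c(0, v)[last_nonzero + 1]
}

df <- data.frame(ID = c(1,1,1,1,1,1,1,2,2,2,2),
                 Var1 = c(0,10, 30, 0, 0,50,80,0, 0, 57, 0))

# ave() applies fill_zeros separately within each ID
df$res <- ave(df$Var1, df$ID, FUN = fill_zeros)
df$res
# 0 10 30 30 30 50 80 0 0 57 57
```

Because ave() returns a vector in the original row order, the rows do not need to be sorted by ID first, although they must be in the intended order within each ID.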
