lag function returns NAs

Problem description

Does anybody have an explanation for this result from the dplyr package?

I have a data.frame df:

    library(dplyr)
    df = data_frame(
      'id' = c(1,2,2,2,2,3,3,3,3),
      'start' = c(881, 1611, 1611, 1642, 1764, 0, 0, 28, 59),
      'end' = c(1089, 1819, 1819, 1850, 1972, 208,  208,236, 267))

which looks like this:

    # Source: local data frame [9 x 3]
    #
    # id start   end
    # (dbl) (dbl) (dbl)
    # 1     1   881  1089
    # 2     2  1611  1819
    # 3     2  1611  1819
    # 4     2  1642  1850
    # 5     2  1764  1972
    # 6     3     0   208
    # 7     3     0   208
    # 8     3    28   236
    # 9     3    59   267

After grouping by id and applying lag to the end column, I expected exactly one missing value per id.

    df %>% 
      group_by(id) %>%
      mutate(end.prev = lag(end))

But I get:

    # Source: local data frame [9 x 4]
    # Groups: id [3]
    # 
    # id start   end end.prev
    # (dbl) (dbl) (dbl)    (dbl)
    # 1     1   881  1089       NA
    # 2     2  1611  1819       NA
    # 3     2  1611  1819     1819
    # 4     2  1642  1850     1819
    # 5     2  1764  1972     1850
    # 6     3     0   208       NA
    # 7     3     0   208       NA  <- I don't understand this NA
    # 8     3    28   236       NA  <- Neither this one
    # 9     3    59   267       NA  <- nor this other

I am using the latest version available on CRAN, dplyr 0.4.3 (my R version is 3.2.5).

Recommended answer

I am using dplyr version 1.0.5 and it seems to work as expected. If you are not tied to a specific version, upgrading dplyr to the latest release may resolve the problem.

    library(tidyverse)
    df = tibble(
      'id' = c(1,2,2,2,2,3,3,3,3),
      'start' = c(881, 1611, 1611, 1642, 1764, 0, 0, 28, 59),
      'end' = c(1089, 1819, 1819, 1850, 1972, 208,  208,236, 267))

    df %>% 
      group_by(id) %>%
      mutate(end.prev = lag(end))
    #> # A tibble: 9 x 4
    #> # Groups:   id [3]
    #>      id start   end end.prev
    #>   <dbl> <dbl> <dbl>    <dbl>
    #> 1     1   881  1089       NA
    #> 2     2  1611  1819       NA
    #> 3     2  1611  1819     1819
    #> 4     2  1642  1850     1819
    #> 5     2  1764  1972     1850
    #> 6     3     0   208       NA
    #> 7     3     0   208      208
    #> 8     3    28   236      208
    #> 9     3    59   267      236

Created on 2021-04-16 by the reprex package (v2.0.0)
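
One way to confirm the behavior on your own installation is to count the missing values per group: a grouped lag should leave exactly one NA per id. The sketch below is illustrative and not part of the original answer. The names na_per_group and n_missing are made up here, and the explicit dplyr::lag() call rests on the assumption that another attached package might mask lag(), which the question does not confirm as the cause.

```r
library(dplyr)

# Same data as in the question.
df <- tibble(
  id    = c(1, 2, 2, 2, 2, 3, 3, 3, 3),
  start = c(881, 1611, 1611, 1642, 1764, 0, 0, 28, 59),
  end   = c(1089, 1819, 1819, 1850, 1972, 208, 208, 236, 267)
)

# Print the installed version; the problem in the question was seen on 0.4.3.
print(packageVersion("dplyr"))

# Namespace the call so that stats::lag() or another package's lag()
# cannot be picked up by accident (an assumption, not a confirmed cause).
na_per_group <- df %>%
  group_by(id) %>%
  mutate(end.prev = dplyr::lag(end)) %>%
  summarise(n_missing = sum(is.na(end.prev)))

# A correct grouped lag leaves one NA per id; this stops with an error otherwise.
stopifnot(all(na_per_group$n_missing == 1))
```

If the final check fails, comparing the printed version against the 1.0.5 used in the answer is a reasonable first step before upgrading.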
