Is there an R function for imputing missing year values, consecutively, by group?


Question

My data frame looks like this:

df <- data.frame(ID=c("A", "A", "A", "A", 
                      "B", "B", "B", "B",
                      "C", "C", "C", "C",
                      "D", "D", "D", "D"),
                 grade=c("KG", "01", "02", "03",
                         "KG", "01", "02", "03",
                         "KG", "01", "02", "03",
                         "KG", "01", "02", "03"),
                 year=c(2002, 2003, NA, 2005,
                        2007, NA, NA, 2010,
                        NA, 2005, 2006, NA,
                        2009, 2010, NA, NA))

I would like to impute the missing year values within each ID, so that the result looks like this:

wanted_df <- data.frame(ID=c("A", "A", "A", "A", 
                             "B", "B", "B", "B",
                             "C", "C", "C", "C",
                             "D", "D", "D", "D"),
                       grade=c("KG", "01", "02", "03",
                               "KG", "01", "02", "03",
                               "KG", "01", "02", "03",
                               "KG", "01", "02", "03"),
                       year=c(2002, 2003, 2004, 2005,
                              2007, 2008, 2009, 2010,
                              2004, 2005, 2006, 2007,
                              2009, 2010, 2011, 2012))

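I have tried to compute the values using:

  • the lag() and lead() functions
  • joining against a data frame made up of the years

Neither approach worked. Any help would be greatly appreciated. Thanks.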

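Recommended answer

We can use na_interpolate and na_extrapolate from the threadr package (installed from GitHub): within each ID, na_interpolate fills the gaps between known years and na_extrapolate fills the missing values at the start or end of the group.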

library(dplyr)
# remotes::install_github("skgrange/threadr")
library(threadr)
df %>% 
   group_by(ID) %>% 
   mutate(year = na_extrapolate(na_interpolate(year))) %>%
   ungroup()

Output:

# A tibble: 16 × 3
   ID    grade  year
   <chr> <chr> <dbl>
 1 A     KG    2002 
 2 A     01    2003 
 3 A     02    2004 
 4 A     03    2005 
 5 B     KG    2007 
 6 B     01    2008 
 7 B     02    2009 
 8 B     03    2010 
 9 C     KG    2004.
10 C     01    2005 
11 C     02    2006 
12 C     03    2007 
13 D     KG    2009 
14 D     01    2010 
15 D     02    2011 
16 D     03    2012.
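
If installing threadr from GitHub is not an option, the same result can probably be reached in base R. The sketch below is not part of the original answer. It assumes that the rows within each ID are already ordered by grade and that each grade is exactly one year after the previous one, so the year minus the row position is constant within a group. The helper name fill_consecutive is hypothetical, introduced here only for illustration.

```r
# Example data from the question
df <- data.frame(ID    = rep(c("A", "B", "C", "D"), each = 4),
                 grade = rep(c("KG", "01", "02", "03"), times = 4),
                 year  = c(2002, 2003, NA, 2005,
                           2007, NA, NA, 2010,
                           NA, 2005, 2006, NA,
                           2009, 2010, NA, NA))

# Hypothetical helper: assumes one row per consecutive year within a group.
# The offset (year - row position) is taken from the first non-missing year;
# a group with no known year at all stays NA.
fill_consecutive <- function(year) {
  idx <- seq_along(year)
  offset <- (year - idx)[!is.na(year)][1]
  offset + idx
}

# Apply the helper within each ID
df$year <- ave(df$year, df$ID, FUN = fill_consecutive)
df
```

Unlike interpolation, this relies on row position alone, so it also rebuilds a group in which only a single year is known. It would give wrong results if a student repeated or skipped a grade.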
