R-按组填写缺少的日期 [英] R - Fill missing dates by group

查看:54
本文介绍了R-按组填写缺少的日期的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

在我的数据中,某些月份中存在某些ID的观测值,例如,

In my data, there exist observations for some IDs in some months and not for others, e.g.

dat <- data.frame(c(1, 1, 1, 2, 3, 3, 3, 4, 4, 4), c(rep(30, 2), rep(25, 5), rep(20, 3)), c('2017-01-01', '2017-02-01', '2017-04-01', '2017-02-01', '2017-01-01', '2017-02-01', '2017-03-01', '2017-01-01',
                    '2017-02-01', '2017-04-01'))
colnames(dat) <- c('id', 'value', 'date')

我想为每个id值插入一行,其中包括该id缺少的月份和valueNA缺少的月份.

I would like to, for each id value, insert a row that includes the month(s) missing for that id and NA for value.

seq(min(as.Date(dat$date)), max(as.Date(dat$date)), by = 'months')的所有月份中,是否有办法(某种程度上)简洁地执行此操作?我经常使用tidyverse和data.table,但可以接受任何方法.

Is there a way to (somewhat) concisely do this for all months in seq(min(as.Date(dat$date)), max(as.Date(dat$date)), by = 'months')? I often use tidyverse and data.table, but am open to any approach.

推荐答案

tidyr::complete()填充缺少的值

添加iddate作为要扩展的列(...)

tidyr::complete() fills missing values

add id and date as the columns (...) to expand for

library(tidyverse)

complete(dat, id, date)


# A tibble: 16 x 3
      id date       value
   <dbl> <date>     <dbl>
 1  1.00 2017-01-01  30.0
 2  1.00 2017-02-01  30.0
 3  1.00 2017-03-01  NA  
 4  1.00 2017-04-01  25.0
 5  2.00 2017-01-01  NA  
 6  2.00 2017-02-01  25.0
 7  2.00 2017-03-01  NA  
 8  2.00 2017-04-01  NA  
 9  3.00 2017-01-01  25.0
10  3.00 2017-02-01  25.0
11  3.00 2017-03-01  25.0
12  3.00 2017-04-01  NA  
13  4.00 2017-01-01  20.0
14  4.00 2017-02-01  20.0
15  4.00 2017-03-01  NA  
16  4.00 2017-04-01  20.0

这篇关于R-按组填写缺少的日期的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆