R - Fill missing dates by group
Question
In my data, observations exist for some IDs in some months but not in others, e.g.
dat <- data.frame(
  id = c(1, 1, 1, 2, 3, 3, 3, 4, 4, 4),
  value = c(rep(30, 2), rep(25, 5), rep(20, 3)),
  date = c('2017-01-01', '2017-02-01', '2017-04-01', '2017-02-01', '2017-01-01',
           '2017-02-01', '2017-03-01', '2017-01-01', '2017-02-01', '2017-04-01')
)
For each id value, I would like to insert a row for every month that is missing for that id, with NA for value.
Is there a (somewhat) concise way to do this for all months in seq(min(as.Date(dat$date)), max(as.Date(dat$date)), by = 'months')? I often use tidyverse and data.table, but am open to any approach.
Recommended answer
tidyr::complete() fills in missing combinations of values.
Pass id and date as the columns (...) to expand over.
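library(tidyverse)
# Convert the text dates to Date so the result prints a <date> column
dat$date <- as.Date(dat$date)
complete(dat, id, date)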
# A tibble: 16 x 3
      id date       value
   <dbl> <date>     <dbl>
 1  1.00 2017-01-01  30.0
 2  1.00 2017-02-01  30.0
 3  1.00 2017-03-01  NA
 4  1.00 2017-04-01  25.0
 5  2.00 2017-01-01  NA
 6  2.00 2017-02-01  25.0
 7  2.00 2017-03-01  NA
 8  2.00 2017-04-01  NA
 9  3.00 2017-01-01  25.0
10  3.00 2017-02-01  25.0
11  3.00 2017-03-01  25.0
12  3.00 2017-04-01  NA
13  4.00 2017-01-01  20.0
14  4.00 2017-02-01  20.0
15  4.00 2017-03-01  NA
16  4.00 2017-04-01  20.0
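One caveat: complete(dat, id, date) only expands over date values that already occur somewhere in the data. It works here because every month from January to April appears for at least one id. If a month were absent for every id, it would be left out. The following is a minimal sketch of one way to cover the full range the question asks about, assuming date is stored as a Date and supplying the month sequence explicitly. The names all_months and filled are illustrative and not from the original answer.

```r
library(tidyr)

# Example data from the question, with date stored as Date rather than text.
dat <- data.frame(
  id    = c(1, 1, 1, 2, 3, 3, 3, 4, 4, 4),
  value = c(rep(30, 2), rep(25, 5), rep(20, 3)),
  date  = as.Date(c("2017-01-01", "2017-02-01", "2017-04-01", "2017-02-01",
                    "2017-01-01", "2017-02-01", "2017-03-01", "2017-01-01",
                    "2017-02-01", "2017-04-01"))
)

# Every month between the earliest and latest date in the data
# (all_months is an illustrative name, not part of the original answer).
all_months <- seq(min(dat$date), max(dat$date), by = "month")

# Expand to every id x month combination; rows that did not exist get NA for value.
filled <- complete(dat, id, date = all_months)
```

With the question's data this should give the same 16 rows as complete(dat, id, date). The difference only shows up when some month in the range is missing for all ids.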