Insert rows with zeros in data frames in R


Problem description

Consider a fragmented dataset like this:

   ID       Date Value
1   1 2012-01-01  5065
4   1 2012-01-04  1508
5   1 2012-01-05  9489
6   1 2012-01-06  7613
7   2 2012-01-07  6896
8   2 2012-01-08  2643
11  3 2012-01-02  7294
12  3 2012-01-03  8726
13  3 2012-01-04  6262
14  3 2012-01-05  2999
15  3 2012-01-06 10000
16  3 2012-01-07  1405
18  3 2012-01-09  8372

Notice that observations are missing for rows 2, 3, 9, 10, and 17. What I would like is to fill some of these gaps in the dataset with rows where Value = 0, like so:

   ID       Date Value
1   1 2012-01-01  5920
2   1 2012-01-02     0
3   1 2012-01-03     0
4   1 2012-01-04  8377
5   1 2012-01-05  7810
6   1 2012-01-06  6452
7   2 2012-01-07  3483
8   2 2012-01-08  5426
9   2 2012-01-09     0
11  3 2012-01-02  7854
12  3 2012-01-03  1948
13  3 2012-01-04  7141
14  3 2012-01-05  5402
15  3 2012-01-06  6412
16  3 2012-01-07  7043
17  3 2012-01-08     0
18  3 2012-01-09  3270

The point is that zeros should only be inserted if there is a past observation for the same (grouped) ID. I would like to avoid any loops, as the full dataset is quite large.

Any suggestions? To reproduce the data frame:

# Build the full 18-row frame first, then drop rows to create the gaps
df <- data.frame(
  ID    = c(1,1,1,1,1,1,2,2,2,3,3,3,3,3,3,3,3,3),
  # the nine dates are repeated twice to cover all 18 rows
  Date  = rep(seq(as.Date("2012-01-01"), as.Date("2012-01-09"), by = "day"), 2),
  # random values, so the numbers differ from run to run
  Value = sample(1000:10000, 18, replace = TRUE)
)
df <- df[-c(2, 3, 9, 10, 17), ]

Recommended answer

There are already some solid answers here, but I would recommend checking out the padr package.

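library(dplyr)
library(padr)

# pad() inserts the missing dates within each ID; inserted rows get Value = NA.
# Note: start_val and end_val apply to every ID, so rows are also added
# before an ID's first observation. Omit start_val to start each ID at
# its own first date.
df %>% 
  pad(start_val = as.Date("2012-01-01"),
      end_val =   as.Date("2012-01-09"),
      group = "ID") %>% 
  # fill_by_value() replaces those NAs with its default value, 0
  fill_by_value(Value)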
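
For comparison, here is a minimal sketch of the same gap-filling idea using tidyr::complete() instead of padr. It is an alternative technique, not the answer's method. It assumes that "past observation" means each ID should be filled only between its own first and last observed date, and it uses fixed illustrative values rather than the sampled ones so the result is reproducible.

```r
library(dplyr)
library(tidyr)

# Example data with fixed (illustrative) values instead of random ones
df <- data.frame(
  ID    = rep(c(1, 2, 3), times = c(6, 3, 9)),
  Date  = rep(seq(as.Date("2012-01-01"), as.Date("2012-01-09"), by = "day"), 2),
  Value = seq(1000, 18000, by = 1000)
)
df <- df[-c(2, 3, 9, 10, 17), ]

filled <- df %>%
  group_by(ID) %>%
  # within each ID, expand to every day between its first and last observation;
  # rows created this way get Value = 0 instead of NA
  complete(Date = seq(min(Date), max(Date), by = "day"),
           fill = list(Value = 0)) %>%
  ungroup()

print(filled)
```

Under this assumption only interior gaps are filled, so no rows appear before an ID's first observation. It also means no trailing row is added for ID 2 on 2012-01-09; replacing max(Date) with a fixed end date would add it, but would extend ID 1 through that date as well.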

The package also provides some fairly intuitive functions for summarizing Date columns.
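
As a rough illustration of that kind of summarizing, the sketch below uses padr's thicken() to add a weekly column and then totals Value per ID and week. The column name week_start and the fixed example values are assumptions made for this sketch, not part of the original answer.

```r
library(dplyr)
library(padr)

# Example data with fixed (illustrative) values instead of random ones
df <- data.frame(
  ID    = rep(c(1, 2, 3), times = c(6, 3, 9)),
  Date  = rep(seq(as.Date("2012-01-01"), as.Date("2012-01-09"), by = "day"), 2),
  Value = seq(1000, 18000, by = 1000)
)
df <- df[-c(2, 3, 9, 10, 17), ]

weekly <- df %>%
  arrange(Date) %>%                                     # sort by date before thickening
  thicken(interval = "week", colname = "week_start") %>% # add a coarser, weekly date column
  group_by(ID, week_start) %>%
  summarise(Value = sum(Value), .groups = "drop")        # total Value per ID and week

print(weekly)
```

The weekly table can then be passed through pad() and fill_by_value() in the same way as the daily data if empty weeks should appear as zeros.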
