Add rows with missing years by group

Problem description

I would like to create new rows in a data.frame for all missing years for each group (firm and type). The data frame looks as follows:

minimal <- data.frame(firm = c("A","A","A","B","B","B","A","A","A","B","B","B"),
                  type = c("X","X","X","X","X","X","Y","Y","Y","Y","Y","Y"),
                  year = c(2000,2004,2007,2010,2008,2001,2002,2003,2007,2000,2001,2008),
                  value = c(1,3,7,9,9,2,3,3,7,5,9,15)
                  )

Data frame:

firm type year value
A    X    2000     1
A    X    2004     3
A    X    2007     7
B    X    2010     9
B    X    2008     9
B    X    2001     2
A    Y    2002     3
A    Y    2003     3
A    Y    2007     7
B    Y    2000     5
B    Y    2001     9
B    Y    2008    15

Now, what I want to get is the following: the data show that the minimum year is 2000 and the maximum is 2010. I want to add a row for each missing year for each firm-type combination. E.g. for firm A and type X, I would like to add rows so that the result looks like this:

Final output:

firm type year value
A    X    2000     1
A    X    2004     3
A    X    2007     7
A    X    2001     1
A    X    2002     1
A    X    2003     1
A    X    2005     3
A    X    2006     3
A    X    2008     7
A    X    2009     7
A    X    2010     7

Additionally, for each added row I want the 'value' column to hold the value from the most recent earlier year, carried forward through all subsequent years until the next non-missing row appears (as seen in the final output example).

I have not yet come up with any useful code, but what I have found so far is the following, which might be a step in the right direction:

library(data.table)
setDT(minimal)[, .SD[match(2000:2010, year)],
               by = c("firm", "type")]

I don't really understand the concepts of setDT and .SD, but this creates at least one row for each firm-type combination. However, there is no content for year.

Thanks a lot!
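
A note on the data.table attempt above: `.SD[match(2000:2010, year)]` selects rows by position, so a year with no match yields an all-NA row, and `year` itself is NA in that row. One possible way around this is sketched below. It assumes the 2000 to 2010 range is hard-coded and that `nafill()` is available (added in data.table 1.12.4, I believe). The idea is to build the full firm/type/year grid with `CJ()`, join the observed rows onto it, and carry values forward per group. The names `dt`, `grid` and `filled` are illustrative only.

```r
library(data.table)

minimal <- data.frame(firm = c("A","A","A","B","B","B","A","A","A","B","B","B"),
                      type = c("X","X","X","X","X","X","Y","Y","Y","Y","Y","Y"),
                      year = c(2000,2004,2007,2010,2008,2001,2002,2003,2007,2000,2001,2008),
                      value = c(1,3,7,9,9,2,3,3,7,5,9,15))

# Work on a copy so the original data.frame is left untouched.
dt <- as.data.table(minimal)

# Every firm/type/year combination; year is kept numeric to match dt$year.
grid <- CJ(firm = unique(dt$firm),
           type = unique(dt$type),
           year = as.numeric(2000:2010))

# Join the observed rows onto the grid; years without data get value = NA.
filled <- dt[grid, on = .(firm, type, year)]

# Carry the last observed value forward within each firm/type group.
filled[, value := nafill(value, type = "locf"), by = .(firm, type)]

# Inspect one group.
filled[firm == "A" & type == "X"]
```

Because `CJ()` returns its rows sorted, the carry-forward runs in chronological order even though the input rows for firm B are not sorted by year. Years before a group's first observation stay NA.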

Recommended answer

I could not find an exact duplicate for this, so here is a possible solution:

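library(dplyr)
library(tidyr)

minimal %>%
  group_by(firm, type) %>%                     # one group per firm-type combination
  complete(year = full_seq(2000:2010, 1)) %>%  # add a row for every year 2000-2010; new rows get value = NA
  fill(value)                                  # carry the last observed value forward within each group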
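
As a usage check, the sketch below runs the same pipeline on the question's data and inspects the result. The `ungroup()` call and the name `result` are additions for illustration, not part of the answer. Note that `fill()` carries values downward only by default, so groups whose first observed year is after 2000 keep NA for the leading years; the question does not say whether those should be back-filled.

```r
library(dplyr)
library(tidyr)

minimal <- data.frame(firm = c("A","A","A","B","B","B","A","A","A","B","B","B"),
                      type = c("X","X","X","X","X","X","Y","Y","Y","Y","Y","Y"),
                      year = c(2000,2004,2007,2010,2008,2001,2002,2003,2007,2000,2001,2008),
                      value = c(1,3,7,9,9,2,3,3,7,5,9,15))

result <- minimal %>%
  group_by(firm, type) %>%
  complete(year = full_seq(2000:2010, 1)) %>%
  fill(value) %>%
  ungroup()   # drop the grouping so later steps are not applied per group

# 4 firm-type groups x 11 years = 44 rows in total.
nrow(result)

# The group used as the example in the question.
filter(result, firm == "A", type == "X")

# Rows that are still NA: years before a group's first observation.
filter(result, is.na(value))
```

If back-filling those leading years were wanted, `fill(value, .direction = "downup")` is one option in recent tidyr versions.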
