Add rows with missing years by group
Question
I would like to create new rows in a data.frame for all missing years within each group (firm and type). The data frame is built as follows:
minimal <- data.frame(firm  = c("A","A","A","B","B","B","A","A","A","B","B","B"),
                      type  = c("X","X","X","X","X","X","Y","Y","Y","Y","Y","Y"),
                      year  = c(2000,2004,2007,2010,2008,2001,2002,2003,2007,2000,2001,2008),
                      value = c(1,3,7,9,9,2,3,3,7,5,9,15)
)
The data frame:
firm type year value
A X 2000 1
A X 2004 3
A X 2007 7
B X 2010 9
B X 2008 9
B X 2001 2
A Y 2002 3
A Y 2003 3
A Y 2007 7
B Y 2000 5
B Y 2001 9
B Y 2008 15
Now, what I want to get is the following: the minimum year in the data is 2000 and the maximum is 2010. I want to add a row for each missing year for each firm-type combination. For example, for firm A and type X, I would like to add rows so that the result looks like this:
Final output:
firm type year value
A X 2000 1
A X 2004 3
A X 2007 7
A X 2001 1
A X 2002 1
A X 2003 1
A X 2005 3
A X 2006 3
A X 2008 7
A X 2009 7
A X 2010 7
Additionally, for each added row I want the 'value' column to carry the value from the most recent earlier year, repeated for all subsequent missing years until the next non-missing row appears (as seen in the final output example).
I have not yet come up with any useful code, but what I have found so far is the following, which might be a step in the right direction:
setDT(minimal)[, .SD[match(2000:2010, year)],
by = c("firm","type")]
I don't really understand the concepts of setDT and .SD, but this creates at least one row per year for each firm-type combination. However, the year column is empty in the added rows.
Many thanks!
Recommended answer
I could not find an exact duplicate of this question, so here is a possible solution:
library(dplyr)
library(tidyr)

minimal %>%
  group_by(firm, type) %>%
  complete(year = full_seq(2000:2010, 1)) %>%
  fill(value)
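For anyone who would rather stay with data.table, as in the attempt from the question, here is a rough sketch of the same idea. The attempt left year empty because `.SD[match(2000:2010, year)]` selects whole rows by index, so a year without a match becomes a row of NA in every column, year included. The sketch below builds the year column explicitly and only looks up value. The name `all_years` and the hard-coded range 2000:2010 are my own assumptions taken from the question, and `nafill()` needs a reasonably recent data.table (1.12.4 or later, as far as I know).

```r
library(data.table)

minimal <- data.frame(
  firm  = c("A","A","A","B","B","B","A","A","A","B","B","B"),
  type  = c("X","X","X","X","X","X","Y","Y","Y","Y","Y","Y"),
  year  = c(2000,2004,2007,2010,2008,2001,2002,2003,2007,2000,2001,2008),
  value = c(1,3,7,9,9,2,3,3,7,5,9,15)
)

# Assumed year range, taken from the question (hypothetical helper name)
all_years <- 2000:2010

dt <- as.data.table(minimal)

# For each firm/type group, emit every year in the range and look up the
# observed value; years without an observation get NA in `value` only.
filled <- dt[, .(year  = all_years,
                 value = value[match(all_years, year)]),
             by = .(firm, type)]

# Carry the last observed value forward within each group.
filled[, value := nafill(value, type = "locf"), by = .(firm, type)]

print(filled[firm == "A" & type == "X"])
```

Two things differ from the desired output shown in the question, and both also apply to the complete/fill version: the rows come out sorted by year within each group rather than appended at the end, and a group whose first observation is later than 2000 (for example firm A, type Y, which starts in 2002) keeps NA for the leading years because there is no earlier value to carry forward.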