R: inserting rows for missing observations in sales data
Question
This might be a duplicate. I tried searching for a solution but couldn't come up with one mostly because I don't really know how to frame my question. So I will include a working example:
Imagine I have this df:
df <- data.frame(
  Product = c("A", "A", "A", "B", "B", "C", "C", "C", "C", "C"),
  Year    = c(2014, 2017, 2018, 2017, 2018, 2013, 2014, 2016, 2017, 2018),
  Sales   = c(4, 2, 3, 5, 1, 3, 3, 4, 7, 5)
)
What I want to do is: in the range 2013:2019, add a row for each product for each year even though the product was not sold in that year. So my desired output would be like:
Product Year Sales
A 2013 0
A 2014 4
A 2015 0
A 2016 0
A 2017 2
A 2018 3
A 2019 0
Thanks for your help.
Recommended answer
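We can use tidyr::complete, which adds a row for every combination of Product and Year that is missing from the data and fills Sales with 0 in those rows:
tidyr::complete(df, Product, Year = seq(min(Year), max(Year)), fill = list(Sales = 0))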
# Product Year Sales
# <fct> <dbl> <dbl>
# 1 A 2013 0
# 2 A 2014 4
# 3 A 2015 0
# 4 A 2016 0
# 5 A 2017 2
# 6 A 2018 3
# 7 B 2013 0
# 8 B 2014 0
# 9 B 2015 0
#....
The call above builds the range from min(Year) to max(Year), which is 2013 to 2018 for this data, so 2019 is not included. If the range has to be fixed (2013:2019) irrespective of the years in the data, we can specify it explicitly.
tidyr::complete(df, Product, Year = 2013:2019, fill = list(Sales = 0))
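For anyone who cannot install tidyr, the same result can be sketched in base R. This is not how tidyr::complete is implemented internally; it is just an equivalent approach under the assumption that the fixed range 2013:2019 is wanted: build every Product and Year combination with expand.grid, left join the observed sales with merge, then replace the resulting NA values with 0. The names grid and full are made up for this illustration.

```r
# Example data from the question
df <- data.frame(
  Product = c("A", "A", "A", "B", "B", "C", "C", "C", "C", "C"),
  Year    = c(2014, 2017, 2018, 2017, 2018, 2013, 2014, 2016, 2017, 2018),
  Sales   = c(4, 2, 3, 5, 1, 3, 3, 4, 7, 5)
)

# Every Product x Year combination in the fixed range
# ("grid" is a hypothetical helper name)
grid <- expand.grid(
  Product = unique(df$Product),
  Year    = 2013:2019,
  stringsAsFactors = FALSE
)

# Left join: keep all rows of grid, attach Sales where a match exists.
# Combinations with no sale get NA in Sales.
full <- merge(grid, df, by = c("Product", "Year"), all.x = TRUE)

# Treat "no observation" as zero sales, then sort for readability
full$Sales[is.na(full$Sales)] <- 0
full <- full[order(full$Product, full$Year), ]
rownames(full) <- NULL

print(full)
```

The result has 7 rows per product (21 in total), and the rows for product A match the desired output in the question.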