R: inserting rows for missing observations in sales data


Problem description

This might be a duplicate. I tried searching for a solution but couldn't find one, mostly because I don't really know how to frame my question. So I will include a working example:

Imagine I have this df:

df <- data.frame(
  Product = c("A", "A", "A", "B", "B", "C", "C", "C", "C", "C"),
  Year    = c(2014, 2017, 2018, 2017, 2018, 2013, 2014, 2016, 2017, 2018),
  Sales   = c(4, 2, 3, 5, 1, 3, 3, 4, 7, 5)
)

What I want to do is: in the range 2013:2019, add a row for each product for each year, even if the product was not sold in that year. So my desired output for product A would look like:

Product   Year   Sales
    A     2013       0
    A     2014       4
    A     2015       0
    A     2016       0
    A     2017       2
    A     2018       3
    A     2019       0

Thanks for your help.

Recommended answer

We can use tidyr::complete. Here the year range is taken from the earliest and latest year found in the data:

tidyr::complete(df, Product, Year = seq(min(Year), max(Year)), fill = list(Sales = 0))

#  Product  Year Sales
#  <fct>   <dbl> <dbl>
# 1 A        2013     0
# 2 A        2014     4
# 3 A        2015     0
# 4 A        2016     0
# 5 A        2017     2
# 6 A        2018     3
# 7 B        2013     0
# 8 B        2014     0
# 9 B        2015     0
#....

If the range has to be fixed (2013:2019) irrespective of the years in the data, we can specify it explicitly:

tidyr::complete(df, Product, Year = 2013:2019, fill = list(Sales = 0))
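For readers who cannot install tidyr, the same fixed-range result can be sketched in base R. This is an alternative approach, not part of the answer above: it builds every Product/Year combination with expand.grid, left-joins the observed sales with merge, and replaces the resulting NA values with 0. The names grid and full are made up for this illustration.

```r
# Sample data from the question
df <- data.frame(
  Product = c("A", "A", "A", "B", "B", "C", "C", "C", "C", "C"),
  Year    = c(2014, 2017, 2018, 2017, 2018, 2013, 2014, 2016, 2017, 2018),
  Sales   = c(4, 2, 3, 5, 1, 3, 3, 4, 7, 5)
)

# Every Product/Year combination in the fixed range (hypothetical helper name)
grid <- expand.grid(
  Product = unique(df$Product),
  Year    = 2013:2019,
  stringsAsFactors = FALSE
)

# Left join: keep every row of the grid, attach Sales where a match exists
full <- merge(grid, df, by = c("Product", "Year"), all.x = TRUE)

# Combinations with no recorded sale come back as NA; treat them as 0
full$Sales[is.na(full$Sales)] <- 0

# Order by product, then year, to match the layout of the desired output
full <- full[order(full$Product, full$Year), ]
```

With three products and seven years, full should contain 21 rows, and the total of Sales should be unchanged from the original df.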
