Add rows to grouped data with dplyr?

Views: 120 | Posted: 2017/3/25 23:21:19 | Tags: r, dataframe, dplyr


Problem description

My data is a data.frame like this sample:
data <- 
structure(list(Article = structure(c(1L, 1L, 3L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 2L, 1L
), .Label = c("10004", "10006", "10007"), class = "factor"), 
Demand = c(26L, 780L, 2L, 181L, 228L, 214L, 219L, 291L, 104L, 
72L, 155L, 237L, 182L, 148L, 52L, 227L, 2L, 355L, 2L, 432L, 
1L, 156L), Week = c("2013-W01", "2013-W01", "2013-W01", "2013-W01", 
"2013-W01", "2013-W02", "2013-W02", "2013-W02", "2013-W02", 
"2013-W02", "2013-W03", "2013-W03", "2013-W03", "2013-W03", 
"2013-W03", "2013-W04", "2013-W04", "2013-W04", "2013-W04", 
"2013-W04", "2013-W04", "2013-W04")), .Names = c("Article", 
"Demand", "Week"), class = "data.frame", row.names = c(NA, -22L))
I would like to summarize the demand column by week and article. To do this, I use: 
library(dplyr)
WeekSums <- 
  data %>%
   group_by(Article, Week) %>%
   summarize(
    WeekDemand = sum(Demand)
   )
But because some articles were not sold in certain weeks, the number of rows per article differs (only weeks with sales are shown in the WeekSums dataframe). How could I adjust my data so that each article has the same number of rows (one for each week), including weeks with 0 demand?


The output should then look like this:
   Article     Week WeekDemand
1    10004 2013-W01       1215
2    10004 2013-W02        900
3    10004 2013-W03        774
4    10004 2013-W04       1170
5    10006 2013-W01          0
6    10006 2013-W02          0
7    10006 2013-W03          0
8    10006 2013-W04          5
9    10007 2013-W01          2
10   10007 2013-W02          0
11   10007 2013-W03          0
12   10007 2013-W04          0
I tried 
WeekSums %>%
  group_by(Article) %>%
  if(n()< 4) rep(rbind(c(Article,NA,NA)), 4 - n() )
but this doesn’t work. In my original approach, I resolved this problem by merging a dataframe of week numbers 1-4 with my rawdata file for each article. That way, I got 4 weeks (rows) per article, but the implementation with a for loop is very inefficient and so I’m trying to do the same with dplyr (or any other more efficient package/function). Any suggestions would be much appreciated!
Solution
Without dplyr it can be done like this:
> as.data.frame(xtabs(Demand ~ Week + Article, data))
       Week Article Freq
1  2013-W01   10004 1215
2  2013-W02   10004  900
3  2013-W03   10004  774
4  2013-W04   10004 1170
5  2013-W01   10006    0
6  2013-W02   10006    0
7  2013-W03   10006    0
8  2013-W04   10006    5
9  2013-W01   10007    2
10 2013-W02   10007    0
11 2013-W03   10007    0
12 2013-W04   10007    0
and this can be rewritten as a dplyr pipeline like this:
data %>% xtabs(formula = Demand ~ Week + Article) %>% as.data.frame()
The as.data.frame() at the end could be omitted if a wide form solution were desired.
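For reference, the variations mentioned above can be sketched as follows. This is a minimal sketch rather than part of the original answer. It rebuilds the sample data compactly, uses the `responseName` argument of `as.data.frame()` so the sum column is called `WeekDemand` instead of `Freq`, and keeps the bare `xtabs()` result as the wide form. The last step is a swapped-in alternative, not the answer's method: it assumes a dplyr version that supports `group_by(.drop = FALSE)` and `summarise(.groups = )` (roughly 1.0.0 or later), and it relies on `Week` being converted to a factor so that empty Article/Week combinations are kept. The object names `week_sums_xtabs`, `week_by_article` and `week_sums_dplyr` are made up for illustration.

```r
library(dplyr)

# Sample data from the question, rebuilt compactly (same 22 rows).
data <- data.frame(
  Article = factor(
    c("10004", "10006", "10007")[c(1, 1, 3, 1, 1, 1, 1, 1, 1, 1, 1,
                                   1, 1, 1, 1, 1, 2, 1, 2, 1, 2, 1)],
    levels = c("10004", "10006", "10007")
  ),
  Demand = c(26L, 780L, 2L, 181L, 228L, 214L, 219L, 291L, 104L, 72L, 155L,
             237L, 182L, 148L, 52L, 227L, 2L, 355L, 2L, 432L, 1L, 156L),
  Week = rep(c("2013-W01", "2013-W02", "2013-W03", "2013-W04"),
             times = c(5, 5, 5, 7)),
  stringsAsFactors = FALSE
)

# Long form via xtabs: responseName renames the default "Freq" column,
# and the column subset puts Article first, as in the requested output.
week_sums_xtabs <- as.data.frame(
  xtabs(Demand ~ Week + Article, data),
  responseName = "WeekDemand"
)[, c("Article", "Week", "WeekDemand")]

# Wide form: leave out as.data.frame() to keep a Week x Article table.
week_by_article <- xtabs(Demand ~ Week + Article, data)

# dplyr-only alternative (assumption: dplyr >= 1.0.0). With both grouping
# columns as factors, .drop = FALSE keeps empty groups, and sum() over an
# empty group yields 0.
week_sums_dplyr <- data %>%
  mutate(Week = factor(Week)) %>%
  group_by(Article, Week, .drop = FALSE) %>%
  summarise(WeekDemand = sum(Demand), .groups = "drop")

print(week_sums_xtabs)
print(week_by_article)
print(week_sums_dplyr)
```

Both long-form results have one row per Article/Week combination (12 rows for the sample data). The xtabs route needs no extra packages, while the `.drop = FALSE` route stays inside a dplyr pipeline at the cost of converting `Week` to a factor.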
