Aggregate by multiple columns, sum one column and keep other columns? Create new column based on aggregated values?

Question

I have a data frame of sales. I need to aggregate it by two columns, ProductID and Day, and sum the Amount column within each group so that it shows the group total. I also want to keep any other columns whose values are the same across all rows of a group, in this case just Product. The last column, Store, won't be kept, since its values can vary within a group. However, I need to add a column UniqueStores that counts the number of unique stores in each group of the same ProductID and Day. For example, the first group, with ProductID = 1 and Day = Monday, has one unique store, "N", so its value would be 1.

I tried drafting the table here in text but I couldn't format it correctly, so here is an image of how it looks before aggregating:

I've tried aggregating with both group_by + summarise and df[, sum, by], but neither keeps the variables that aren't given as grouping columns. Is there a workaround that doesn't require manually listing every column that should remain?

Thanks in advance, and I hope I made myself clear.

Input:

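# Sample sales data: eight rows, one per sale (all columns have the same length)
df <- data.frame(
  ProductID = c(1, 1, 1, 1, 2, 2, 2, 2),
  Day = c("Monday", "Monday", "Tuesday", "Tuesday", "Wednesday", "Wednesday", "Friday", "Friday"),
  Amount = c(5, 5, 3, 7, 6, 9, 5, 2),
  Product = c("Food", "Food", "Food", "Food", "Toys", "Toys", "Toys", "Toys"),
  Store = c("N", "N", "W", "N", "S", "W", "S", "S")
)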

Recommended answer

With data.table:

library(data.table)

setDT(df)[, .(Amount = sum(Amount, na.rm = TRUE),
              UniqueStores = uniqueN(Store, na.rm = TRUE)), 
          by = .(ProductID, Day, Product)
          ]

Output:

   ProductID       Day Product Amount UniqueStores
1:         1    Monday    Food     10            1
2:         1   Tuesday    Food     10            2
3:         2 Wednesday    Toys     15            2
4:         2    Friday    Toys      7            1
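
Since the question mentions trying group_by + summarise, here is a rough dplyr sketch of the same idea, not part of the original answer. It assumes the input has eight rows per column as shown in the expected output, and it re-declares df so the snippet runs on its own. The key point is the same as in the data.table version: any column that is constant within a group (here Product) can be added to the grouping columns so that it survives the aggregation.

```r
library(dplyr)

# Assumed input: eight sales rows, matching the output shown above
df <- data.frame(
  ProductID = c(1, 1, 1, 1, 2, 2, 2, 2),
  Day = c("Monday", "Monday", "Tuesday", "Tuesday", "Wednesday", "Wednesday", "Friday", "Friday"),
  Amount = c(5, 5, 3, 7, 6, 9, 5, 2),
  Product = c("Food", "Food", "Food", "Food", "Toys", "Toys", "Toys", "Toys"),
  Store = c("N", "N", "W", "N", "S", "W", "S", "S")
)

result <- df %>%
  # Product is constant within each ProductID/Day group, so grouping by it
  # keeps the column without changing the groups
  group_by(ProductID, Day, Product) %>%
  summarise(
    Amount = sum(Amount, na.rm = TRUE),              # total per group
    UniqueStores = n_distinct(Store, na.rm = TRUE),  # distinct stores per group
    .groups = "drop"                                 # return an ungrouped tibble
  )

print(result)
```

Note that summarise sorts the result by the grouping columns, so the row order may differ from the data.table output (Day is sorted alphabetically within each ProductID), while the totals and counts per group should be the same.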
