How to implode (reverse of pandas explode) based on a column
Problem description
I have a dataframe df like the one below:
   NETWORK  config_id APPLICABLE_DAYS  Case  Delivery
0  Grocery       5399             SUN    10         1
1  Grocery       5399             MON    20         2
2  Grocery       5399             TUE    30         3
3  Grocery       5399             WED    40         4
I want to implode (combine APPLICABLE_DAYS from multiple rows into a single row, as shown below) and get the average Case and Delivery per config_id:
   NETWORK  config_id  APPLICABLE_DAYS  Avg_Cases  Avg_Delivery
0  Grocery       5399  SUN,MON,TUE,WED         90            10
Using groupby on NETWORK and config_id, I can get the average Case and Delivery like this:
df.groupby(['NETWORK','config_id']).agg({'Case':'mean','Delivery':'mean'})
But how can I also join APPLICABLE_DAYS while performing this aggregation?
Recommended answer
If you want the "opposite" of explode, that means collecting the values back into a list, as in Solution #1. You can also join them into a single string, as in Solution #2:
Solution #1: Use lambda x: x.tolist() for the 'APPLICABLE_DAYS' column within your .agg groupby function:
df = (df.groupby(['NETWORK','config_id'])
        .agg({'APPLICABLE_DAYS': lambda x: x.tolist(),'Case':'mean','Delivery':'mean'})
        .rename({'Case' : 'Avg_Cases','Delivery' : 'Avg_Delivery'},axis=1)
        .reset_index())
df
Out[1]:
   NETWORK  config_id       APPLICABLE_DAYS  Avg_Cases  Avg_Delivery
0  Grocery       5399  [SUN, MON, TUE, WED]         25           2.5
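To illustrate why the list form is the true reverse of explode, here is a minimal, self-contained sketch. It rebuilds the sample data from the question, collects the days into a list, and then calls explode to get the original rows back. The names imploded and restored are only illustrative and do not appear in the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "NETWORK": ["Grocery"] * 4,
    "config_id": [5399] * 4,
    "APPLICABLE_DAYS": ["SUN", "MON", "TUE", "WED"],
    "Case": [10, 20, 30, 40],
    "Delivery": [1, 2, 3, 4],
})

# "Implode": collect the days of each group into one list per row.
imploded = (df.groupby(["NETWORK", "config_id"])
              .agg({"APPLICABLE_DAYS": lambda x: x.tolist()})
              .reset_index())

# explode() expands the list column back to one row per day.
restored = imploded.explode("APPLICABLE_DAYS").reset_index(drop=True)

print(imploded)
print(restored)
```

Because the column holds real Python lists, explode can undo the grouping; a comma-joined string would first have to be split.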
Solution #2: Use lambda x: ",".join(x) for the 'APPLICABLE_DAYS' column within your .agg groupby function:
df = (df.groupby(['NETWORK','config_id'])
        .agg({'APPLICABLE_DAYS': lambda x: ",".join(x),'Case':'mean','Delivery':'mean'})
        .rename({'Case' : 'Avg_Cases','Delivery' : 'Avg_Delivery'},axis=1)
        .reset_index())
df
Out[1]:
   NETWORK  config_id  APPLICABLE_DAYS  Avg_Cases  Avg_Delivery
0  Grocery       5399  SUN,MON,TUE,WED         25           2.5
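As an alternative that is not part of the original answer, the same result can probably be written with pandas named aggregation, which sets the output column names directly and makes the separate rename step unnecessary. This is a sketch that assumes a pandas version with named aggregation (0.25 or later) and that APPLICABLE_DAYS contains only strings; the variable name result is illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "NETWORK": ["Grocery"] * 4,
    "config_id": [5399] * 4,
    "APPLICABLE_DAYS": ["SUN", "MON", "TUE", "WED"],
    "Case": [10, 20, 30, 40],
    "Delivery": [1, 2, 3, 4],
})

# Named aggregation: output_name=(input_column, aggregation).
# ",".join is called on each group's Series and concatenates its values.
result = (df.groupby(["NETWORK", "config_id"])
            .agg(APPLICABLE_DAYS=("APPLICABLE_DAYS", ",".join),
                 Avg_Cases=("Case", "mean"),
                 Avg_Delivery=("Delivery", "mean"))
            .reset_index())

print(result)
```

Note that ",".join raises a TypeError if a group contains non-string values such as NaN, so missing days would need to be dropped or converted first.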
If you are looking for the sum, you can just change mean to sum for the Case and Delivery columns.
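A minimal sketch of that sum variant, using the sample data from the question. The output names Total_Cases and Total_Delivery and the variable totals are assumptions chosen for illustration; the original answer does not name them.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "NETWORK": ["Grocery"] * 4,
    "config_id": [5399] * 4,
    "APPLICABLE_DAYS": ["SUN", "MON", "TUE", "WED"],
    "Case": [10, 20, 30, 40],
    "Delivery": [1, 2, 3, 4],
})

# Same pattern as Solution #2, with 'sum' in place of 'mean'.
totals = (df.groupby(["NETWORK", "config_id"])
            .agg({"APPLICABLE_DAYS": lambda x: ",".join(x),
                  "Case": "sum",
                  "Delivery": "sum"})
            .rename({"Case": "Total_Cases", "Delivery": "Total_Delivery"}, axis=1)
            .reset_index())

print(totals)
```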