How to implode (reverse of pandas explode) based on a column


Question

I have a dataframe df like the one below:

  NETWORK       config_id       APPLICABLE_DAYS  Case    Delivery  
0   Grocery     5399            SUN               10       1        
1   Grocery     5399            MON               20       2       
2   Grocery     5399            TUE               30       3        
3   Grocery     5399            WED               40       4       

I want to implode it (combine APPLICABLE_DAYS from multiple rows into a single row, as shown below) and get the average Case and Delivery per config_id:

  NETWORK       config_id       APPLICABLE_DAYS      Avg_Cases    Avg_Delivery 
0   Grocery     5399            SUN,MON,TUE,WED         90           10

Using groupby on NETWORK and config_id, I can get the average Case and Delivery values like this:

df.groupby(['NETWORK','config_id']).agg({'Case':'mean','Delivery':'mean'})

But how can I also join APPLICABLE_DAYS while performing this aggregation?

Recommended answer

If you want the "opposite" of explode, that means collecting the values of each group into a list, as in Solution #1. You can also join them into a single string, as in Solution #2:

Solution #1: Use lambda x: x.tolist() for the 'APPLICABLE_DAYS' column within your groupby .agg call:

df = (df.groupby(['NETWORK','config_id'])
      .agg({'APPLICABLE_DAYS': lambda x: x.tolist(),'Case':'mean','Delivery':'mean'})
      .rename({'Case' : 'Avg_Cases','Delivery' : 'Avg_Delivery'},axis=1)
      .reset_index())
df
Out[1]: 
   NETWORK  config_id       APPLICABLE_DAYS  Avg_Cases  Avg_Delivery
0  Grocery       5399  [SUN, MON, TUE, WED]         25           2.5
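
To see why the list version counts as the reverse of explode, here is a minimal, self-contained sketch. It rebuilds the sample frame from the question and then round-trips the day column; the variable names imploded and round_trip are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "NETWORK": ["Grocery"] * 4,
    "config_id": [5399] * 4,
    "APPLICABLE_DAYS": ["SUN", "MON", "TUE", "WED"],
    "Case": [10, 20, 30, 40],
    "Delivery": [1, 2, 3, 4],
})

# Implode: collect the days of each (NETWORK, config_id) group into one list.
imploded = (df.groupby(["NETWORK", "config_id"])
              .agg({"APPLICABLE_DAYS": lambda x: x.tolist()})
              .reset_index())

# Explode again: one row per list element, matching the original day column.
round_trip = imploded.explode("APPLICABLE_DAYS").reset_index(drop=True)
print(round_trip)
```

Note that only the day column round-trips this way; once Case and Delivery have been averaged, the per-row values cannot be recovered.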


Solution #2: Use lambda x: ",".join(x) for the 'APPLICABLE_DAYS' column within your groupby .agg call:

df = (df.groupby(['NETWORK','config_id'])
      .agg({'APPLICABLE_DAYS': lambda x: ",".join(x),'Case':'mean','Delivery':'mean'})
      .rename({'Case' : 'Avg_Cases','Delivery' : 'Avg_Delivery'},axis=1)
      .reset_index())
df
Out[1]: 
   NETWORK  config_id       APPLICABLE_DAYS  Avg_Cases  Avg_Delivery
0  Grocery       5399       SUN,MON,TUE,WED         25           2.5
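
As an alternative that is not in the original answer, the join-and-average step could also be written with pandas named aggregation (available from pandas 0.25), which sets the output column names directly and avoids the separate rename. This is a sketch on the question's sample data, with result as an illustrative variable name.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "NETWORK": ["Grocery"] * 4,
    "config_id": [5399] * 4,
    "APPLICABLE_DAYS": ["SUN", "MON", "TUE", "WED"],
    "Case": [10, 20, 30, 40],
    "Delivery": [1, 2, 3, 4],
})

# Named aggregation: output_column=(input_column, aggregation function).
result = (df.groupby(["NETWORK", "config_id"])
            .agg(APPLICABLE_DAYS=("APPLICABLE_DAYS", ",".join),
                 Avg_Cases=("Case", "mean"),
                 Avg_Delivery=("Delivery", "mean"))
            .reset_index())
print(result)
```

The ",".join call assumes every value in APPLICABLE_DAYS is a string; missing or non-string values would need to be converted or dropped first.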

If you are looking for the sum instead, just change mean to sum for the Case and Delivery columns.
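
A minimal sketch of that sum variant on the question's sample data follows. The output names Total_Cases and Total_Delivery are assumptions chosen for illustration; the original answer does not name them.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "NETWORK": ["Grocery"] * 4,
    "config_id": [5399] * 4,
    "APPLICABLE_DAYS": ["SUN", "MON", "TUE", "WED"],
    "Case": [10, 20, 30, 40],
    "Delivery": [1, 2, 3, 4],
})

# Same aggregation as Solution #2, with 'sum' in place of 'mean'.
totals = (df.groupby(["NETWORK", "config_id"])
            .agg({"APPLICABLE_DAYS": lambda x: ",".join(x),
                  "Case": "sum",
                  "Delivery": "sum"})
            .rename({"Case": "Total_Cases", "Delivery": "Total_Delivery"}, axis=1)
            .reset_index())
print(totals)
```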
