Is there an "ungroup by" operation opposite to .groupby in pandas?


Problem description

Suppose we take a pandas DataFrame:

    name  age  family
0   john    1       1
1  jason   36       1
2   jane   32       1
3   jack   26       2
4  james   30       2

Then do a groupby():

group_df = df.groupby('family')
# 'mean' replaces pd.np.mean, since the pd.np alias was removed in pandas 2.0
group_df = group_df.aggregate({'name': name_join, 'age': 'mean'})

The aggregate/summarize operation here uses my own function name_join, which joins the names in each group:

def name_join(list_names, concat='-'):
    return concat.join(list_names)

The grouped summary output looks like this:

        age             name
family                      
1        23  john-jason-jane
2        28       jack-james

Question:

Is there a quick, efficient way to get the following from the aggregated table?

    name  age  family
0   john   23       1
1  jason   23       1
2   jane   23       1
3   jack   28       2
4  james   28       2

(Note: the age column values are just examples; I don't care about the information lost by averaging in this specific example.)

The approach I thought of does not look very efficient:

  1. create an empty dataframe
  2. for every row in group_df, split the names apart
  3. return a dataframe with as many rows as there are names in that row
  4. append the output to the empty dataframe
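For reference, the four steps above could be sketched roughly as follows. This is only an illustration of the row-by-row idea: the aggregated table is rebuilt by hand, the names pieces and result are made up for the example, and pd.concat stands in for appending to an empty dataframe because DataFrame.append is no longer available in pandas 2.0 and later.

```python
import pandas as pd

# The aggregated table from the question, rebuilt by hand for the example.
group_df = pd.DataFrame(
    {'age': [23, 28], 'name': ['john-jason-jane', 'jack-james']},
    index=pd.Index([1, 2], name='family'),
)

pieces = []  # hypothetical name: one small dataframe per family
for family, row in group_df.iterrows():
    names = row['name'].split('-')  # step 2: split the joined names apart
    # step 3: one row per name; the scalar age and family are broadcast
    pieces.append(pd.DataFrame({'name': names, 'age': row['age'], 'family': family}))

# step 4: combine everything into a single dataframe with a fresh index
result = pd.concat(pieces, ignore_index=True)
print(result)
```

This works, but it builds one small dataframe per group in a Python loop, which is why it tends to be slow on large tables.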

Recommended answer

The rough equivalent is .reset_index(), but it may not be helpful to think of it as the "opposite" of groupby().
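A small sketch of what .reset_index() does to the aggregated table, and why it is not a true inverse. The data is the example from the question; '-'.join is used here in place of the question's name_join helper, and flat is just an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['john', 'jason', 'jane', 'jack', 'james'],
    'age': [1, 36, 32, 26, 30],
    'family': [1, 1, 1, 2, 2],
})

# Aggregate as in the question: join the names with '-', average the ages.
group_df = df.groupby('family').agg({'name': '-'.join, 'age': 'mean'})

# reset_index() moves 'family' from the index back into an ordinary column,
# but the result still has one row per family, not one row per person.
flat = group_df.reset_index()
print(flat)
```

The grouping key comes back as a column, but the rows that were collapsed by the aggregation are not restored.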

You are splitting a string into pieces and maintaining each piece's association with 'family'. This old answer of mine does the job.

Just set 'family' as the index column first, refer to the link above, and then call reset_index() at the end to get your desired result.
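The linked answer is not reproduced here, so the following is only one possible sketch of the same idea, not necessarily the method that answer uses. It assumes pandas 0.25 or later, where DataFrame.explode is available, and rebuilds the aggregated table by hand with 'family' already set as the index; ungrouped is an illustrative name.

```python
import pandas as pd

# The aggregated table from the question, with 'family' as the index.
group_df = pd.DataFrame(
    {'age': [23, 28], 'name': ['john-jason-jane', 'jack-james']},
    index=pd.Index([1, 2], name='family'),
)

ungrouped = (
    group_df
    .assign(name=group_df['name'].str.split('-'))  # each cell becomes a list of names
    .explode('name')        # one row per name; the family index value repeats
    .reset_index()          # 'family' becomes a column again, with a fresh 0..n-1 index
    [['name', 'age', 'family']]  # column order used in the question
)
print(ungrouped)
```

Because 'family' is the index while the names are split apart, every piece keeps its family label without any explicit loop, and the final reset_index() produces the flat table asked for in the question.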
