Is there an "ungroup by" operation opposite to .groupby in pandas?
Question
Suppose we take a pandas dataframe...
    name  age  family
0   john    1       1
1  jason   36       1
2   jane   32       1
3   jack   26       2
4  james   30       2
Then do a groupby():
...
group_df = df.groupby('family')
# pd.np was removed in pandas 2.0; the string 'mean' selects the built-in mean
group_df = group_df.aggregate({'name': name_join, 'age': 'mean'})
Then do some aggregate/summarize operation (in my example, my function name_join aggregates the names):
def name_join(list_names, concat='-'):
    return concat.join(list_names)
The grouped, summarized output is:
        age             name
family
1        23  john-jason-jane
2        28       jack-james
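For reference, the grouping step can be reproduced end to end with the minimal sketch below. The data frame is rebuilt by hand from the table in the question, and the string 'mean' stands in for pd.np.mean, which is an assumption about what the original call was meant to do on current pandas versions.

```python
import pandas as pd

# Input data copied from the table in the question.
df = pd.DataFrame({
    "name": ["john", "jason", "jane", "jack", "james"],
    "age": [1, 36, 32, 26, 30],
    "family": [1, 1, 1, 2, 2],
})

def name_join(list_names, concat="-"):
    # Join all names of one group into a single string.
    return concat.join(list_names)

# One row per family: names joined with "-", ages averaged.
group_df = df.groupby("family").aggregate({"name": name_join, "age": "mean"})
print(group_df)
```

Note that the mean is returned as a float, so the printed ages may appear as 23.0 and 28.0 rather than the integers shown above.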
Question:
Is there a quick, efficient way to get to the following from the aggregated table?
    name  age  family
0   john   23       1
1  jason   23       1
2   jane   23       1
3   jack   28       2
4  james   28       2
(Note: the age column values are just examples; I don't care about the information I lose by averaging in this specific example.)
The approach I had in mind does not look very efficient:
- create an empty dataframe
- for every row in group_df, separate the names and return a dataframe with as many rows as there are names in that row
- append the output to the empty dataframe
Recommended answer
The rough equivalent is .reset_index(), but it may not be helpful to think of it as the "opposite" of groupby().
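A small sketch of what reset_index() does on its own, with the aggregated table rebuilt by hand from the output in the question: it moves the 'family' index back into an ordinary column, but the number of rows stays the same, which is why it is only a rough equivalent.

```python
import pandas as pd

# Aggregated table, rebuilt by hand from the output shown in the question.
agg = pd.DataFrame(
    {"age": [23, 28], "name": ["john-jason-jane", "jack-james"]},
    index=pd.Index([1, 2], name="family"),
)

# 'family' becomes a regular column; the two aggregated rows are not split.
flat = agg.reset_index()
print(flat)
```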
You are splitting a string into pieces while maintaining each piece's association with 'family'. This old answer of mine does the job.
Just set 'family' as the index column first, refer to the link above, and then call reset_index() at the end to get your desired result.
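The linked answer is not reproduced here, so the following is a minimal sketch of the same idea rather than that answer's exact code. It assumes pandas 0.25 or newer, where DataFrame.explode is available, and rebuilds the aggregated table by hand with 'family' already set as the index.

```python
import pandas as pd

# Aggregated table with 'family' as the index, as in the question's output.
agg = pd.DataFrame(
    {"age": [23, 28], "name": ["john-jason-jane", "jack-james"]},
    index=pd.Index([1, 2], name="family"),
)

ungrouped = (
    # Turn each joined string into a list of names.
    agg.assign(name=agg["name"].str.split("-"))
    # Give every list element its own row; the 'family' index is repeated.
    .explode("name")
    # Move 'family' back into a column and restore the original column order.
    .reset_index()[["name", "age", "family"]]
)
print(ungrouped)
```

Because the index is repeated for every piece during explode, each name keeps its association with 'family', and the final reset_index() produces a fresh 0..n-1 row index.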