Pandas merge on aggregated columns

Question

Let's say I create a DataFrame:

import pandas as pd
df = pd.DataFrame({"a": [1,2,3,13,15], "b": [4,5,6,6,6], "c": ["wish", "you","were", "here", "here"]})

It looks like this:

    a   b   c
0   1   4   wish
1   2   5   you
2   3   6   were
3   13  6   here
4   15  6   here

... and then group and aggregate by a couple columns ...

gb = df.groupby(['b','c']).agg({"a": lambda x: x.nunique()})

This produces the following result:

            a
b   c   
4   wish    1
5   you     1
6   here    2
    were    1

Is it possible to merge df with the newly aggregated table gb such that I create a new column in df, containing the corresponding values from gb? Like this:

    a   b   c      nc
0   1   4   wish    1
1   2   5   you     1
2   3   6   were    1
3   13  6   here    2
4   15  6   here    2

I tried doing the simplest thing:

df.merge(gb, on=['b','c'])

But this results in an error:

KeyError: 'b'

Which makes sense because the grouped table has a Multi-index and b is not a column. So my question is two-fold:

  1. Can I transform the multi-index of the gb DataFrame back into columns (so that it has the b and c column)?
  2. Can I merge df with gb on the column names?

Recommended answer

Whenever you want to add an aggregated column from a groupby operation back to the df, you should use transform. It produces a Series whose index is aligned with your original df:

In [4]:

df['nc'] = df.groupby(['b','c'])['a'].transform(pd.Series.nunique)
df
Out[4]:
    a  b     c  nc
0   1  4  wish   1
1   2  5   you   1
2   3  6  were   1
3  13  6  here   2
4  15  6  here   2
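
To see why the assignment lines up, it may help to compare the shapes of the two results. The following is a minimal sketch, assuming the same df as in the question; the names grouped, per_group and per_row are illustrative and do not appear in the original answer.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 13, 15],
                   "b": [4, 5, 6, 6, 6],
                   "c": ["wish", "you", "were", "here", "here"]})

grouped = df.groupby(["b", "c"])["a"]

# agg collapses each group to a single value, indexed by the group keys (b, c)
per_group = grouped.agg("nunique")

# transform broadcasts each group's value back to every row of that group,
# so the result shares df's index and can be assigned as a new column
per_row = grouped.transform("nunique")

print(len(per_group), len(per_row))
print(per_row.index.equals(df.index))
```

The aggregated result has one entry per (b, c) group, while the transformed result has one entry per original row.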

There is no need to reset the index or perform an additional merge.
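
For completeness, the two questions as asked can also be answered with a merge. This is a minimal sketch, assuming the same df as in the question; the name gb_flat and the rename of the aggregated column to nc are illustrative choices, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 13, 15],
                   "b": [4, 5, 6, 6, 6],
                   "c": ["wish", "you", "were", "here", "here"]})

gb = df.groupby(["b", "c"]).agg({"a": "nunique"})

# Question 1: reset_index() turns the (b, c) index levels back into columns.
# Renaming a -> nc first avoids a name clash with df's own "a" column.
gb_flat = gb.rename(columns={"a": "nc"}).reset_index()

# Question 2: with b and c as ordinary columns, merge on their names.
# how="left" keeps every row of df, in df's original order.
merged = df.merge(gb_flat, on=["b", "c"], how="left")
print(merged)
```

This route gives the same nc column as the transform approach, at the cost of an extra intermediate table.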
