Groupby to create new columns


Question

From a dataframe, I want to build a new dataframe in which each repeated index value gets its additional values in new columns, but I don't know in advance how many columns will be needed:

pd.DataFrame([["John","guitar"],["Michael","football"],["Andrew","running"],["John","dancing"],["Andrew","cars"]])

I want:

pd.DataFrame([["John","guitar","dancing"],["Michael","football",None],["Andrew","running","cars"]])

without knowing in advance how many columns I need to create.

Recommended answer

Use GroupBy.cumcount to get a per-group counter, then reshape with unstack:

import pandas as pd

df1 = pd.DataFrame([["John","guitar"],
                    ["Michael","football"],
                    ["Andrew","running"],
                    ["John","dancing"],
                    ["Andrew","cars"]], columns=['a','b'])
print(df1)

         a         b
0     John    guitar
1  Michael  football
2   Andrew   running
3     John   dancing
4   Andrew      cars


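# Number the rows within each group of 'a', use (a, counter) as the index,
# then move the counter level into the columns with unstack.
df = (df1.set_index(['a', df1.groupby('a').cumcount()])['b']
         .unstack()
         .rename_axis(-1)                 # name the index -1 so it becomes column -1
         .reset_index()
         .rename(columns=lambda x: x+1))  # shift labels -1, 0, 1 to 0, 1, 2
print(df)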

         0         1        2
0   Andrew   running     cars
1     John    guitar  dancing
2  Michael  football      NaN
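
To make the cumcount step easier to follow, here is a minimal sketch of the intermediate values, using the same sample data as above. The variable names counter and wide are illustrative and not part of the original answer.

```python
import pandas as pd

df1 = pd.DataFrame([["John", "guitar"],
                    ["Michael", "football"],
                    ["Andrew", "running"],
                    ["John", "dancing"],
                    ["Andrew", "cars"]], columns=["a", "b"])

# Position of each row within its group of "a":
# the first John is 0, the second John is 1, and so on.
counter = df1.groupby("a").cumcount()
print(counter.tolist())  # [0, 0, 0, 1, 1]

# Index by (name, counter), then move the counter level into the columns.
# The number of columns follows from the largest group, so it does not
# have to be known in advance.
wide = df1.set_index(["a", counter])["b"].unstack()
print(wide.shape)  # (3, 2)
```

Names with fewer values than the largest group, such as Michael here, get a missing value in the remaining columns.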

Or aggregate each group into a list and build a new DataFrame with the constructor:

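# Collect the 'b' values of each group into a list, one list per name.
s = df1.groupby('a')['b'].agg(list)
# The constructor pads shorter lists with missing values.
df = pd.DataFrame(s.values.tolist(), index=s.index).reset_index()
print(df)

         a         0        1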
0   Andrew   running     cars
1     John    guitar  dancing
2  Michael  football     None
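
Both results above come back sorted by name, whereas the desired output in the question lists John, Michael, Andrew in order of first appearance. As a hedged variation that is not part of the original answer, passing sort=False to groupby should keep that order:

```python
import pandas as pd

df1 = pd.DataFrame([["John", "guitar"],
                    ["Michael", "football"],
                    ["Andrew", "running"],
                    ["John", "dancing"],
                    ["Andrew", "cars"]], columns=["a", "b"])

# sort=False keeps the groups in order of first appearance
# instead of sorting the group keys.
s = df1.groupby("a", sort=False)["b"].agg(list)

# Lists of unequal length are padded with missing values by the constructor.
df = pd.DataFrame(s.tolist(), index=s.index).reset_index()
print(df["a"].tolist())  # ['John', 'Michael', 'Andrew']
```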
