Groupby to create new columns
Problem description
Starting from a dataframe, I want to build a new dataframe that adds a column each time a key (the first column) has already been seen, but I don't know in advance how many columns I will need:
pd.DataFrame([["John","guitar"],["Michael","football"],["Andrew","running"],["John","dancing"],["Andrew","cars"]])
I want:
pd.DataFrame([["John","guitar","dancing"],["Michael","football",None],["Andrew","running","cars"]])
without knowing at the start how many columns I should create.
Recommended answer
Use GroupBy.cumcount to get a counter within each group, then reshape with unstack:
import pandas as pd

df1 = pd.DataFrame([["John","guitar"],
                    ["Michael","football"],
                    ["Andrew","running"],
                    ["John","dancing"],
                    ["Andrew","cars"]], columns=['a','b'])
         a         b
0     John    guitar
1  Michael  football
2   Andrew   running
3     John   dancing
4   Andrew      cars
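# Index by (name, position within the name's group), then pivot the position into columns
df = (df1.set_index(['a', df1.groupby('a').cumcount()])['b']
         .unstack()
         # Name the index -1 so that, after reset_index, every column label is an integer
         .rename_axis(-1)
         .reset_index()
         # Shift the labels -1, 0, 1, ... to 0, 1, 2, ...
         .rename(columns=lambda x: x+1))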
print (df)
         0         1        2
0   Andrew   running     cars
1     John    guitar  dancing
2  Michael  football      NaN
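To make the reshape step less opaque, here is a minimal sketch of the intermediate values, using the same sample data as above. The variable names `counter` and `wide` are introduced only for illustration and are not part of the original answer.

```python
import pandas as pd

df1 = pd.DataFrame(
    [["John", "guitar"], ["Michael", "football"], ["Andrew", "running"],
     ["John", "dancing"], ["Andrew", "cars"]],
    columns=["a", "b"],
)

# Position of each row within its group of 'a':
# the first occurrence of a name gets 0, the second gets 1, and so on.
counter = df1.groupby("a").cumcount()
print(counter.tolist())  # [0, 0, 0, 1, 1]

# Each (name, counter) pair is unique, so unstack can turn the counter
# level into columns; names with fewer entries are padded with NaN.
wide = df1.set_index(["a", counter])["b"].unstack()
print(wide)
```

The number of resulting columns equals the size of the largest group, which is why it does not need to be known up front.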
Or aggregate each group into a list and build a new DataFrame with the constructor:
s = df1.groupby('a')['b'].agg(list)
df = pd.DataFrame(s.values.tolist(), index=s.index).reset_index()
print (df)
         a         0        1
0   Andrew   running     cars
1     John    guitar  dancing
2  Michael  football     None
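Both variants return the names in sorted order, whereas the desired output in the question lists them in order of first appearance. As a hedged variation on the list approach (not part of the original answer), passing `sort=False` to `groupby` should keep the original order; the variable name `out` is only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame(
    [["John", "guitar"], ["Michael", "football"], ["Andrew", "running"],
     ["John", "dancing"], ["Andrew", "cars"]],
    columns=["a", "b"],
)

# sort=False keeps groups in order of first appearance instead of sorting the keys.
s = df1.groupby("a", sort=False)["b"].agg(list)

# One row per name; shorter lists are padded with a missing value.
out = pd.DataFrame(s.tolist(), index=s.index).reset_index()
print(out)
```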