Categorical variables into multiple columns
Problem description
I have a DataFrame with a categorical variable, Segment:
ID Segment Var
1 AAA 1
2 BBB 0
3 BBB 1
4 AAA 1
5 CCC 1
6 AAA 0
7 AAA 1
8 AAA 0
9 BBB 0
10 CCC 0
I would like to spread the Segment column into three columns, one per category, each holding the Var value for that row, like this:
ID SegmentAAA SegmentBBB SegmentCCC
1 1 null null
2 null 0 null
3 null 1 null
4 1 null null
5 null null 1
6 0 null null
7 1 null null
8 0 null null
9 null 0 null
10 null null 0
Could you please help me with that? Thank you very much.
Recommended answer
Use:
df.set_index(['ID','Segment'])['Var']\
.unstack()\
.add_prefix('Segment')\
.rename_axis([None], axis=1)\
.reset_index()
Output:
ID SegmentAAA SegmentBBB SegmentCCC
0 1 1.0 NaN NaN
1 2 NaN 0.0 NaN
2 3 NaN 1.0 NaN
3 4 1.0 NaN NaN
4 5 NaN NaN 1.0
5 6 0.0 NaN NaN
6 7 1.0 NaN NaN
7 8 0.0 NaN NaN
8 9 NaN 0.0 NaN
9 10 NaN NaN 0.0
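For anyone who wants to reproduce the output above, here is a minimal self-contained sketch. The sample data is copied from the question. The names `wide`, `seg_cols`, and `wide_int` are only illustrative, and the final cast to the nullable `Int64` dtype is an optional extra that is not part of the original answer. It assumes that integer values with `<NA>` are preferred over floats with `NaN`.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": range(1, 11),
    "Segment": ["AAA", "BBB", "BBB", "AAA", "CCC",
                "AAA", "AAA", "AAA", "BBB", "CCC"],
    "Var": [1, 0, 1, 1, 1, 0, 1, 0, 0, 0],
})

wide = (
    df.set_index(["ID", "Segment"])["Var"]  # Series indexed by (ID, Segment)
      .unstack()                            # Segment levels become columns
      .add_prefix("Segment")                # AAA -> SegmentAAA, and so on
      .rename_axis(None, axis=1)            # drop the "Segment" columns name
      .reset_index()                        # turn ID back into a column
)

# Optional (an assumption, not in the original answer): the missing cells
# force float columns, so cast to the nullable integer dtype to show 1/0/<NA>.
seg_cols = [c for c in wide.columns if c.startswith("Segment")]
wide_int = wide.astype({c: "Int64" for c in seg_cols})

print(wide_int)
```

The values come out as floats in the original output because `NaN` cannot be stored in a plain integer column. Also note that `unstack()` requires each (ID, Segment) pair to be unique, which holds for this data.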
Option 2:
pd.crosstab(df.ID,df.Segment,df.Var,aggfunc='first')
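On its own, the `crosstab` call leaves ID as the index and names the columns AAA, BBB, and CCC without the prefix. The sketch below is one way to bring it to the layout requested in the question, reusing the same `add_prefix`, `rename_axis`, and `reset_index` steps as the first approach. The variable name `wide` is only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": range(1, 11),
    "Segment": ["AAA", "BBB", "BBB", "AAA", "CCC",
                "AAA", "AAA", "AAA", "BBB", "CCC"],
    "Var": [1, 0, 1, 1, 1, 0, 1, 0, 0, 0],
})

wide = (
    pd.crosstab(df["ID"], df["Segment"], values=df["Var"], aggfunc="first")
      .add_prefix("Segment")      # AAA -> SegmentAAA, and so on
      .rename_axis(None, axis=1)  # drop the "Segment" columns name
      .reset_index()              # turn ID back into a column
)

print(wide)
```

Because `aggfunc="first"` keeps only the first value per (ID, Segment) pair, this variant does not raise on duplicate pairs the way `unstack()` does, but it silently discards the later values.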