Categorical variables into multiple columns


Problem description

I have a DataFrame with a categorical variable, Segment:

ID  Segment Var
1   AAA     1
2   BBB     0
3   BBB     1
4   AAA     1
5   CCC     1
6   AAA     0 
7   AAA     1
8   AAA     0
9   BBB     0
10  CCC     0

I would like to spread the Var values across three columns, one per Segment category, like this:

ID  SegmentAAA  SegmentBBB  SegmentCCC
1   1           null        null
2   null        0           null
3   null        1           null
4   1           null        null
5   null        null        1
6   0           null        null
7   1           null        null
8   0           null        null
9   null        0           null
10  null        null        0

Could you please help me with that? Thank you very much.

Recommended answer

Use:

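(
    df.set_index(['ID', 'Segment'])['Var']  # index by ID and Segment, keep Var as the values
      .unstack()                            # move each Segment value into its own column
      .add_prefix('Segment')                # AAA -> SegmentAAA, and so on
      .rename_axis(None, axis=1)            # drop the leftover 'Segment' label on the columns
      .reset_index()                        # turn ID back into an ordinary column
)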

Output:

   ID  SegmentAAA  SegmentBBB  SegmentCCC
0   1         1.0         NaN         NaN
1   2         NaN         0.0         NaN
2   3         NaN         1.0         NaN
3   4         1.0         NaN         NaN
4   5         NaN         NaN         1.0
5   6         0.0         NaN         NaN
6   7         1.0         NaN         NaN
7   8         0.0         NaN         NaN
8   9         NaN         0.0         NaN
9  10         NaN         NaN         0.0
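For anyone who wants to reproduce the output above, here is a minimal self-contained sketch. The sample DataFrame is rebuilt from the table in the question. The names `wide` and `value_cols` are just illustrative, and the final cast to the nullable `Int64` dtype is an optional extra that is not part of the original answer. Note that `unstack()` needs each (ID, Segment) pair to be unique, otherwise it raises a ValueError.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID": range(1, 11),
    "Segment": ["AAA", "BBB", "BBB", "AAA", "CCC",
                "AAA", "AAA", "AAA", "BBB", "CCC"],
    "Var": [1, 0, 1, 1, 1, 0, 1, 0, 0, 0],
})

# Same chain as in the answer: one column per Segment value
wide = (
    df.set_index(["ID", "Segment"])["Var"]
      .unstack()
      .add_prefix("Segment")
      .rename_axis(None, axis=1)
      .reset_index()
)

# Optional (an addition, not in the original answer): the missing cells
# force the value columns to float; the nullable Int64 dtype keeps the
# 0/1 values as integers and shows the gaps as <NA>.
value_cols = ["SegmentAAA", "SegmentBBB", "SegmentCCC"]
wide = wide.astype({col: "Int64" for col in value_cols})

print(wide)
```

Without the last step the values are shown as 1.0 / 0.0 / NaN, as in the output above, because NaN cannot be stored in a plain integer column.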

Option 2:

pd.crosstab(df.ID, df.Segment, values=df.Var, aggfunc='first')
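On its own, the crosstab call leaves ID as the index and names the columns AAA, BBB and CCC without the Segment prefix. The sketch below shows one way to bring it to the same shape as the first option, and also tries `DataFrame.pivot`, which is not mentioned in the original answer but should behave the same way when each ID appears only once. The variable names `wide` and `wide_pivot` are illustrative.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID": range(1, 11),
    "Segment": ["AAA", "BBB", "BBB", "AAA", "CCC",
                "AAA", "AAA", "AAA", "BBB", "CCC"],
    "Var": [1, 0, 1, 1, 1, 0, 1, 0, 0, 0],
})

# Option 2 from the answer, followed by the same clean-up steps
# used in option 1 (prefix, drop the columns label, restore ID).
wide = (
    pd.crosstab(df["ID"], df["Segment"], values=df["Var"], aggfunc="first")
      .add_prefix("Segment")
      .rename_axis(None, axis=1)
      .reset_index()
)

# Alternative (an assumption, not from the answer): pivot does the same
# reshaping without an aggregation function, so it raises an error if
# any (ID, Segment) pair is duplicated.
wide_pivot = (
    df.pivot(index="ID", columns="Segment", values="Var")
      .add_prefix("Segment")
      .rename_axis(None, axis=1)
      .reset_index()
)

print(wide)
print(wide_pivot)
```

The crosstab version with aggfunc='first' tolerates duplicated (ID, Segment) pairs by keeping the first Var it sees, while pivot and unstack refuse them, so which one fits depends on whether duplicates are expected in the data.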
