Adding dummy columns to the original dataframe
Question
I have a dataframe that looks like this:
JOINED_CO GENDER EXEC_FULLNAME GVKEY YEAR CONAME BECAMECEO REJOIN LEFTOFC LEFTCO RELEFT REASON PAGE
CO_PER_ROL
5622 NaN MALE Ira A. Eichner 1004 1992 AAR CORP 19550101 NaN 19961001 19990531 NaN RESIGNED 79
5622 NaN MALE Ira A. Eichner 1004 1993 AAR CORP 19550101 NaN 19961001 19990531 NaN RESIGNED 79
5622 NaN MALE Ira A. Eichner 1004 1994 AAR CORP 19550101 NaN 19961001 19990531 NaN RESIGNED 79
5622 NaN MALE Ira A. Eichner 1004 1995 AAR CORP 19550101 NaN 19961001 19990531 NaN RESIGNED 79
5622 NaN MALE Ira A. Eichner 1004 1996 AAR CORP 19550101 NaN 19961001 19990531 NaN RESIGNED 79
5622 NaN MALE Ira A. Eichner 1004 1997 AAR CORP 19550101 NaN 19961001 19990531 NaN RESIGNED 79
5622 NaN MALE Ira A. Eichner 1004 1998 AAR CORP 19550101 NaN 19961001 19990531 NaN RESIGNED 79
5623 NaN MALE David P. Storch 1004 1992 AAR CORP 19961009 NaN NaN NaN NaN NaN 57
5623 NaN MALE David P. Storch 1004 1993 AAR CORP 19961009 NaN NaN NaN NaN NaN 57
5623 NaN MALE David P. Storch 1004 1994 AAR CORP 19961009 NaN NaN NaN NaN NaN 57
5623 NaN MALE David P. Storch 1004 1995 AAR CORP 19961009 NaN NaN NaN NaN NaN 57
5623 NaN MALE David P. Storch 1004 1996 AAR CORP 19961009 NaN NaN NaN NaN NaN 57
For each value in YEAR, I would like to add year columns (1993, 1994, ..., 2009) to the original dataframe. If the value in YEAR is 1992, then the value in the 1992 column should be 1, otherwise 0.
I used a very naive for loop, but it seems to run forever because my dataset is large. Could anyone help me with this? Thanks a lot!
Recommended answer
In [77]: df = pd.concat([df, pd.get_dummies(df['YEAR'])], axis=1); df
Out[77]:
JOINED_CO GENDER EXEC_FULLNAME GVKEY YEAR CONAME BECAMECEO \
5622 NaN MALE Ira A. Eichner 1004 1992 AAR CORP 19550101
5622 NaN MALE Ira A. Eichner 1004 1993 AAR CORP 19550101
5622 NaN MALE Ira A. Eichner 1004 1994 AAR CORP 19550101
5622 NaN MALE Ira A. Eichner 1004 1995 AAR CORP 19550101
5622 NaN MALE Ira A. Eichner 1004 1996 AAR CORP 19550101
5622 NaN MALE Ira A. Eichner 1004 1997 AAR CORP 19550101
5622 NaN MALE Ira A. Eichner 1004 1998 AAR CORP 19550101
5623 NaN MALE David P. Storch 1004 1992 AAR CORP 19961009
5623 NaN MALE David P. Storch 1004 1993 AAR CORP 19961009
5623 NaN MALE David P. Storch 1004 1994 AAR CORP 19961009
5623 NaN MALE David P. Storch 1004 1995 AAR CORP 19961009
5623 NaN MALE David P. Storch 1004 1996 AAR CORP 19961009
REJOIN LEFTOFC LEFTCO RELEFT REASON PAGE 1992 1993 1994 \
5622 NaN 19961001 19990531 NaN RESIGNED 79 1 0 0
5622 NaN 19961001 19990531 NaN RESIGNED 79 0 1 0
5622 NaN 19961001 19990531 NaN RESIGNED 79 0 0 1
5622 NaN 19961001 19990531 NaN RESIGNED 79 0 0 0
5622 NaN 19961001 19990531 NaN RESIGNED 79 0 0 0
5622 NaN 19961001 19990531 NaN RESIGNED 79 0 0 0
5622 NaN 19961001 19990531 NaN RESIGNED 79 0 0 0
5623 NaN NaN NaN NaN NaN 57 1 0 0
5623 NaN NaN NaN NaN NaN 57 0 1 0
5623 NaN NaN NaN NaN NaN 57 0 0 1
5623 NaN NaN NaN NaN NaN 57 0 0 0
5623 NaN NaN NaN NaN NaN 57 0 0 0
1995 1996 1997 1998
5622 0 0 0 0
5622 0 0 0 0
5622 0 0 0 0
5622 1 0 0 0
5622 0 1 0 0
5622 0 0 1 0
5622 0 0 0 1
5623 0 0 0 0
5623 0 0 0 0
5623 0 0 0 0
5623 1 0 0 0
5623 0 1 0 0
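The session above can be reproduced on a small made-up frame. The sketch below is only an illustration: the two-column data is hypothetical and stands in for the much wider frame in the question. It also passes dtype=int, on the assumption that 0/1 integers are wanted; recent pandas versions return boolean indicator columns by default.

```python
import pandas as pd

# Hypothetical miniature of the question's data; only YEAR matters here.
df = pd.DataFrame(
    {
        "EXEC_FULLNAME": ["Ira A. Eichner"] * 3 + ["David P. Storch"] * 2,
        "YEAR": [1992, 1993, 1994, 1992, 1993],
    }
)

# One indicator column per distinct YEAR value.
# dtype=int requests 0/1 integers instead of booleans.
dummies = pd.get_dummies(df["YEAR"], dtype=int)

# Append the indicator columns to the right of the original columns.
result = pd.concat([df, dummies], axis=1)
print(result)
```

Because get_dummies is vectorized, it avoids the row-by-row Python loop mentioned in the question. Note that the new column labels are the integer year values themselves, so a column is selected with result[1992], not result["1992"].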
If you'd like to delete the YEAR column, you could follow this up with del df['YEAR']. Or, drop the YEAR column from df before calling concat:
df = pd.concat([df.drop('YEAR', axis=1), pd.get_dummies(df['YEAR'])], axis=1)
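As a possible alternative to the drop-then-concat line, get_dummies can also be given the whole frame together with a columns argument. It then encodes the named column and leaves it out of the result in a single call. This is a sketch on hypothetical data, not part of the original answer. The new columns are named with the source column as a prefix (YEAR_1992 and so on) rather than with the bare year.

```python
import pandas as pd

# Hypothetical stand-in for the question's frame.
df = pd.DataFrame({"GVKEY": [1004, 1004, 1004], "YEAR": [1992, 1993, 1992]})

# columns=["YEAR"] encodes that column and omits the original from the result;
# each new column is labelled "<source column>_<value>".
encoded = pd.get_dummies(df, columns=["YEAR"], dtype=int)
print(encoded.columns.tolist())
```

Which form is preferable depends on whether bare year labels (as in the concat version) or prefixed string labels are more convenient downstream.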