Pandas dataframe: how do I split one row into multiple rows by multi-value column?
Question
I have a dataframe that appears as follows:
   issue_key        date   pkey                         component  case_count
0       1060  2018-03-08   PROJ  console,configuration,management           8
1       1464  2018-04-24  PROJ2                          protocol           1
2        611  2017-03-31   PROJ                              None           2
3       2057  2018-10-30   PROJ                       ha, console           0
I need to split the rows with multiple values in the component column into one row per component.
When done, the dataframe should appear as follows:
   issue_key        date   pkey      component  case_count
0       1060  2018-03-08   PROJ        console           8
1       1060  2018-03-08   PROJ  configuration           8
2       1060  2018-03-08   PROJ     management           8
3       1464  2018-04-24  PROJ2       protocol           1
4        611  2017-03-31   PROJ           None           2
5       2057  2018-10-30   PROJ             ha           0
6       2057  2018-10-30   PROJ        console           0
Any suggestions on how best to do this?
Recommended answer
Let's say dd is your data frame. You can do:
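import pandas as pd

# split each component string into a list, trimming spaces around the commas
dd['component'] = dd['component'].str.split(r'\s*,\s*')
# expand each list into columns (one per item), then stack them into rows
dd = (dd
 .set_index(['issue_key','date','pkey','case_count'])['component']
 .apply(pd.Series)
 .stack()
 .dropna()                    # discard NaN padding left by shorter lists
 .reset_index()
 .drop('level_4', axis=1)     # helper index level created by stack()
 .rename(columns={0:'component'}))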
   issue_key        date   pkey  case_count      component
0       1060  2018-03-08   PROJ           8        console
1       1060  2018-03-08   PROJ           8  configuration
2       1060  2018-03-08   PROJ           8     management
3       1464  2018-04-24  PROJ2           1       protocol
4        611  2017-03-31   PROJ           2           None
5       2057  2018-10-30   PROJ           0             ha
6       2057  2018-10-30   PROJ           0        console
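As a possible alternative, the same reshaping can be sketched with DataFrame.explode, assuming pandas 0.25 or later. This is not part of the answer above, only a shorter route to the same rows. The sample frame is rebuilt from the question for illustration, and treating "None" as a plain string is an assumption, since the question does not say whether it is a string or a missing value.

```python
import pandas as pd

# Sample data rebuilt from the question (illustrative only).
# Assumption: "None" is stored as an ordinary string, not a missing value.
dd = pd.DataFrame({
    "issue_key": [1060, 1464, 611, 2057],
    "date": ["2018-03-08", "2018-04-24", "2017-03-31", "2018-10-30"],
    "pkey": ["PROJ", "PROJ2", "PROJ", "PROJ"],
    "component": ["console,configuration,management", "protocol", "None", "ha, console"],
    "case_count": [8, 1, 2, 0],
})

result = (
    # turn each comma-separated string into a list, trimming spaces around commas
    dd.assign(component=dd["component"].str.split(r"\s*,\s*"))
      # emit one row per list element; the other columns are repeated
      .explode("component")
      # replace the repeated original index with a fresh 0..n-1 index
      .reset_index(drop=True)
)
print(result)
```

Unlike the stack-based version, this keeps the original column order and leaves dd itself unchanged. If the component column holds real missing values rather than the string "None", explode should carry each of them through as a single row instead of dropping it.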