How to split a column into three columns in pandas
Question
I have a data frame as shown below:
ID Name Address
1 Kohli Country: India; State: Delhi; Sector: SE25
2 Sachin Country: India; State: Mumbai; Sector: SE39
3 Ponting Country: Australia; State: Tasmania
4 Ponting State: Tasmania; Sector: SE27
From the above I would like to produce the data frame below:
ID Name Country State Sector
1 Kohli India Delhi SE25
2 Sachin India Mumbai SE39
3 Ponting Australia Tasmania None
4 Ponting None Tasmania SE27
I tried the following code:
df[['Country', 'State', 'Sector']] = pd.DataFrame(df['Address'].str.split(';', n=2).tolist(),
                                                  columns=['Country', 'State', 'Sector'])
But with this I still have to clean the data afterwards by slicing each column. Is there an easier way to do this?
Recommended answer
Use a list comprehension with a dict comprehension to build a list of dictionaries, then pass it to the DataFrame constructor:
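# Parse each address into a {key: value} dict, e.g. {'Country': 'India', ...}
L = [dict(y.split(': ') for y in x.split('; '))
     for x in df.pop('Address')]
# Keys missing from a row become NaN when the list of dicts is converted
df = df.join(pd.DataFrame(L, index=df.index))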
print(df)
ID Name Country State Sector
0 1 Kohli India Delhi SE25
1 2 Sachin India Mumbai SE39
2 3 Ponting Australia Tasmania NaN
3 4 Ponting NaN Tasmania SE27
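For a version that can be run end to end, here is a minimal sketch of the same list-of-dictionaries idea. The sample frame is rebuilt from the question, `parse_address` is a hypothetical helper name (not part of pandas), and the `reindex` step is an added assumption that the three target columns should always exist in a fixed order.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to match the asker's frame).
df = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Name": ["Kohli", "Sachin", "Ponting", "Ponting"],
    "Address": [
        "Country: India; State: Delhi; Sector: SE25",
        "Country: India; State: Mumbai; Sector: SE39",
        "Country: Australia; State: Tasmania",
        "State: Tasmania; Sector: SE27",
    ],
})


def parse_address(text):
    """Hypothetical helper: turn 'Key: value; Key: value' into a dict."""
    # Skip fragments without a colon so malformed pieces do not raise.
    pairs = (item.split(":", 1) for item in text.split(";") if ":" in item)
    return {key.strip(): value.strip() for key, value in pairs}


# One dict per row; keys absent from a row become NaN in the new frame.
parsed = pd.DataFrame([parse_address(a) for a in df["Address"]], index=df.index)

# Assumption: force the three expected columns to exist, in this order.
parsed = parsed.reindex(columns=["Country", "State", "Sector"])

result = df.drop(columns="Address").join(parsed)
print(result)
```

Splitting on `;` and `:` and then stripping whitespace, rather than splitting on `'; '` and `': '`, is a design choice that tolerates inconsistent spacing in the address strings.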
Or use split together with reshaping by stack:
df1 = (df.pop('Address')
.str.split('; ', expand=True)
.stack()
.reset_index(level=1, drop=True)
.str.split(': ', expand=True)
.set_index(0, append=True)[1]
.unstack()
)
print(df1)
0 Country Sector State
0 India SE25 Delhi
1 India SE39 Mumbai
2 Australia NaN Tasmania
3 NaN SE27 Tasmania
df = df.join(df1)
print(df)
ID Name Country Sector State
0 1 Kohli India SE25 Delhi
1 2 Sachin India SE39 Mumbai
2 3 Ponting Australia NaN Tasmania
3 4 Ponting NaN SE27 Tasmania
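If the set of keys is known in advance, a regex-based variant may also work. This is not one of the answer's two methods but a sketch of a different technique, using `Series.str.extract` once per key; the `keys` list and the pattern are assumptions about the address format (each entry written as `Key: value` and separated by `;`). Unlike the stack/unstack approach, it keeps the columns in the order listed.

```python
import re

import pandas as pd

# Sample data rebuilt from the question (assumed to match the asker's frame).
df = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Name": ["Kohli", "Sachin", "Ponting", "Ponting"],
    "Address": [
        "Country: India; State: Delhi; Sector: SE25",
        "Country: India; State: Mumbai; Sector: SE39",
        "Country: Australia; State: Tasmania",
        "State: Tasmania; Sector: SE27",
    ],
})

keys = ["Country", "State", "Sector"]  # assumed fixed set of keys

for key in keys:
    # Match the key at the start of the string or right after a ';',
    # then capture everything up to the next ';' (or the end of the string).
    pattern = rf"(?:^|;\s*){re.escape(key)}:\s*([^;]+)"
    # expand=False returns a Series; rows without the key become NaN.
    df[key] = df["Address"].str.extract(pattern, expand=False).str.strip()

df = df.drop(columns="Address")
print(df)
```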