How to split a column into three columns in pandas


Problem description

I have a data frame as shown below:

ID  Name     Address
1   Kohli    Country: India; State: Delhi; Sector: SE25
2   Sachin   Country: India; State: Mumbai; Sector: SE39
3   Ponting  Country: Australia; State: Tasmania 
4   Ponting  State: Tasmania; Sector: SE27

From the above, I would like to build the data frame below:

ID  Name     Country   State     Sector
1   Kohli    India     Delhi     SE25
2   Sachin   India     Mumbai    SE39
3   Ponting  Australia Tasmania  None
4   Ponting  None      Tasmania  SE27

I tried the following code:

df[['Country', 'State', 'Sector']] = pd.DataFrame(df['Address'].str.split(';', n=2).tolist(),
                                   columns = ['Country', 'State', 'Sector'])

But after this I still have to clean the data by slicing each column. Is there an easier way to do this?

Recommended answer

Use a list comprehension with a dict comprehension to build a list of dictionaries, then pass it to the DataFrame constructor:

L = [{k:v for y in x.split('; ')  for k, v in dict([y.split(': ')]).items()} 
          for x in df.pop('Address')]

df = df.join(pd.DataFrame(L, index=df.index))
print (df)
   ID     Name    Country     State Sector
0   1    Kohli      India     Delhi   SE25
1   2   Sachin      India    Mumbai   SE39
2   3  Ponting  Australia  Tasmania    NaN
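
For reference, the same idea can be written out as a self-contained sketch. The sample frame is rebuilt from the question (all four rows). The helper name `parse_address` is hypothetical, and stripping whitespace around keys and values is an added assumption, meant to tolerate stray spaces such as the one at the end of row 3.

```python
import pandas as pd

# Sample data rebuilt from the question (all four rows).
df = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Name": ["Kohli", "Sachin", "Ponting", "Ponting"],
    "Address": [
        "Country: India; State: Delhi; Sector: SE25",
        "Country: India; State: Mumbai; Sector: SE39",
        "Country: Australia; State: Tasmania ",  # trailing space kept from the question
        "State: Tasmania; Sector: SE27",
    ],
})


def parse_address(text):
    """Turn 'Key: value; Key: value' into a dict (hypothetical helper)."""
    parsed = {}
    for part in text.split(";"):
        key, sep, value = part.partition(":")
        if sep:  # skip fragments that do not look like 'key: value'
            parsed[key.strip()] = value.strip()
    return parsed


records = [parse_address(x) for x in df["Address"]]
# Pin the column order explicitly; keys missing from a row become NaN.
parts = pd.DataFrame(records, index=df.index,
                     columns=["Country", "State", "Sector"])
result = df.drop(columns="Address").join(parts)
print(result)
```

Passing `columns=` to the constructor fixes the column order and ensures all three columns exist even when no row mentions one of them.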

Or use split together with reshaping by stack:

df1 = (df.pop('Address')
         .str.split('; ', expand=True)
         .stack()
         .reset_index(level=1, drop=True)
         .str.split(': ', expand=True)
         .set_index(0, append=True)[1]
         .unstack()
         )
print (df1)
0    Country Sector     State
0      India   SE25     Delhi
1      India   SE39    Mumbai
2  Australia    NaN  Tasmania

df = df.join(df1)
print (df)
   ID     Name    Country Sector     State
0   1    Kohli      India   SE25     Delhi
1   2   Sachin      India   SE39    Mumbai
2   3  Ponting  Australia    NaN  Tasmania
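
The chained version is dense, so here is a step-by-step sketch of the same reshaping with the intermediate results named. The variable names (`wide`, `long`, `pairs`, `tidy`) are illustrative only, the sample data is rebuilt from the question without the stray trailing space, and the explicit `dropna()` is an added safeguard in case the installed pandas version keeps the padded cells when stacking.

```python
import pandas as pd

# Address column rebuilt from the question (all four rows).
address = pd.Series([
    "Country: India; State: Delhi; Sector: SE25",
    "Country: India; State: Mumbai; Sector: SE39",
    "Country: Australia; State: Tasmania",
    "State: Tasmania; Sector: SE27",
], name="Address")

# 1. One column per 'key: value' fragment; shorter rows are padded with missing values.
wide = address.str.split("; ", expand=True)

# 2. Move the fragments into one long Series, discard the padding,
#    and drop the fragment position from the index so only the row label remains.
long = wide.stack().dropna().reset_index(level=1, drop=True)

# 3. Split each fragment into its key (column 0) and value (column 1).
pairs = long.str.split(": ", expand=True)

# 4. Append the key as a second index level, then pivot that level into columns.
tidy = pairs.set_index(0, append=True)[1].unstack()
print(tidy)
```

Because `unstack()` sorts the new column labels, the columns come out in alphabetical order (Country, Sector, State) rather than in the order they appear in the text.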
