Pandas Dataframe:将列拆分为多列,右对齐不一致的单元格条目 [英] Pandas Dataframe: split column into multiple columns, right-align inconsistent cell entries

查看:368
本文介绍了Pandas Dataframe:将列拆分为多列,右对齐不一致的单元格条目的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个熊猫数据框,其中有一列名为城市,州,国家/地区".我想将此列分为三个新列:城市",州"和国家".

I have a pandas dataframe with a column named 'City, State, Country'. I want to separate this column into three new columns, 'City, 'State' and 'Country'.

0                 HUN
1                 ESP
2                 GBR
3                 ESP
4                 FRA
5             ID, USA
6             GA, USA
7    Hoboken, NJ, USA
8             NJ, USA
9                 AUS

将列分为三列就足够了:

Splitting the column into three columns is trivial enough:

location_df = df['City, State, Country'].apply(lambda x: pd.Series(x.split(',')))

但是,这会创建左对齐数据:

However, this creates left-aligned data:

     0       1       2
0    HUN     NaN     NaN
1    ESP     NaN     NaN
2    GBR     NaN     NaN
3    ESP     NaN     NaN
4    FRA     NaN     NaN
5    ID      USA     NaN
6    GA      USA     NaN
7    Hoboken  NJ     USA
8    NJ      USA     NaN
9    AUS     NaN     NaN

如何将数据右对齐来创建新列?我是否需要遍历每一行,计算逗号的数量并分别处理内容?

How would one go about creating the new columns with the data right-aligned? Would I need to iterate through every row, count the number of commas and handle the contents individually?

推荐答案

我将执行以下操作:

foo = lambda x: pd.Series([i for i in reversed(x.split(','))])
rev = df['City, State, Country'].apply(foo)
print rev

      0    1        2
0   HUN  NaN      NaN
1   ESP  NaN      NaN
2   GBR  NaN      NaN
3   ESP  NaN      NaN
4   FRA  NaN      NaN
5   USA   ID      NaN
6   USA   GA      NaN
7   USA   NJ  Hoboken
8   USA   NJ      NaN
9   AUS  NaN      NaN

我认为这可以为您提供所需的东西,但是如果您还想对东西进行装饰并获得城市,州,国家"列的顺序,则可以添加以下内容:

I think that gets you what you want but if you also want to pretty things up and get a City, State, Country column order, you could add the following:

rev.rename(columns={0:'Country',1:'State',2:'City'},inplace=True)
rev = rev[['City','State','Country']]
print rev

     City State Country
0      NaN   NaN     HUN
1      NaN   NaN     ESP
2      NaN   NaN     GBR
3      NaN   NaN     ESP
4      NaN   NaN     FRA
5      NaN    ID     USA
6      NaN    GA     USA
7  Hoboken    NJ     USA
8      NaN    NJ     USA
9      NaN   NaN     AUS

这篇关于Pandas Dataframe:将列拆分为多列,右对齐不一致的单元格条目的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆