Split column into separate columns based on separator strings


Problem description

For example, we have a CSV file:

name           age  address
john           25   koramangala banglore #@ sales maneger %$ india
harshuth rao   36   belandur banglore #@ maneger %$ india
vijay kumar    45   ulsoor banglore #@ sales maneger %$ india
suhas          25   koramangala banglore #@analist %$ india
mithun         22   venkatapura banglore #@ execitive %$ india


How can I split the address column into separate columns, like this?

name           age  city                  country     position 
john           25   koramangala banglore  india       sales maneger
harshuth rao   36   belandur banglore     india       maneger
vijay kumar    45   ulsoor banglore       india       sales maneger
suhas          25   koramangala banglore  india       analist
mithun         22   venkatapura banglore  india       execitive

The code I am using is:

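import re
import csv

# Python 3: open in text mode with newline="" as the csv module recommends
with open("/home/vipul/Desktop/example.csv", newline="") as f:
    mycsv = csv.reader(f)
    for row in mycsv:
        text = row[0]
        # collect runs of word characters and whitespace from the first field
        txt = re.findall(r'(\w+[\s\w]*)\b', text)
        print(txt)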

This is how the file looks in a text editor:

name ,age ,address
john,25,koramangala banglore +ACMAQA- sales maneger +ACUAJA- india
harshuth rao ,36,belandur banglore +ACMAQA-  maneger +ACUAJA- india 
vijay kumar,45,ulsoor banglore +ACMAQA- sales maneger +ACUAJA- india
suhas,25,koramangala banglore +ACMAQA-analist +ACUAJA- india
mithun,22,venkatapura banglore +ACMAQA- execitive +ACUAJA- india
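
A note on the +ACMAQA- and +ACUAJA- runs: they appear to be UTF-7 escapes for #@ and %$, which would explain why the spreadsheet view shows the symbols while the text editor shows the escapes. A minimal check, assuming the file really is UTF-7 encoded, using Python's built-in utf-7 codec on one line copied from the sample above:

```python
# One data line as it appears in the text editor (copied from the sample above).
raw = b"john,25,koramangala banglore +ACMAQA- sales maneger +ACUAJA- india"

# Assumption: the file is UTF-7 encoded. Decoding turns each +...- run
# back into the characters it stands for.
decoded = raw.decode("utf-7")
print(decoded)
```

If the decoded line shows #@ and %$, the file can be opened with encoding="utf-7" and split on the real symbols instead of the escaped forms.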

Recommended answer

Use pandas read_csv with a regular expression as the separator:

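import io
import pandas as pd

t = """name ,age , address
john,25,koramangala banglore +ACMAQA- sales maneger +ACUAJA- india
harshuth rao ,36,belandur banglore +ACMAQA-  maneger +ACUAJA- india 
vijay kumar,45,ulsoor banglore +ACMAQA- sales maneger +ACUAJA- india
suhas,25,koramangala banglore +ACMAQA-analist +ACUAJA- india
mithun,22,venkatapura banglore +ACMAQA- execitive +ACUAJA- india"""

# The separator is a regex: either marker string or a comma, each with optional
# surrounding whitespace. A regex separator needs engine='python'.
df = pd.read_csv(io.StringIO(t),
                 sep=r'\s*\+ACMAQA-\s*|\s*\+ACUAJA-\s*|\s*,\s*', engine='python')
# The header has 3 fields but each data row splits into 5, so pandas puts the
# extra leading columns into the index; reset_index turns them back into columns.
df = df.reset_index()
df.columns = ["name", "age", "city", "position", "country"]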


    name          age   city                    position        country
0   john           25   koramangala banglore    sales maneger   india
1   harshuth rao   36   belandur banglore       maneger         india
2   vijay kumar    45   ulsoor banglore         sales maneger   india
3   suhas          25   koramangala banglore    analist         india
4   mithun         22   venkatapura banglore    execitive       india
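
For comparison, the same split can be sketched with only the standard library. This is not the answer's method but an alternative under two assumptions: the file is UTF-7 encoded (so +ACMAQA- and +ACUAJA- decode to #@ and %$), and every address contains both markers in that order. The function name split_rows and the constant RAW are hypothetical names chosen for this illustration.

```python
import csv
import io
import re

# Sample lines from the question, as bytes with the UTF-7 escapes intact.
RAW = b"""name ,age ,address
john,25,koramangala banglore +ACMAQA- sales maneger +ACUAJA- india
harshuth rao ,36,belandur banglore +ACMAQA-  maneger +ACUAJA- india
vijay kumar,45,ulsoor banglore +ACMAQA- sales maneger +ACUAJA- india
suhas,25,koramangala banglore +ACMAQA-analist +ACUAJA- india
mithun,22,venkatapura banglore +ACMAQA- execitive +ACUAJA- india"""


def split_rows(raw_bytes):
    """Return one dict per data row with the address split into three fields."""
    # Assumption: the input is UTF-7, so decoding restores "#@" and "%$".
    text = raw_bytes.decode("utf-7")
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    rows = []
    for name, age, address in reader:
        # Split the address on either marker, then trim stray whitespace.
        city, position, country = (
            part.strip() for part in re.split(r"#@|%\$", address)
        )
        rows.append({
            "name": name.strip(),
            "age": int(age),
            "city": city,
            "country": country,
            "position": position,
        })
    return rows


for row in split_rows(RAW):
    print(row)
```

The dictionaries can be written back out with csv.DictWriter or passed to pandas.DataFrame if a table is needed. A row missing either marker would raise a ValueError at the unpacking step, which makes malformed input visible rather than silently misaligned.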
