How do I parse a csv with pandas that has a comma delimiter and space?

Question

I currently have the following data.csv, which uses a comma delimiter:

name,day
Chicken Sandwich,Wednesday
Pesto Pasta,Thursday
Lettuce, Tomato & Onion Sandwich,Friday
Lettuce, Tomato & Onion Pita,Friday
Soup,Saturday

The parser script is:

import pandas as pd


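# Note: error_bad_lines was deprecated in pandas 1.3 and removed in 2.0;
# newer versions use on_bad_lines='skip' (or 'warn') instead.
df = pd.read_csv('data.csv', delimiter=',', error_bad_lines=False, index_col=False)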
print(df.head(5))

The output is:

Skipping line 4: expected 2 fields, saw 3
Skipping line 5: expected 2 fields, saw 3

               name        day
0  Chicken Sandwich  Wednesday
1       Pesto Pasta   Thursday
2              Soup   Saturday
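
To see why lines 4 and 5 are skipped, here is a minimal sketch that counts the fields a plain comma split produces on each line. It uses the standard csv module rather than pandas, and the file contents are inlined as a string (the name raw is just for illustration):

```python
import csv
from io import StringIO

# Same content as data.csv above, inlined so the sketch is self-contained.
raw = """name,day
Chicken Sandwich,Wednesday
Pesto Pasta,Thursday
Lettuce, Tomato & Onion Sandwich,Friday
Lettuce, Tomato & Onion Pita,Friday
Soup,Saturday
"""

# Count how many fields each line yields when every comma is a delimiter.
field_counts = [len(row) for row in csv.reader(StringIO(raw))]
print(field_counts)  # [2, 2, 2, 3, 3, 2]
```

The two lines with three fields are the ones where the comma inside the item name is treated as a delimiter.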

How do I handle a case like Lettuce, Tomato & Onion Sandwich? Items are separated by a comma, but an item may itself contain a comma followed by a space. The desired output is:

                               name        day
0                  Chicken Sandwich  Wednesday
1                       Pesto Pasta   Thursday
2  Lettuce, Tomato & Onion Sandwich     Friday
3      Lettuce, Tomato & Onion Pita     Friday
4                              Soup   Saturday

Recommended answer

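An alternative that works in other situations too. OK, it's ugly: it rewrites each line so that only the commas not followed by a space become the delimiter (here a pipe), then lets pandas read the rewritten text.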

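import pandas as pd
from io import StringIO

# In-memory buffer that holds the rewritten lines for pandas to read.
for_pd = StringIO()
with open('data.csv') as infile:
    for line in infile:
        # Protect ', ' inside items with a placeholder, mark the remaining
        # (delimiter) commas, restore ', ', then turn the marks into '|'.
        line = line.rstrip().replace(', ', '|||').replace(',', '```').replace('|||', ', ').replace('```', '|')
        print(line, file=for_pd)
# Rewind so read_csv starts at the beginning of the buffer.
for_pd.seek(0)

df = pd.read_csv(for_pd, sep='|')

print(df)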

Result:

                               name        day
0                  Chicken Sandwich  Wednesday
1                       Pesto Pasta   Thursday
2  Lettuce, Tomato & Onion Sandwich     Friday
3      Lettuce, Tomato & Onion Pita     Friday
4                              Soup   Saturday
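
A shorter variant of the same idea, offered as a sketch and not as part of the original answer: pass a regular expression as the separator so that only a comma not followed by a space counts as a delimiter. Regex separators require engine='python'. The data is inlined as a string here (raw is an illustrative name) so the example runs on its own:

```python
import pandas as pd
from io import StringIO

# Same content as data.csv in the question, inlined for a self-contained run.
raw = """name,day
Chicken Sandwich,Wednesday
Pesto Pasta,Thursday
Lettuce, Tomato & Onion Sandwich,Friday
Lettuce, Tomato & Onion Pita,Friday
Soup,Saturday
"""

# ',(?! )' matches a comma only when the next character is not a space,
# so the ', ' inside an item name is left alone.
df = pd.read_csv(StringIO(raw), sep=r",(?! )", engine="python")
print(df)
```

Like the line-rewriting approach, this assumes that a delimiter comma is never followed by a space and that a comma inside an item always is.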
