Ruby: How can I process a CSV file with "bad commas"?

Problem description

I need to process a CSV file from FedEx.com containing shipping history. Unfortunately, FedEx doesn't seem to actually test its CSV files, as it doesn't quote strings that have commas in them.

For instance, a company name might be "Dog Widgets, Inc." but the CSV doesn't quote that string, so any CSV parser treats the comma before "Inc." as the start of a new field.

Is there any way I can reliably parse those rows using Ruby?

The only differentiating characteristic I can find is that commas that are part of a string have a space after them, while commas that separate fields do not. I have no clue how that helps me parse this, but it is something I noticed.

Recommended answer

Well, here's an idea: you could replace each instance of comma-followed-by-a-space with a unique character, then parse the CSV as usual, then go through the resulting rows and reverse the replacement.
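
That idea could look roughly like the sketch below, which uses Ruby's standard csv library. The method name parse_fedex_csv, the PLACEHOLDER constant, the choice of "\u0001" as the unique character, and the sample data are all assumptions made for illustration; nothing here comes from FedEx's actual file layout. It relies on the observation from the question that only in-string commas are followed by a space, and on the placeholder never appearing in the real data.

```ruby
require "csv"

# Hypothetical placeholder: any character guaranteed not to occur in the data.
PLACEHOLDER = "\u0001"

# Parses CSV text whose unquoted fields may contain ", " (comma + space).
# Assumes real field separators are never followed by a space.
def parse_fedex_csv(text)
  # Step 1: hide every comma-followed-by-a-space from the parser.
  masked = text.gsub(", ", PLACEHOLDER)

  # Step 2: parse as usual, then restore ", " in each field.
  # Empty fields come back as nil, hence the safe navigation operator.
  CSV.parse(masked).map do |row|
    row.map { |field| field&.gsub(PLACEHOLDER, ", ") }
  end
end

# Made-up sample rows, not real FedEx output.
sample = "Tracking,Company,City\n123,Dog Widgets, Inc.,Springfield\n"
rows = parse_fedex_csv(sample)
p rows[1] # prints ["123", "Dog Widgets, Inc.", "Springfield"]
```

If a field ever starts with a space right after a real separator, this approach will merge it with the previous field, so it is worth spot-checking the column count of each parsed row against the header.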
