Quotes in tab-delimited file

Problem description

I've got a simple application that opens a tab-delimited text file, and inserts that data into a database.

I'm using this CSV reader to read the data: http://www.codeproject.com/KB/database/CsvReader.aspx

And it is all working just fine!

Now my client has added a new field to the end of the file, "ClaimDescription", and in some of these claim descriptions the data has quotes in it, for example:

"SUMISEI MARU NO 2" - sea of Japan

This seems to be causing a major headache for my app. I get an exception which looks like this:

The CSV appears to be corrupt near record '1470' field '26 at position '181'. Current raw data : ...

And in that "raw data", sure enough the claim description field shows data with quotes in it.

I want to know if anyone has had this problem before and got round it. Obviously I can ask the client to change the data they send me, but they generate the tab-delimited file with an automated process, so I'd rather keep that as a last resort.

I was thinking I could maybe open the file using a standard TextReader beforehand, escape any quotes, write the content back into a new file, then feed that file into the CSV reader. It is probably worth mentioning that the average size of these tab-delimited files is around 40MB.
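
For illustration, the pre-processing pass described above could look something like the sketch below. It is written in Python rather than .NET, and the helper names (`escape_field`, `escape_quotes`) are made up for this example. It assumes that "escaping" means the usual CSV convention of doubling embedded quotes and wrapping the field in quotes; whether the CSV reader in question accepts that form is not confirmed here. The file is processed one line at a time, so a 40MB input never has to be held in memory.

```python
import io


def escape_field(field: str) -> str:
    """Double embedded quotes and wrap the field in quotes (common CSV convention)."""
    if '"' in field:
        return '"' + field.replace('"', '""') + '"'
    return field


def escape_quotes(src, dst, delimiter="\t"):
    """Copy src to dst line by line, escaping quotes inside each field."""
    for line in src:
        body = line.rstrip("\r\n")
        ending = line[len(body):]  # keep the original line ending untouched
        fields = body.split(delimiter)
        dst.write(delimiter.join(escape_field(f) for f in fields) + ending)


# Small in-memory demo; the record number is only a placeholder value.
sample = io.StringIO('1470\t"SUMISEI MARU NO 2" - sea of Japan\n')
out = io.StringIO()
escape_quotes(sample, out)
```

With real files, `src` and `dst` would be two handles opened with `open(path, newline="")`, so that line endings pass through unchanged.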

Any help is greatly appreciated!

Cheers, Sean

Recommended answer

Right - after a late night of Red Bull and scratching my head, I eventually found the problem: it was commas in the "Claim_Description" field. I didn't even think about that because I was using a tab-delimited file, but as soon as I did a find and replace on all commas in the file it worked absolutely fine!

The next step is to find out how to replace those commas before processing.
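
One possible way to do that replacement is sketched below, in Python for illustration. The function name `replace_commas`, the replacement character, and the sample row are all assumptions made for this example, not part of the original setup. It streams the file line by line, which suits files of around 40MB, and it replaces every comma in the file, mirroring the manual find and replace described above.

```python
import io


def replace_commas(src, dst, replacement=" "):
    """Copy src to dst line by line, swapping every comma for `replacement`."""
    for line in src:
        dst.write(line.replace(",", replacement))


# Small in-memory demo with a made-up row.
sample = io.StringIO("A1\tFire, flood and theft\n")
out = io.StringIO()
replace_commas(sample, out, replacement=";")
```

Note that this changes the text of the descriptions themselves, so the replacement character should be one the database side can live with.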

Thanks again for all the suggestions.

Cheers, Sean
