Quotes in tab-delimited file

Problem description

I've got a simple application that opens a tab-delimited text file, and inserts that data into a database.

I'm using this CSV reader to read the data: http://www.codeproject.com/KB/database/CsvReader.aspx

This all worked fine!

Now my client has added a new field to the end of the file, "ClaimDescription", and in some of these claim descriptions the data contains quotes, for example:


"SUMISEI MARU NO 2" - sea of Japan

This seems to be causing a major headache for my app. I get an exception which looks like this:


The CSV appears to be corrupt near record '1470' field '26 at position '181'. Current raw data : ...

And in that "raw data", sure enough the claim description field shows data with quotes in it.

Has anyone had this problem before and got round it? Obviously I can ask the client to change the data they send me, but the tab-delimited file is generated by an automated process on their side, so I'd rather keep that as a last resort.

I was thinking I could maybe open the file with a standard TextReader beforehand, escape any quotes, write the content out to a new file, then feed that file into the CSV reader. It is probably worth mentioning that these tab-delimited files average around 40MB.
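For illustration, the pre-processing pass described above might look roughly like the sketch below. It is written in Python rather than .NET, and the names `escape_quotes` and `preprocess` are made up for this example. It assumes the reader accepts the common CSV convention of wrapping a field in quotes and doubling any quotes inside it, and that no field contains an embedded tab or line break. It works line by line, so a 40MB file never has to sit in memory at once.

```python
import io


def escape_quotes(line, delimiter="\t", quote='"'):
    """Wrap any field containing a quote in quotes and double the inner quotes."""
    body = line.rstrip("\r\n")
    ending = line[len(body):]  # keep the original line ending unchanged
    fields = []
    for field in body.split(delimiter):
        if quote in field:
            field = quote + field.replace(quote, quote * 2) + quote
        fields.append(field)
    return delimiter.join(fields) + ending


def preprocess(src, dst):
    """Stream src to dst one line at a time, escaping quotes on the way."""
    for line in src:
        dst.write(escape_quotes(line))


# Small in-memory demo using the example value from the question.
src = io.StringIO('1470\t"SUMISEI MARU NO 2" - sea of Japan\n')
dst = io.StringIO()
preprocess(src, dst)
print(dst.getvalue(), end="")
```

With real files, `src` and `dst` would be two file objects opened for reading and writing; whether the reader then parses the escaped field back to the original text depends on its quote settings, which are not confirmed here.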

Any help is greatly appreciated!

Cheers, Sean

Recommended answer

Right - after a late night of Red Bull and scratching my head, I eventually found the problem: it was commas in the "Claim_Description" field. I didn't even think about that because I was using a tab-delimited file, but as soon as I did a find and replace on all commas in the file it worked absolutely fine!

The next step is to find out how to replace those commas before processing.
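One possible way to do that replacement as an automated step is sketched below, in Python for brevity. The function name `replace_commas`, the choice of ";" as the replacement character, and the sample row are all assumptions made for this example. Note that it is lossy: the original commas cannot be told apart from the replacement afterwards.

```python
import io


def replace_commas(src, dst, replacement=";"):
    """Copy src to dst line by line, swapping every comma for `replacement`."""
    for line in src:
        dst.write(line.replace(",", replacement))


# Made-up sample row; real input would be the tab-delimited file itself.
src = io.StringIO("1470\tCollision, sea of Japan\n")
dst = io.StringIO()
replace_commas(src, dst)
print(dst.getvalue(), end="")
```

Because it streams one line at a time, the same loop should cope with files of around 40MB without loading them whole.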

Thanks again for all the suggestions.

Cheers, Sean
