Parse Delimited CSV in .NET


Problem description

I have a text file in a comma-separated format, with most fields enclosed in " marks. I am trying to get that into something I can enumerate through (a generic collection, for example). I don't have control over how the file is output nor the character it uses for the delimiter.

In this case, the fields are separated by a comma and text fields are enclosed in " marks. The problem I am running into is that some fields have quotation marks in them (e.g. 8" Tray), so the rest of the field is accidentally picked up as the next field. Numeric fields don't have quotes around them, but they do start with a + or a - sign (indicating a positive or negative number).

I was thinking of a RegEx, but my skills aren't that great so hopefully someone can come up with some ideas I can try. There are about 19,000 records in this file, so I am trying to do it as efficiently as possible. Here are a couple of example rows of data:

"00","000000112260   ","Pie Pumpkin                             ","RET","6.99 ","     ","ea ",+0000000006.99000
"00","000000304078   ","Pie Apple caramel                       ","RET","9.99 ","     ","ea ",+0000000009.99000
"00","StringValue here","8" Tray of Food                             ","RET","6.99 ","     ","ea ",-00000000005.3200

There are a lot more fields, but you can get the picture....

I am using VB.NET and I have a generic List set up to accept the data. I have tried using CSVReader and it seems to work well until you hit a record like the 3rd one (with a quote in the text field). If I could somehow get it to handle the additional quotes, then the CSVReader option would work great.

Thanks!

Solution

From www.developmentnow.com/blog/Parse+CSV+Files+In+C+And+ASPNET.aspx:

Encoding fileEncoding = GetFileEncoding(csvFile);
// Remove every double quote except those used as field delimiters, i.e. keep only
// quotes that sit next to a comma or a line break.
// Note: inside [...] the \^ and $ are literal characters, not anchors.
string fileContents = File.ReadAllText(csvFile, fileEncoding);
string fixedContents = Regex.Replace(fileContents, @"([^\^,\r\n])""([^$,\r\n])", @"$1$2");
using (CsvReader csv =
       new CsvReader(new StringReader(fixedContents), true))
{
       // ... parse the CSV
}
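
To see what the regex does on its own, here is a minimal sketch that applies the same pattern to a shortened version of the third sample row, without any CSV library. The class name QuoteFixer and the method RemoveStrayQuotes are made up for this illustration and are not part of CsvReader or of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, named here only for illustration.
public static class QuoteFixer
{
    // Same pattern as in the answer: a double quote is dropped when the characters
    // on both sides of it are something other than a comma or a line break, i.e.
    // when it sits inside a field instead of at the field's edge.
    // The \^ and $ inside the character classes are literal characters.
    private static readonly Regex StrayQuote =
        new Regex(@"([^\^,\r\n])""([^$,\r\n])");

    public static string RemoveStrayQuotes(string text)
    {
        // $1 and $2 put back the two neighbouring characters; only the quote is removed.
        return StrayQuote.Replace(text, "$1$2");
    }

    public static void Main()
    {
        // Shortened version of the third sample row from the question.
        string row = "\"00\",\"StringValue here\",\"8\" Tray of Food   \",\"RET\",-00000000005.3200";
        Console.WriteLine(RemoveStrayQuotes(row));
    }
}
```

The quote after the 8 has a digit on one side and a space on the other, so it is removed, while the quotes next to commas are kept. Two caveats with this pattern: an embedded quote that directly follows a ^ or directly precedes a $ is left alone, and when two embedded quotes are adjacent a single pass removes only one of them.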
