Regarding Java Split Command CSV File Parsing

Problem description

I have a CSV file in the format below. I run into a problem when the program reads either of the following CSV lines:

"D",abc"def,"","0429"292"0","11","IJ80","Feb10_1.txt-2","FILE RECORD","05/02/2010","04/03/2010","","1","-91","",""

"D","abc"def","","04292920","11","IJ80","Feb10_1.txt-2","FILE RECORD","05/02/2010","04/03/2010","","1","-91","",""

The split command below is meant to ignore commas inside double quotes. I took it from an earlier post.

String[] items = line.split(",(?=([^\"]*\"[^\"]*\")*[^\"]*$)", 15);
System.out.println("items.length" + items.length);

items.length is printed as 14 instead of 15. The value abc"def is not recognized as an individual field; it is incorrectly stored together with the previous field, so items[0] contains "D",abc"def. I want it to be stored as follows:

items[0] should be "D" and items[1] should be abc"def

The same issue happens when the value is "abc"def". I want it to be stored as:

items[0] should be "D" and items[1] should be "abc"def"

This split command does work correctly when the double quote inside a quoted field is doubled (for example, the line D,"abc""def",1).

How can I resolve this issue?

Recommended answer

I think you would be much better off writing a parser for the CSV files rather than trying to use a regular expression. Once you start dealing with CSV files that have carriage returns inside fields, the regex will probably fall apart. It would not take much code to write a simple while loop that goes through all the characters and splits up the data. A parser also makes it much easier than a regex to deal with "non-standard"* CSV files such as yours.

*I say non-standard because there isn't really an official standard for CSV, and when you deal with CSV files from many different systems you see lots of weird things, like the abc"def field shown above.
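
The character-by-character loop suggested above could be sketched roughly as follows. This is only one possible interpretation, not a definitive implementation: the class name LenientCsv and the method parseLine are hypothetical, and the quoting rules are assumptions chosen to fit the sample lines in the question. A field is treated as quoted only if it starts with a double quote; inside a quoted field, "" is an escaped quote, a quote followed by a comma or the end of the line closes the field, and any other quote is kept as literal text. Enclosing quotes are stripped from the returned values, which differs from what split returns.

```java
import java.util.ArrayList;
import java.util.List;

final class LenientCsv {

    /** Splits one CSV line into fields, tolerating stray double quotes. */
    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        int n = line.length();
        int i = 0;
        while (true) {
            field.setLength(0);
            if (i < n && line.charAt(i) == '"') {
                i++; // skip the opening quote
                while (i < n) {
                    char c = line.charAt(i);
                    if (c != '"') {
                        field.append(c);
                        i++;
                    } else if (i + 1 < n && line.charAt(i + 1) == '"') {
                        field.append('"'); // "" inside quotes is an escaped quote
                        i += 2;
                    } else if (i + 1 == n || line.charAt(i + 1) == ',') {
                        i++; // closing quote: followed by a comma or end of line
                        break;
                    } else {
                        field.append('"'); // stray quote (assumption: keep it as data)
                        i++;
                    }
                }
            } else {
                // Unquoted field: everything up to the next comma, quotes included.
                while (i < n && line.charAt(i) != ',') {
                    field.append(line.charAt(i));
                    i++;
                }
            }
            fields.add(field.toString());
            if (i >= n) {
                break;
            }
            i++; // skip the comma separating this field from the next
        }
        return fields;
    }

    public static void main(String[] args) {
        List<String> items = parseLine("\"D\",abc\"def,\"\",\"0429\"292\"0\",\"11\"");
        System.out.println(items.size()); // prints 5
        System.out.println(items.get(1)); // prints abc"def
    }
}
```

Under these rules a stray quote that happens to be followed directly by a comma inside a quoted field is still read as the end of that field, so such input remains ambiguous. The sketch also works on a single line at a time; fields containing line breaks would need the loop to read from the whole input rather than from one line.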
