Parsing comma-separated values containing quoted commas and newlines


Problem description

I have a string containing some special characters. The aim is to retrieve a String[] of the comma-separated fields on each line. The special character is the double quote ("): a quoted field can contain newlines (\n) and commas.

For example, the main string:
Alpha,Beta,Gama,"23-5-2013,TOM",TOTO,"Julie, KameL
Titi",God," timmy, tomy,tony,
tini".

You can see that there are newlines (\n) inside the quotes.

Can anyone help me parse this?

Thanks

More explanation

From the main string, I need to separate these fields:

Alpha
Beta
Gama
23-5-2013,TOM
TOTO
Julie,KameL,Titi
God
timmy, tomy,tony,tini

The problem is that for Julie,KameL,Titi there is a line break (\n) between KameL and Titi. Similarly, for timmy, tomy,tony,tini there is a line break (\n) between tony and tini.


New: this text is in a file, which must be read line by line:

Alpha,Beta Charli,Delta,Delta Echo ,Frank George,Henry
1234-5,"Ida, John
 ", 25/11/1964, 15/12/1964,"40,000,000.00",0.0975,2,"King, Lincoln 
 ",Mary / New York,123456
12543-01,"Ocean, Peter

Output (I want to remove the " lines):

Alpha
Beta Charli
Delta
Delta Echo
Frank George
Henry
1234-5
Ida
John
"
25/11/1964
15/12/1964
40,000,000.00
0.0975
2
King
Lincoln
"
Mary / New York
123456
12543-01
Ocean
Peter

Solution

Try this:

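import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sample input: quoted fields may contain commas and line breaks.
String source = "Alpha,Beta,Gama,\"23-5-2013,TOM\",TOTO,\"Julie, KameL\n"
              + "Titi\",God,\" timmy, tomy,tony,\n"
              + "tini\".";

// Group 2: an unquoted field. Group 3: the contents of a quoted field, quotes excluded.
Pattern p = Pattern.compile("(([^\"][^,]*)|\"([^\"]*)\"),?");
Matcher m = p.matcher(source);

while (m.find()) {
    // Drop line breaks embedded in the field before printing it.
    if (m.group(2) != null)
        System.out.println(m.group(2).replace("\n", ""));
    else if (m.group(3) != null)
        System.out.println(m.group(3).replace("\n", ""));
}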

If it matches a field without quotes, the result is returned in group 2. Quoted fields are returned in group 3, without the surrounding quotes. Hence I needed the distinction in the while block. You might find a prettier way.

Output:
Alpha
Beta
Gama
23-5-2013,TOM
TOTO
Julie, KameLTiti
God
timmy, tomy,tony,tini
.
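
As one possible "prettier way", here is a minimal sketch of a hand-written scanner that avoids the regex. It is not part of the original answer: the class name QuotedCsvSketch and the method parseFields are made up for illustration. It assumes that double quotes only delimit fields (doubled "" escapes are not handled), that a line break inside quotes is simply dropped, as replace("\n", "") does above, and that whitespace around each field is trimmed. Lines are passed in one at a time, so it should also fit the line-by-line reading requirement from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class QuotedCsvSketch {

    /**
     * Splits the given physical lines into fields. A comma or a line end
     * finishes a field only when the scanner is outside double quotes.
     */
    static List<String> parseFields(Iterable<String> lines) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false; // carried over from one line to the next

        for (String line : lines) {
            for (int i = 0; i < line.length(); i++) {
                char c = line.charAt(i);
                if (c == '"') {
                    // Quotes toggle the state and are not part of the value.
                    inQuotes = !inQuotes;
                } else if (c == ',' && !inQuotes) {
                    fields.add(current.toString().trim());
                    current.setLength(0);
                } else {
                    current.append(c);
                }
            }
            // A line end outside quotes closes the field; inside quotes the
            // field continues on the next line and the line break is dropped.
            if (!inQuotes) {
                fields.add(current.toString().trim());
                current.setLength(0);
            }
        }
        // Assumption: an unterminated quote at end of input still yields a field.
        if (inQuotes) {
            fields.add(current.toString().trim());
        }
        return fields;
    }

    public static void main(String[] args) {
        // The main string from the question, without the trailing period.
        List<String> lines = Arrays.asList(
                "Alpha,Beta,Gama,\"23-5-2013,TOM\",TOTO,\"Julie, KameL",
                "Titi\",God,\" timmy, tomy,tony,",
                "tini\"");
        for (String field : parseFields(lines)) {
            System.out.println(field);
        }
    }
}
```

For a file, the lines could come from BufferedReader.readLine() or Files.readAllLines(). Because the quote state is kept across lines, a quoted field that spans two lines comes out as a single field with no stray " entries.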
