Regular expressions to parse a CSV line
Problem description
Dear All,
Does anyone have a regular expression to parse a comma-delimited line in which
some fields optionally have string delimiters (text qualifiers)?
I am currently testing with this regular expression, and it works in almost
all my test cases. I found it on the internet in a C# solution.
,(?=([^\"]*"[^"]*")*(?![^"]*"))
However, it fails in some of my test cases and I am having difficulty
interpreting it.
The VB.NET function I used is:
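Imports System.Collections
Imports System.Text.RegularExpressions

Public Function parseCSVLine(ByVal sInputString As String) As ArrayList
    ' Match only commas that are followed by an even number of double quotes,
    ' i.e. commas outside a quoted field. Chr(34) is the double-quote character.
    Dim r As New Regex(",(?=([^\" & Chr(34) & "]*" & Chr(34) & "[^" & _
        Chr(34) & "]*" & Chr(34) & ")*(?![^" & Chr(34) & "]*" & Chr(34) & "))")
    Dim iStart As Integer, m As Match
    Dim oArrayList As New ArrayList()
    For Each m In r.Matches(sInputString)
        ' Everything between the previous separator and this one is a field
        oArrayList.Add(sInputString.Substring(iStart, m.Index - iStart))
        iStart = m.Index + 1
    Next
    ' The text after the last separator is the final field
    oArrayList.Add(sInputString.Substring(iStart, sInputString.Length - iStart))
    Return oArrayList
End Function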
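For anyone who finds the pattern hard to interpret, here is a hedged sketch of the same split written in Python (chosen only for illustration; the code in this thread is VB.NET). The names `SEPARATOR` and `split_csv_line` are hypothetical. The lookahead accepts a comma only when the rest of the line contains an even number of double quotes, which is taken to mean the comma is outside a quoted field. The escaped `\"` of the C# original is written as a plain `"`, and the group is made non-capturing; neither change should affect which commas match.

```python
import re

# A comma counts as a separator only if the text after it holds an even
# number of double quotes, i.e. the comma is not inside a quoted field.
SEPARATOR = re.compile(
    r'''
    ,                       # candidate separator
    (?=                     # look ahead without consuming anything
        (?:[^"]*"[^"]*")*   # any number of complete quote pairs
        (?![^"]*")          # and no further quote left over
    )
    ''',
    re.VERBOSE,
)


def split_csv_line(line):
    """Mirror the VB.NET loop: cut the line at every matching comma."""
    fields = []
    start = 0
    for match in SEPARATOR.finditer(line):
        fields.append(line[start:match.start()])
        start = match.start() + 1
    fields.append(line[start:])  # text after the last separator
    return fields


print(split_csv_line('a,b,c'))
print(split_csv_line('"a,b",c'))
```

Note that the split keeps the text qualifiers: the second call yields `"a,b"` with its quotes still attached, followed by `c`.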
My test cases are as follows:
#   CSV                       Value 1   Value 2           Value 3  Value 4  Results
1   a,b,c                     a         b                 c                 P
2   "a",b,c                   a         b                 c                 P
3   ''a'',b,c                 ''a''     b                 c                 P
4   a , b , c                 a         b                 c                 P
5   aa,bb;cc                  aa        bb;cc                               P
6                                                                           P
7   a                         a                                             P
8   ,b,                                 b                                   P
9   ,,c                                                   c                 P
10  ,,                                                                      P
11  "",b                                b                                   P
12  " ",b                     [SPACE]   b                                   P
13  "a,b"                     a,b                                           P
14  "a,b",c                   a,b       c                                   P
15  " a , b ", c              a , b     c                                   P
16  a b,c                     a b       c                                   P
17  a"b,c                     a"b       c                                   P
18  "a""b",c                  a"b       c                                   P
19  a""b,c                    a""b      c                                   P
20  a,b",c                    a         b"                c                 O
21  a,b"",c                   a         b""               c                 P
22  a,"B: ""Hi, I''m B""",c   a         B: "Hi, I''m B"   c                 P
23  a,"b,c                    a         "b                c                 O
24  a,bc"d,e                  a         bc"d              e                 O
25  a,bc"d",e                 a         bc"d"             e                 O
26  a,"bc"d,e                 a         "bc"d             e                 O
Many thanks,
Wazir
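The expected values in the table have the surrounding text qualifiers removed and doubled quotes collapsed (cases 2, 18 and 22), which a plain split on commas does not do by itself. A minimal sketch of such a post-processing step, in Python for illustration only: the helper name `unquote_field` is hypothetical, and the rule "a field is quoted only if it both starts and ends with a double quote" is an assumption inferred from the table, not something stated in the post.

```python
def unquote_field(field):
    """Strip one pair of enclosing double quotes and collapse "" to "."""
    # Assumption: only fields that start AND end with a quote are qualified.
    if len(field) >= 2 and field.startswith('"') and field.endswith('"'):
        return field[1:-1].replace('""', '"')
    # Anything else (a""b, bc"d", "bc"d, ...) is returned unchanged.
    return field


for raw in ('"a"', '"a""b"', 'a""b', '"bc"d', '""'):
    print(repr(raw), '->', repr(unquote_field(raw)))
```

Under this assumption, fields with stray quotes such as `a""b` or `"bc"d` pass through untouched, which is what the expected values for cases 19 and 26 show.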
Recommended answer
Apologies for the formatting of the test cases; I didn't realise I was posting in
plain text.
Here they are again, hopefully more readable this time.
Some of them, such as case 20, don't work with the regular expression.
#   CSV                       Value 1   Value 2           Value 3  Value 4
1   a,b,c                     a         b                 c
2   "a",b,c                   a         b                 c
3   ''a'',b,c                 ''a''     b                 c
4   a , b , c                 a         b                 c
5   aa,bb;cc                  aa        bb;cc
6
7   a                         a
8   ,b,                                 b
9   ,,c                                                   c
10  ,,
11  "",b                                b
12  " ",b                     [SPACE]   b
13  "a,b"                     a,b
14  "a,b",c                   a,b       c
15  " a , b ", c              a , b     c
16  a b,c                     a b       c
17  a"b,c                     a"b       c
18  "a""b",c                  a"b       c
19  a""b,c                    a""b      c
20  a,b",c                    a         b"                c
21  a,b"",c                   a         b""               c
22  a,"B: ""Hi, I''m B""",c   a         B: "Hi, I''m B"   c
23  a,"b,c                    a         "b                c
24  a,bc"d,e                  a         bc"d              e
25  a,bc"d",e                 a         bc"d"             e
26  a,"bc"d,e                 a         "bc"d             e
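A likely reason case 20 fails while case 21 passes is that the pattern only accepts a comma when an even number of double quotes follows it. A single stray quote makes every comma before it look as if it were inside a quoted field. The sketch below, in Python for illustration only and with a hypothetical helper name `separator_positions`, lists where the pattern finds separators in both lines.

```python
import re

# Same pattern as in the question, with the group made non-capturing.
SEPARATOR = re.compile(r',(?=(?:[^"]*"[^"]*")*(?![^"]*"))')


def separator_positions(line):
    """Return the index of every comma the pattern accepts as a separator."""
    return [m.start() for m in SEPARATOR.finditer(line)]


for line in ('a,b",c', 'a,b"",c'):
    # Print the line, how many quotes it has, and the accepted comma indexes.
    print(line, line.count('"'), separator_positions(line))
```

For `a,b",c` only the last comma is accepted, so the line splits into `a,b"` and `c` instead of three fields. With the quote doubled, as in `a,b"",c`, both commas are accepted.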
You can use the OleDbCommand class to read a CSV file.
Tu-Thac
www.ongtech.co
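The suggestion above amounts to handing the line to an existing CSV reader instead of a hand-written regular expression. As an illustration of that idea only, and not of the OleDb approach itself, here is a minimal sketch using Python's standard `csv` module; the wrapper name `parse_csv_line` is hypothetical.

```python
import csv


def parse_csv_line(line):
    """Parse one CSV line with the standard-library reader."""
    # csv.reader expects an iterable of lines, so wrap the single line in a list.
    return next(csv.reader([line]), [])


for line in ('a,b,c', '"a,b",c', '"a""b",c', 'a,"B: ""Hi, I\'\'m B""",c'):
    print(parse_csv_line(line))
```

With its default settings the reader removes the text qualifiers and collapses doubled quotes inside quoted fields, so no separate unquoting step is needed for well-formed lines.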