Use Regex to Split Numbered List array into Numbered List Multiline
Problem Description
I am trying to learn Regex to answer a question on SO Portuguese.
Input (an array or a string in a cell, so .MultiLine = False?)
1 One without dot. 2. Some Random String. 3.1 With SubItens. 3.2 With number 0n mid. 4. Number 9 incorrect. 11.12 More than one digit. 12.7 Ending (no word).
Output
1 One without dot.
2. Some Random String.
3.1 With SubItens.
3.2 With number 0n mid.
4. Number 9 incorrect.
11.12 More than one digit.
12.7 Ending (no word).
My idea was to use Regex with Split, but I wasn't able to implement the following example in Excel:
Imports System.Text.RegularExpressions

Module Example
    Public Sub Main()
        Dim input As String = "plum-pear"
        Dim pattern As String = "(-)"
        Dim substrings() As String = Regex.Split(input, pattern) ' Split on hyphens.
        For Each match As String In substrings
            Console.WriteLine("'{0}'", match)
        Next
    End Sub
End Module
' The method writes the following to the console:
' 'plum'
' '-'
' 'pear'
So, after reading this and this, I used the RegExr website with the expression /([0-9]{1,2})([.]{0,1})([0-9]{0,2})/igm on the input.
And the following is obtained:
Is there a better way to do this? Is the regex correct, or is there a better way to build it? The examples I found on Google didn't show me how to use RegEx with Split correctly.
Maybe I am confused about the logic of the Split function: I wanted to get the split indexes, with the regex acting as the separator string.
I can make sure that each item ends with a word and a period.
Use
\d+(?:\.\d+)*[\s\S]*?\w+\.
See the regex demo.
Details
\d+ - 1 or more digits
(?:\.\d+)* - zero or more sequences of:
    \. - a dot
    \d+ - 1 or more digits
[\s\S]*? - any 0+ chars, as few as possible, up to the first...
\w+\. - 1+ word chars followed with a literal .
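Here is sample VBA code:
Sub SplitNumberedList()
    Dim str As String
    Dim objRegExp As Object
    Dim objMatches As Object
    Dim m As Object

    str = " 1 One without dot. 2. Some Random String. 3.1 With SubItens. 3.2 With Another SubItem. 4. List item. 11.12 More than one digit."
    ' Late binding; with a reference to "Microsoft VBScript Regular Expressions 5.5" you can use New RegExp instead
    Set objRegExp = CreateObject("VBScript.RegExp")
    objRegExp.Pattern = "\d+(?:\.\d+)*[\s\S]*?\w+\."
    objRegExp.Global = True ' find every match, not only the first one
    Set objMatches = objRegExp.Execute(str)
    If objMatches.Count <> 0 Then
        For Each m In objMatches
            Debug.Print m.Value ' one list item per line
        Next
    End If
End Sub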
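To check the pattern outside Excel, the same matching loop can be sketched in Python. This is only an illustration and not part of the original answer: it assumes that Python's re module treats this particular pattern the same way VBScript.RegExp does for plain ASCII input, and the helper name split_numbered_list is hypothetical.

```python
import re

# Same pattern as in the VBA sample above.
ITEM_PATTERN = re.compile(r"\d+(?:\.\d+)*[\s\S]*?\w+\.")


def split_numbered_list(text):
    """Return every numbered item found in text (hypothetical helper)."""
    # findall returns whole matches because the pattern has no capturing groups.
    return ITEM_PATTERN.findall(text)


sample = (" 1 One without dot. 2. Some Random String. 3.1 With SubItens. "
          "3.2 With Another SubItem. 4. List item. 11.12 More than one digit.")

items = split_numbered_list(sample)
print("\n".join(items))  # one list item per line
```

Collecting the matches, rather than calling a split function, sidesteps the question of what the separator is: each match is already one complete list item.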
NOTE
You may require the matches to stop only at a word + . that is followed with 0+ whitespaces and a number (or the end of the string), using
\d+(?:\.\d+)*[\s\S]*?[a-zA-Z]+\.(?=\s*(?:\d+|$))
The (?=\s*(?:\d+|$)) positive lookahead requires the presence of 0+ whitespaces (\s*) followed with 1+ digits (\d+) or the end of string ($) immediately to the right of the current location.
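The effect of the lookahead shows up on an item that contains a period in the middle. The sketch below is in Python rather than VBA so it can be run anywhere; the sample text is made up for illustration, and it is assumed that Python's re module handles these two patterns the way VBScript.RegExp does for this input.

```python
import re

BASE = r"\d+(?:\.\d+)*[\s\S]*?\w+\."
WITH_LOOKAHEAD = r"\d+(?:\.\d+)*[\s\S]*?[a-zA-Z]+\.(?=\s*(?:\d+|$))"

# Hypothetical input: the first item has a period before its real end.
text = "1 See fig. A for details. 2. Next item."

base_items = re.findall(BASE, text)
lookahead_items = re.findall(WITH_LOOKAHEAD, text)

print(base_items)       # stops at the first "word." it meets
print(lookahead_items)  # stops only at a "word." followed by a number or the end
```

With the plain pattern the first item is cut at "fig." and the rest of that item is dropped. With the lookahead the item runs up to the period that precedes the next number.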