How to use Regular Expressions (Regex) in Microsoft Excel both in-cell and loops
Question
How can I use regular expressions in Excel and take advantage of Excel's powerful grid-like setup for data manipulation?
- In-cell function to return matched pattern or replaced value in string.
- Sub to loop through a column of data and extract matches to adjacent cells.
- What setup is necessary?
- What are Excel's special characters for regular expressions?
I understand Regex is not ideal for many situations (To use or not to use regular expressions?) since Excel can use Left, Mid, Right, Instr type commands for similar manipulations.
Solution
Regular expressions are used for pattern matching.
To use them in Excel, follow these steps:
Step 1: Add VBA reference to "Microsoft VBScript Regular Expressions 5.5"
- Select "Developer" tab (I don't have this tab what do I do?)
- Select "Visual Basic" icon from 'Code' ribbon section
- In "Microsoft Visual Basic for Applications" window select "Tools" from the top menu.
- Select "References"
- Check the box next to "Microsoft VBScript Regular Expressions 5.5" to include in your workbook.
- Click "OK"
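Once the reference is ticked, a quick check like the one below should compile and run from a standard module. It is only a sketch: the procedure name CheckRegexReference is made up for illustration, and the output goes to the Immediate window of the VBA editor.

```vba
' Hypothetical helper: confirms that the RegExp type from the
' "Microsoft VBScript Regular Expressions 5.5" reference is available.
Sub CheckRegexReference()
    ' This declaration fails to compile if the reference is missing.
    Dim regEx As New RegExp

    regEx.Pattern = "[0-9]"           ' any single digit, anywhere in the string
    Debug.Print regEx.Test("12abc")   ' True: the string contains a digit
    Debug.Print regEx.Test("abc")     ' False: no digit present
End Sub
```

If the first line raises "User-defined type not defined", the reference from Step 1 has not been added to this workbook.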
Step 2: Define your pattern
Basic definitions:
- Range (inside square brackets).
  - E.g. a-z matches any lower case letter from a to z
  - E.g. 0-5 matches any digit from 0 to 5
[] Match exactly one of the objects inside these brackets.
  - E.g. [a] matches the letter a
  - E.g. [abc] matches a single letter which can be a, b or c
  - E.g. [a-z] matches any single lower case letter of the alphabet.
() Groups different matches for return purposes. See the examples below.
{} Multiplier for repeated copies of the pattern defined before it.
  - E.g. [a]{2} matches two consecutive lower case letters a: aa
  - E.g. [a]{1,3} matches at least one and up to three lower case letters a: a, aa, aaa
+ Match at least one, or more, of the pattern defined before it.
  - E.g. a+ will match consecutive a's: a, aa, aaa, and so on
? Match zero or one of the pattern defined before it.
  - E.g. Pattern may or may not be present but can only be matched one time.
  - E.g. [a-z]? matches empty string or any single lower case letter.
* Match zero or more of the pattern defined before it.
  - E.g. Wildcard for pattern that may or may not be present.
  - E.g. [a-z]* matches empty string or string of lower case letters.
. Matches any character except newline \n
  - E.g. a. matches a two character string starting with a and ending with anything except \n
| OR operator
  - E.g. a|b means either a or b can be matched.
  - E.g. red|white|orange matches exactly one of the colors.
^ NOT operator (when it is the first character inside square brackets)
  - E.g. [^0-9] character can not be a digit
  - E.g. [^aA] character can not be lower case a or upper case A
\ Escapes the special character that follows (overrides above behavior)
  - E.g. \., \\, \(, \?, \$, \^
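A few of these definitions can be tried out with a small helper that returns the first match of a pattern. This is a sketch under assumptions: the names FirstMatch and TryBasicPatterns are invented for illustration, and the reference from Step 1 is assumed to be set.

```vba
' Hypothetical helper: returns the first match of strPattern in strInput,
' or an empty string when nothing matches.
Function FirstMatch(ByVal strPattern As String, ByVal strInput As String) As String
    Dim regEx As New RegExp
    Dim matches As MatchCollection

    regEx.Pattern = strPattern
    Set matches = regEx.Execute(strInput)
    If matches.Count > 0 Then FirstMatch = matches.Item(0).Value
End Function

Sub TryBasicPatterns()
    Debug.Print FirstMatch("[abc]", "xbz")                   ' b
    Debug.Print FirstMatch("a{1,3}", "baaaad")               ' aaa (at most three a's)
    Debug.Print FirstMatch("[a-z]+", "123abc456")            ' abc
    Debug.Print FirstMatch("a.", "cat")                      ' at
    Debug.Print FirstMatch("red|white|orange", "a white car") ' white
    Debug.Print FirstMatch("[^0-9]", "12abc")                ' a (first non-digit)
    Debug.Print FirstMatch("\?", "ok?")                      ' ? (escaped, so literal)
End Sub
```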
Anchoring Patterns:
^ Match must occur at start of string
  - E.g. ^a First character must be lower case letter a
  - E.g. ^[0-9] First character must be a number.
$ Match must occur at end of string
  - E.g. a$ Last character must be lower case letter a
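The anchors can be checked with Test, as in the sketch below (the procedure name TryAnchors is made up). It also shows one thing worth knowing before reading the examples that follow: when the MultiLine property is True, ^ and $ match at the start and end of each line rather than only at the start and end of the whole string.

```vba
Sub TryAnchors()
    Dim regEx As New RegExp
    Dim twoLines As String

    twoLines = "abc" & vbLf & "123"   ' two lines separated by a line feed

    regEx.Pattern = "^[0-9]"
    regEx.MultiLine = False
    Debug.Print regEx.Test("12abc")   ' True: starts with a digit
    Debug.Print regEx.Test("abc12")   ' False: digits are not at the start
    Debug.Print regEx.Test(twoLines)  ' False: "1" is not at the start of the string

    regEx.MultiLine = True
    Debug.Print regEx.Test(twoLines)  ' True: "1" starts the second line

    regEx.Pattern = "a$"
    regEx.MultiLine = False
    Debug.Print regEx.Test("bca")     ' True: ends with a
    Debug.Print regEx.Test("abc")     ' False: a is not the last character
End Sub
```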
Precedence table:
Order  Name                 Representation
1      Parentheses          ( )
2      Multipliers          ? + * {m,n} {m,n}?
3      Sequence & Anchors   abc ^ $
4      Alternation          |
Predefined Character Abbreviations:
abr    same as           meaning
\d     [0-9]             Any single digit
\D     [^0-9]            Any single character that's not a digit
\w     [a-zA-Z0-9_]      Any word character
\W     [^a-zA-Z0-9_]     Any non-word character
\s     [ \r\t\n\f]       Any space character
\S     [^ \r\t\n\f]      Any non-space character
\n     [\n]              New line
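The abbreviations can stand in for the longer bracket expressions. A short sketch (the procedure name TryShorthandClasses is invented for illustration) using Replace with Global set to True so that every match is replaced:

```vba
Sub TryShorthandClasses()
    Dim regEx As New RegExp

    regEx.Global = True                          ' replace every match, not just the first

    regEx.Pattern = "^\d{1,2}"                   ' same as ^[0-9]{1,2}
    Debug.Print regEx.Replace("12abc", "")       ' abc

    regEx.Pattern = "\s+"                        ' one or more space characters
    Debug.Print regEx.Replace("a   b  c", " ")   ' a b c

    regEx.Pattern = "\W"                         ' anything that is not a word character
    Debug.Print regEx.Replace("a-b_c!", "")      ' ab_c (the underscore counts as a word character)
End Sub
```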
Example 1: Run as macro
The following example macro looks at the value in cell A1 to see if the first 1 or 2 characters are digits. If so, they are removed and the rest of the string is displayed. If not, then a box appears telling you that no match is found. Cell A1 value of 12abc will return abc, value of 1abc will return abc, and value of abc123 will return "Not matched" because the digits were not at the start of the string.

    Private Sub simpleRegex()
        Dim strPattern As String: strPattern = "^[0-9]{1,2}"   ' 1 or 2 leading digits
        Dim strReplace As String: strReplace = ""              ' replace them with nothing
        Dim regEx As New RegExp
        Dim strInput As String
        Dim Myrange As Range

        Set Myrange = ActiveSheet.Range("A1")

        If strPattern <> "" Then
            strInput = Myrange.Value

            With regEx
                .Global = True
                .MultiLine = True
                .IgnoreCase = False
                .Pattern = strPattern
            End With

            ' Show the cleaned string, or a notice when the pattern is absent
            If regEx.Test(strInput) Then
                MsgBox (regEx.Replace(strInput, strReplace))
            Else
                MsgBox ("Not matched")
            End If
        End If
    End Sub
Example 2: Run as an in-cell function
This example is the same as example 1 but is set up to run as an in-cell function (note that the pattern here allows up to three leading digits). To use, change the code to this:

    Function simpleCellRegex(Myrange As Range) As String
        Dim regEx As New RegExp
        Dim strPattern As String
        Dim strInput As String
        Dim strReplace As String

        strPattern = "^[0-9]{1,3}"   ' 1 to 3 leading digits

        If strPattern <> "" Then
            strInput = Myrange.Value
            strReplace = ""

            With regEx
                .Global = True
                .MultiLine = True
                .IgnoreCase = False
                .Pattern = strPattern
            End With

            ' The function's return value is what appears in the cell
            If regEx.Test(strInput) Then
                simpleCellRegex = regEx.Replace(strInput, strReplace)
            Else
                simpleCellRegex = "Not matched"
            End If
        End If
    End Function
Place your strings ("12abc") in cell A1. Enter this formula =simpleCellRegex(A1) in cell B1 and the result will be "abc".
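If the pattern should vary from cell to cell, the same idea can be written with the pattern passed in as an argument. This is a sketch, not part of the original answer: the function name cellRegexReplace and its parameters are assumptions chosen for illustration.

```vba
' Hypothetical variant of simpleCellRegex that takes the pattern and the
' replacement text as arguments instead of hard-coding them.
Function cellRegexReplace(ByVal strInput As String, _
                          ByVal strPattern As String, _
                          Optional ByVal strReplace As String = "") As String
    Dim regEx As New RegExp

    With regEx
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
        .Pattern = strPattern
    End With

    If regEx.Test(strInput) Then
        cellRegexReplace = regEx.Replace(strInput, strReplace)
    Else
        cellRegexReplace = "Not matched"
    End If
End Function
```

With "12abc" in A1, the formula =cellRegexReplace(A1, "^[0-9]{1,3}") should give the same "abc" as the hard-coded version.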
Example 3: Loop Through Range
This example is the same as example 1 but loops through a range of cells.
    Private Sub simpleRegex()
        Dim strPattern As String: strPattern = "^[0-9]{1,2}"
        Dim strReplace As String: strReplace = ""
        Dim regEx As New RegExp
        Dim strInput As String
        Dim Myrange As Range
        Dim cell As Range   ' loop variable, declared so the code also works with Option Explicit

        Set Myrange = ActiveSheet.Range("A1:A5")

        For Each cell In Myrange
            If strPattern <> "" Then
                strInput = cell.Value

                With regEx
                    .Global = True
                    .MultiLine = True
                    .IgnoreCase = False
                    .Pattern = strPattern
                End With

                ' One message box per cell in the range
                If regEx.Test(strInput) Then
                    MsgBox (regEx.Replace(strInput, strReplace))
                Else
                    MsgBox ("Not matched")
                End If
            End If
        Next
    End Sub
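The question also asks for a Sub that writes matches to adjacent cells rather than showing message boxes. A minimal sketch of that variation follows; the procedure name regexToAdjacentCells is made up, and the cells in A1:A5 are assumed to hold text or numbers (an error value in a cell would stop the macro).

```vba
Sub regexToAdjacentCells()
    Dim regEx As New RegExp
    Dim cell As Range
    Dim strInput As String

    ' Configure the pattern once, outside the loop
    With regEx
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
        .Pattern = "^[0-9]{1,2}"
    End With

    For Each cell In ActiveSheet.Range("A1:A5")
        strInput = cell.Value

        ' Write the result one column to the right of the source cell
        If regEx.Test(strInput) Then
            cell.Offset(0, 1).Value = regEx.Replace(strInput, "")
        Else
            cell.Offset(0, 1).Value = "Not matched"
        End If
    Next cell
End Sub
```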
Example 4: Splitting apart different patterns
This example loops through a range (A1, A2 & A3) and looks for a string starting with three digits followed by a single alpha character and then 4 numeric digits. The output splits apart the pattern matches into adjacent cells by using the (). $1 represents the first pattern matched within the first set of ().

    Private Sub splitUpRegexPattern()
        Dim regEx As New RegExp
        Dim strPattern As String
        Dim strInput As String
        Dim strReplace As String
        Dim Myrange As Range
        Dim C As Range   ' loop variable, declared so the code also works with Option Explicit

        Set Myrange = ActiveSheet.Range("A1:A3")

        For Each C In Myrange
            strPattern = "(^[0-9]{3})([a-zA-Z])([0-9]{4})"

            If strPattern <> "" Then
                strInput = C.Value
                strReplace = "$1"

                With regEx
                    .Global = True
                    .MultiLine = True
                    .IgnoreCase = False
                    .Pattern = strPattern
                End With

                ' $1, $2 and $3 refer to the first, second and third group
                If regEx.Test(strInput) Then
                    C.Offset(0, 1) = regEx.Replace(strInput, "$1")
                    C.Offset(0, 2) = regEx.Replace(strInput, "$2")
                    C.Offset(0, 3) = regEx.Replace(strInput, "$3")
                Else
                    C.Offset(0, 1) = "(Not matched)"
                End If
            End If
        Next
    End Sub
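An alternative way to read the groups is the SubMatches collection of a Match object, sketched below. This swaps in a different technique from the Replace call used above, and the procedure name splitUpWithSubMatches is invented for illustration. One practical difference: Replace returns the whole input with only the matched part substituted, so any text after the match would be carried along, whereas SubMatches returns just the captured group.

```vba
Sub splitUpWithSubMatches()
    Dim regEx As New RegExp
    Dim matches As MatchCollection
    Dim C As Range
    Dim i As Long

    regEx.Pattern = "(^[0-9]{3})([a-zA-Z])([0-9]{4})"

    For Each C In ActiveSheet.Range("A1:A3")
        Set matches = regEx.Execute(C.Value)

        If matches.Count > 0 Then
            ' SubMatches is zero-based: index 0 is the first group
            For i = 0 To matches.Item(0).SubMatches.Count - 1
                C.Offset(0, i + 1).Value = matches.Item(0).SubMatches(i)
            Next i
        Else
            C.Offset(0, 1).Value = "(Not matched)"
        End If
    Next C
End Sub
```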
Results: [screenshot]
Additional Pattern Examples
String   Regex Pattern                 Explanation
a1aaa    [a-zA-Z][0-9][a-zA-Z]{3}      Single alpha, single digit, three alpha characters
a1aaa    [a-zA-Z]?[0-9][a-zA-Z]{3}     May or may not have preceding alpha character
a1aaa    [a-zA-Z][0-9][a-zA-Z]{0,3}    Single alpha, single digit, 0 to 3 alpha characters
a1aaa    [a-zA-Z][0-9][a-zA-Z]*        Single alpha, single digit, followed by any number of alpha characters
</i8>    \<\/[a-zA-Z][0-9]\>           Exact non-word character except any single alpha followed by any single digit