How to use Regular Expressions (Regex) in Microsoft Excel both in-cell and loops
Question
How can I use regular expressions in Excel and take advantage of Excel's powerful grid-like setup for data manipulation?
- In-cell function to return a matched pattern or replaced value in a string.
- Sub to loop through a column of data and extract matches to adjacent cells.
- What setup is necessary?
- What are Excel's special characters for regular expressions?
I understand Regex is not ideal for many situations (To use or not to use regular expressions?) since Excel can use Left, Mid, Right, Instr type commands for similar manipulations.
Recommended answer
Regular expressions are used for pattern matching.
To use them in Excel, follow these steps:
Step 1: Add VBA reference to "Microsoft VBScript Regular Expressions 5.5"
- Select the "Developer" tab (I don't have this tab, what do I do?)
- Select the "Visual Basic" icon from the "Code" ribbon section
- In the "Microsoft Visual Basic for Applications" window, select "Tools" from the top menu.
- Select "References"
- Check the box next to "Microsoft VBScript Regular Expressions 5.5" to include it in your workbook.
- Click "OK"
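If adding the reference is not practical (for example, when the workbook is shared with people who will not set it up), the same engine can usually be created at run time with late binding. The sketch below is an alternative to Step 1, not part of the original answer; the function name StripLeadingDigits is hypothetical and chosen only for illustration.

```vba
' Late-binding sketch: no "Microsoft VBScript Regular Expressions 5.5"
' reference is needed because the object is created at run time.
' StripLeadingDigits is a hypothetical name used for illustration.
Function StripLeadingDigits(ByVal inputText As String) As String
    Dim regEx As Object
    Set regEx = CreateObject("VBScript.RegExp")
    regEx.Pattern = "^[0-9]{1,2}"          ' 1 or 2 digits at the start
    StripLeadingDigits = regEx.Replace(inputText, "")
End Function
```

The trade-off is that late binding gives up IntelliSense and compile-time checking for the RegExp members, which is why the examples in this answer declare the object with Dim regEx As New RegExp instead.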
Step 2: Define your pattern
Basic definitions:
-
Range.
- E.g. a-z matches any lower case letter from a to z
- E.g. 0-5 matches any digit from 0 to 5
[]
Match exactly one of the objects inside these brackets.
- E.g. [a] matches the letter a
- E.g. [abc] matches a single letter which can be a, b or c
- E.g. [a-z] matches any single lower case letter of the alphabet.
()
Groups different matches for return purposes. See examples below.
{}
Multiplier for repeated copies of the pattern defined before it.
- E.g. [a]{2} matches two consecutive lower case letters a: aa
- E.g. [a]{1,3} matches at least one and up to three lower case letters a: a, aa, aaa
+
Match at least one, or more, of the pattern defined before it.
- E.g. a+ will match consecutive a's: a, aa, aaa, and so on
?
Match zero or one of the pattern defined before it.
- E.g. Pattern may or may not be present but can only be matched one time.
- E.g. [a-z]? matches an empty string or any single lower case letter.
*
Match zero or more of the pattern defined before it.
- E.g. Wildcard for a pattern that may or may not be present.
- E.g. [a-z]* matches an empty string or a string of lower case letters.
.
Matches any character except newline \n
- E.g. a. matches a two character string starting with a and ending with anything except \n
|
OR operator
- E.g. a|b means either a or b can be matched.
- E.g. red|white|orange matches exactly one of the colors.
^
NOT operator (when it is the first character inside [])
- E.g. [^0-9] character can not be a digit
- E.g. [^aA] character can not be lower case a or upper case A
\
Escapes the special character that follows (overrides the above behavior)
- E.g. \., \\, \(, \?, \$, \^
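The definitions above can be tried out from the Immediate window. The following is a minimal sketch, assuming a hypothetical helper named IsMatch and late binding through CreateObject so that it runs even without the Step 1 reference; neither is part of the original answer.

```vba
' IsMatch is a hypothetical helper used only for illustration.
' It returns True when rxPattern is found anywhere in inputText.
Function IsMatch(ByVal inputText As String, ByVal rxPattern As String) As Boolean
    Dim regEx As Object
    Set regEx = CreateObject("VBScript.RegExp")
    regEx.Pattern = rxPattern
    IsMatch = regEx.Test(inputText)
End Function

Sub BasicPatternDemo()
    Debug.Print IsMatch("aa", "[a]{2}")               ' True
    Debug.Print IsMatch("b", "a+")                    ' False
    Debug.Print IsMatch("white", "red|white|orange")  ' True
    Debug.Print IsMatch("5", "[^0-9]")                ' False: the only character is a digit
    Debug.Print IsMatch("a.b", "a\.b")                ' True: the escaped dot is a literal dot
    Debug.Print IsMatch("axb", "a\.b")                ' False
End Sub
```

Note that Test looks for the pattern anywhere in the string, so a pattern only constrains the whole string when it is combined with the anchors described next.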
Anchoring patterns:
^
Match must occur at start of string
- E.g. ^a First character must be lower case letter a
- E.g. ^[0-9] First character must be a digit.
$
Match must occur at end of string
- E.g. a$ Last character must be lower case letter a
Precedence table:
Order  Name                Representation
1      Parentheses         ( )
2      Multipliers         ? + * {m,n} {m, n}?
3      Sequence & Anchors  abc ^ $
4      Alternation         |
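The precedence order matters most when anchors are combined with alternation. Because | ranks lowest, ^a|b$ is read as "starts with a" or "ends with b", not as "the whole string is a or b". A minimal sketch of the difference, assuming a hypothetical helper named PatternFound and late binding through CreateObject:

```vba
' PatternFound is a hypothetical helper, not part of the original answer.
' It returns True when rxPattern is found anywhere in inputText.
Function PatternFound(ByVal inputText As String, ByVal rxPattern As String) As Boolean
    Dim regEx As Object
    Set regEx = CreateObject("VBScript.RegExp")
    regEx.Pattern = rxPattern
    PatternFound = regEx.Test(inputText)
End Function

Sub PrecedenceDemo()
    ' Alternation binds last: (^a) OR (b$)
    Debug.Print PatternFound("apple", "^a|b$")   ' True: the string starts with a
    ' Parentheses bind first: the whole string must be a or b
    Debug.Print PatternFound("ab", "^(a|b)$")    ' False
    Debug.Print PatternFound("b", "^(a|b)$")     ' True
End Sub
```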
Predefined character abbreviations:
abr  same as          meaning
\d   [0-9]            Any single digit
\D   [^0-9]           Any single character that's not a digit
\w   [a-zA-Z0-9_]     Any word character
\W   [^a-zA-Z0-9_]    Any non-word character
\s   [ \r\t\n\f]      Any space character
\S   [^ \r\t\n\f]     Any non-space character
\n   [\n]             New line
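As an illustration of the abbreviations, the sketch below counts the digits in a string using \d. The function name CountDigits is hypothetical, and late binding through CreateObject is an assumption made so the block is self-contained.

```vba
' CountDigits is a hypothetical name used for illustration.
Function CountDigits(ByVal inputText As String) As Long
    Dim regEx As Object
    Set regEx = CreateObject("VBScript.RegExp")
    regEx.Global = True      ' find every match, not just the first one
    regEx.Pattern = "\d"     ' same as [0-9]
    CountDigits = regEx.Execute(inputText).Count
End Function
```

Placed in a standard module, it could be called from a cell as =CountDigits(A1).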
Example 1: Run as macro
The following example macro looks at the value in cell A1 to see if the first 1 or 2 characters are digits. If so, they are removed and the rest of the string is displayed. If not, a box appears telling you that no match was found. A cell A1 value of 12abc will return abc, a value of 1abc will return abc, and a value of abc123 will return "Not matched" because the digits were not at the start of the string.
Private Sub simpleRegex()
    Dim strPattern As String: strPattern = "^[0-9]{1,2}"   ' 1 or 2 digits at the start
    Dim strReplace As String: strReplace = ""
    Dim regEx As New RegExp                                ' needs the reference from Step 1
    Dim strInput As String
    Dim Myrange As Range

    Set Myrange = ActiveSheet.Range("A1")

    If strPattern <> "" Then
        strInput = Myrange.Value

        With regEx
            .Global = True
            .MultiLine = True
            .IgnoreCase = False
            .Pattern = strPattern
        End With

        ' Show the string without its leading digits, or report no match
        If regEx.Test(strInput) Then
            MsgBox (regEx.Replace(strInput, strReplace))
        Else
            MsgBox ("Not matched")
        End If
    End If
End Sub
Example 2: Run as an in-cell function
This example is the same as example 1 but is set up to run as an in-cell function. To use, change the code to this:
Function simpleCellRegex(Myrange As Range) As String
    Dim regEx As New RegExp
    Dim strPattern As String
    Dim strInput As String
    Dim strReplace As String
    Dim strOutput As String

    ' Note: this pattern allows up to three leading digits
    strPattern = "^[0-9]{1,3}"

    If strPattern <> "" Then
        strInput = Myrange.Value
        strReplace = ""

        With regEx
            .Global = True
            .MultiLine = True
            .IgnoreCase = False
            .Pattern = strPattern
        End With

        ' The function's return value becomes the cell's value
        If regEx.test(strInput) Then
            simpleCellRegex = regEx.Replace(strInput, strReplace)
        Else
            simpleCellRegex = "Not matched"
        End If
    End If
End Function
Place your string ("12abc") in cell A1. Enter the formula =simpleCellRegex(A1) in cell B1 and the result will be "abc".
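The in-cell function above has its pattern hard-coded. One possible generalization, sketched here under assumptions, takes the pattern and the replacement as arguments. The name RxReplace and its parameters are hypothetical, and late binding through CreateObject is used so the block does not depend on the Step 1 reference.

```vba
' RxReplace is a hypothetical name, not part of the original answer.
' It replaces every match of rxPattern in inputText with replacement.
Function RxReplace(ByVal inputText As String, ByVal rxPattern As String, _
                   ByVal replacement As String) As String
    Dim regEx As Object
    Set regEx = CreateObject("VBScript.RegExp")
    regEx.Global = True
    regEx.Pattern = rxPattern
    RxReplace = regEx.Replace(inputText, replacement)
End Function
```

With "12abc" in cell A1, the formula =RxReplace(A1, "^[0-9]{1,3}", "") should give "abc". Unlike simpleCellRegex, this sketch returns the input unchanged when nothing matches instead of "Not matched".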
Example 3: Loop through range
This example is the same as example 1 but loops through a range of cells.
' Same name as the Sub in Example 1: keep only one of them per module
Private Sub simpleRegex()
    Dim strPattern As String: strPattern = "^[0-9]{1,2}"
    Dim strReplace As String: strReplace = ""
    Dim regEx As New RegExp
    Dim strInput As String
    Dim Myrange As Range
    Dim cell As Range          ' loop variable, declared so Option Explicit compiles

    Set Myrange = ActiveSheet.Range("A1:A5")

    For Each cell In Myrange
        If strPattern <> "" Then
            strInput = cell.Value

            With regEx
                .Global = True
                .MultiLine = True
                .IgnoreCase = False
                .Pattern = strPattern
            End With

            ' One message box per cell in the range
            If regEx.Test(strInput) Then
                MsgBox (regEx.Replace(strInput, strReplace))
            Else
                MsgBox ("Not matched")
            End If
        End If
    Next
End Sub
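The question also asks for results to be written to adjacent cells rather than shown in message boxes. A minimal sketch of that variation follows, assuming the same range A1:A5 and pattern as above. The Sub name StripDigitsToNextColumn is hypothetical, and late binding through CreateObject is used so the block stands alone.

```vba
' StripDigitsToNextColumn is a hypothetical name used for illustration.
' For each cell in A1:A5 it writes the result into the cell to its right.
Sub StripDigitsToNextColumn()
    Dim regEx As Object
    Dim cell As Range

    Set regEx = CreateObject("VBScript.RegExp")
    regEx.Pattern = "^[0-9]{1,2}"      ' set once, before the loop

    For Each cell In ActiveSheet.Range("A1:A5")
        If regEx.Test(CStr(cell.Value)) Then
            cell.Offset(0, 1).Value = regEx.Replace(CStr(cell.Value), "")
        Else
            cell.Offset(0, 1).Value = "Not matched"
        End If
    Next cell
End Sub
```

Setting the pattern once outside the loop avoids reconfiguring the object for every cell; the output itself is the same as reconfiguring it each time.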
Example 4: Splitting apart different patterns
This example loops through a range (A1, A2 & A3) and looks for a string starting with three digits, followed by a single alpha character and then 4 numeric digits. The output splits the pattern matches into adjacent cells by using the (). $1 represents the first pattern matched within the first set of ().
Private Sub splitUpRegexPattern()
    Dim regEx As New RegExp
    Dim strPattern As String
    Dim strInput As String
    Dim Myrange As Range
    Dim C As Range             ' loop variable, declared so Option Explicit compiles

    Set Myrange = ActiveSheet.Range("A1:A3")

    For Each C In Myrange
        ' Three groups: 3 digits, 1 letter, 4 digits
        strPattern = "(^[0-9]{3})([a-zA-Z])([0-9]{4})"

        If strPattern <> "" Then
            strInput = C.Value

            With regEx
                .Global = True
                .MultiLine = True
                .IgnoreCase = False
                .Pattern = strPattern
            End With

            ' $1, $2 and $3 refer to the first, second and third group
            If regEx.test(strInput) Then
                C.Offset(0, 1) = regEx.Replace(strInput, "$1")
                C.Offset(0, 2) = regEx.Replace(strInput, "$2")
                C.Offset(0, 3) = regEx.Replace(strInput, "$3")
            Else
                C.Offset(0, 1) = "(Not matched)"
            End If
        End If
    Next
End Sub
Results: [image of the worksheet after running the macro]
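Replace with "$1" rewrites only the matched part of the string, so any text outside the match (for example, characters after the four trailing digits) stays in the output. If only the captured text is wanted, the groups can be read from the SubMatches collection of a match instead. The following is a sketch of that alternative, not the original answer's method; the function name GetGroup is hypothetical, and late binding through CreateObject is assumed.

```vba
' GetGroup is a hypothetical helper used only for illustration.
' It returns the text captured by group number groupIndex (1-based) in the
' first match, or an empty string if there is no match or no such group.
Function GetGroup(ByVal inputText As String, ByVal rxPattern As String, _
                  ByVal groupIndex As Long) As String
    Dim regEx As Object
    Dim matches As Object

    Set regEx = CreateObject("VBScript.RegExp")
    regEx.Pattern = rxPattern
    Set matches = regEx.Execute(inputText)

    If matches.Count > 0 Then
        If groupIndex >= 1 And groupIndex <= matches(0).SubMatches.Count Then
            ' SubMatches is zero-based, so group 1 is item 0
            GetGroup = matches(0).SubMatches(groupIndex - 1)
        End If
    End If
End Function
```

For the input 123A4567 and the pattern from Example 4, group 1 is 123, group 2 is A, and group 3 is 4567.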
Additional pattern examples
String   Regex Pattern                Explanation
a1aaa    [a-zA-Z][0-9][a-zA-Z]{3}     Single alpha, single digit, three alpha characters
a1aaa    [a-zA-Z]?[0-9][a-zA-Z]{3}    May or may not have preceding alpha character
a1aaa    [a-zA-Z][0-9][a-zA-Z]{0,3}   Single alpha, single digit, 0 to 3 alpha characters
a1aaa    [a-zA-Z][0-9][a-zA-Z]*       Single alpha, single digit, followed by any number of alpha characters
</i8>    \<\/[a-zA-Z][0-9]\>          Exact non-word character except any single alpha followed by any single digit