How to use Regular Expressions (Regex) in Microsoft Excel both in-cell and loops


Question


    How can I use regular expressions in Excel and take advantage of Excel's powerful grid-like setup for data manipulation?

    • In-cell function to return matched pattern or replaced value in string.
    • Sub to loop through a column of data and extract matches to adjacent cells.
    • What setup is necessary?
    • What are Excel's special characters for Regex expressions?

    I understand Regex is not ideal for many situations (To use or not to use regular expressions?) since Excel can use Left, Mid, Right, Instr type commands for similar manipulations.

    Solution

    Regular expressions are used for Pattern Matching.

    To use them in Excel, follow these steps:

    Step 1: Add VBA reference to "Microsoft VBScript Regular Expressions 5.5"

    • Select "Developer" tab (I don't have this tab what do I do?)
    • Select "Visual Basic" icon from 'Code' ribbon section
    • In "Microsoft Visual Basic for Applications" window select "Tools" from the top menu.
    • Select "References"
    • Check the box next to "Microsoft VBScript Regular Expressions 5.5" to include in your workbook.
    • Click "OK"
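If adding the reference is not an option (for example, when the workbook is shared with people who will not set it up), a late-bound object is a common alternative. The sketch below assumes the standard "VBScript.RegExp" ProgID is available on the machine; NewRegex and lateBindingDemo are hypothetical names used only for illustration.

```vba
' Late-bound alternative: no reference to "Microsoft VBScript Regular
' Expressions 5.5" is needed, because the object is created by ProgID at run time.
Function NewRegex(ByVal strPattern As String) As Object   ' hypothetical helper name
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")

    re.Global = True
    re.MultiLine = True
    re.IgnoreCase = False
    re.Pattern = strPattern

    Set NewRegex = re
End Function

Sub lateBindingDemo()
    ' Prints True in the Immediate window: "12abc" starts with one or two digits
    Debug.Print NewRegex("^[0-9]{1,2}").Test("12abc")
End Sub
```

The trade-off is that late binding gives up IntelliSense and compile-time checking, so the examples in this answer keep the early-bound `New RegExp` form.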

    Step 2: Define your pattern

    Basic definitions:

    - Range (used inside square brackets).

    • E.g. a-z matches any lower case letter from a to z
    • E.g. 0-5 matches any digit from 0 to 5

    [] Match exactly one of the objects inside these brackets.

    • E.g. [a] matches the letter a
    • E.g. [abc] matches a single letter which can be a, b or c
    • E.g. [a-z] matches any single lower case letter of the alphabet.

    () Groups different matches for return purposes. See examples below.

    {} Multiplier for repeated copies of pattern defined before it.

    • E.g. [a]{2} matches two consecutive lower case letter a: aa
    • E.g. [a]{1,3} matches at least one and up to three consecutive lower case letter a: a, aa, aaa

    + Match at least one, or more, of the pattern defined before it.

    • E.g. a+ will match consecutive a's a, aa, aaa, and so on

    ? Match zero or one of the pattern defined before it.

    • E.g. Pattern may or may not be present but can only be matched one time.
    • E.g. [a-z]? matches empty string or any single lower case letter.

    * Match zero or more of the pattern defined before it.

    • E.g. Wildcard for pattern that may or may not be present.
    • E.g. [a-z]* matches empty string or string of lower case letters.

    . Matches any character except newline \n

    • E.g. a. Matches a two character string starting with a and ending with anything except \n

    | OR operator

    • E.g. a|b means either a or b can be matched.
    • E.g. red|white|orange matches exactly one of the colors.

    ^ NOT operator (when it is the first character inside [])

    • E.g. [^0-9] matches any single character that is not a digit
    • E.g. [^aA] matches any single character that is not lower case a or upper case A

    \ Escapes special character that follows (overrides above behavior)

    • E.g. \., \\, \(, \?, \$, \^

    Anchoring Patterns:

    ^ Match must occur at start of string (or at the start of each line when MultiLine is True)

    • E.g. ^a First character must be lower case letter a
    • E.g. ^[0-9] First character must be a number.

    $ Match must occur at end of string (or at the end of each line when MultiLine is True)

    • E.g. a$ Last character must be lower case letter a
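A quick way to check the definitions above is to test a few patterns in the Immediate window. This is a minimal sketch that assumes the reference from Step 1 has been added; patternMatches and basicPatternDemo are hypothetical names, not part of any library.

```vba
' Returns True if strPattern matches anywhere in strInput (case-sensitive).
Function patternMatches(ByVal strInput As String, ByVal strPattern As String) As Boolean
    Dim regEx As New RegExp

    regEx.IgnoreCase = False
    regEx.Pattern = strPattern
    patternMatches = regEx.Test(strInput)
End Function

Sub basicPatternDemo()
    Debug.Print patternMatches("b", "[abc]")        ' True: b is in the set
    Debug.Print patternMatches("d", "[abc]")        ' False: d is not in the set
    Debug.Print patternMatches("aa", "^[a]{2}$")    ' True: exactly two a's
    Debug.Print patternMatches("7", "[^0-9]")       ' False: the only character is a digit
    Debug.Print patternMatches("a.b", "a\.b")       ' True: \. is a literal dot
    Debug.Print patternMatches("axb", "a\.b")       ' False: x is not a literal dot
    Debug.Print patternMatches("banana", "^a")      ' False: the string starts with b
    Debug.Print patternMatches("banana", "a$")      ' True: the string ends with a
End Sub
```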

    Precedence table:

    Order  Name                Representation
    1      Parentheses         ( )
    2      Multipliers         ? + * {m,n} {m, n}?
    3      Sequence & Anchors  abc ^ $
    4      Alternation         |
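The practical consequence of the table is that alternation binds last, so anchors attach to the nearest alternative unless parentheses group them. A small sketch, assuming the reference from Step 1 (precedenceDemo is a hypothetical name):

```vba
Sub precedenceDemo()
    Dim regEx As New RegExp

    ' Alternation has the lowest precedence: this reads as (^a) OR (b$)
    regEx.Pattern = "^a|b$"
    Debug.Print regEx.Test("ac")    ' True: the string starts with a

    ' Parentheses bind first: the whole string must be a single a or b
    regEx.Pattern = "^(a|b)$"
    Debug.Print regEx.Test("ac")    ' False
    Debug.Print regEx.Test("b")     ' True
End Sub
```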
    


    Predefined Character Abbreviations:

    abr    same as       meaning
    \d     [0-9]         Any single digit
    \D     [^0-9]        Any single character that's not a digit
    \w     [a-zA-Z0-9_]  Any word character
    \W     [^a-zA-Z0-9_] Any non-word character
    \s     [ \r\t\n\f]   Any space character
    \S     [^ \r\t\n\f]  Any non-space character
    \n     [\n]          New line
    


    Example 1: Run as macro

    The following example macro looks at the value in cell A1 to see if the first 1 or 2 characters are digits. If so, they are removed and the rest of the string is displayed. If not, a box appears telling you that no match was found. A cell A1 value of 12abc will return abc, a value of 1abc will return abc, and a value of abc123 will return "Not matched" because the digits are not at the start of the string.

    Private Sub simpleRegex()
        Dim strPattern As String: strPattern = "^[0-9]{1,2}"
        Dim strReplace As String: strReplace = ""
        Dim regEx As New RegExp
        Dim strInput As String
        Dim Myrange As Range
    
        Set Myrange = ActiveSheet.Range("A1")
    
        If strPattern <> "" Then
            strInput = Myrange.Value
    
            With regEx
                .Global = True
                .MultiLine = True
                .IgnoreCase = False
                .Pattern = strPattern
            End With
    
            If regEx.Test(strInput) Then
                MsgBox (regEx.Replace(strInput, strReplace))
            Else
                MsgBox ("Not matched")
            End If
        End If
    End Sub
    


    Example 2: Run as an in-cell function

    This example is the same as example 1 but is set up to run as an in-cell function. To use it, change the code to this:

    Function simpleCellRegex(Myrange As Range) As String
        Dim regEx As New RegExp
        Dim strPattern As String
        Dim strInput As String
        Dim strReplace As String
    
        strPattern = "^[0-9]{1,3}"
    
        If strPattern <> "" Then
            strInput = Myrange.Value
            strReplace = ""
    
            With regEx
                .Global = True
                .MultiLine = True
                .IgnoreCase = False
                .Pattern = strPattern
            End With
    
            If regEx.test(strInput) Then
                simpleCellRegex = regEx.Replace(strInput, strReplace)
            Else
                simpleCellRegex = "Not matched"
            End If
        End If
    End Function
    

    Place your string (e.g. "12abc") in cell A1. Enter the formula =simpleCellRegex(A1) in cell B1 and the result will be "abc".
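Because the pattern is hard-coded, simpleCellRegex needs editing for every new pattern. One possible generalization, offered as a sketch rather than as part of the original answer, passes the pattern and replacement as arguments. regexReplace is a hypothetical name, and the code assumes the reference from Step 1.

```vba
' Hypothetical general-purpose version: pattern and replacement are arguments.
Function regexReplace(ByVal strInput As String, ByVal strPattern As String, _
                      Optional ByVal strReplace As String = "") As String
    Dim regEx As New RegExp

    With regEx
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
        .Pattern = strPattern
    End With

    ' Replace returns the input unchanged when nothing matches
    regexReplace = regEx.Replace(strInput, strReplace)
End Function
```

With "12abc" in A1, the formula =regexReplace(A1, "^[0-9]{1,3}") gives "abc". Unlike simpleCellRegex, this version returns the original text rather than "Not matched" when the pattern is not found, which is a design choice you may want to reverse.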


    Example 3: Loop Through Range

    This example is the same as example 1 but loops through a range of cells.

    Private Sub simpleRegex()
        Dim strPattern As String: strPattern = "^[0-9]{1,2}"
        Dim strReplace As String: strReplace = ""
        Dim regEx As New RegExp
        Dim strInput As String
        Dim Myrange As Range
        Dim cell As Range   ' loop variable; required when Option Explicit is on
    
        Set Myrange = ActiveSheet.Range("A1:A5")
    
        For Each cell In Myrange
            If strPattern <> "" Then
                strInput = cell.Value
    
                With regEx
                    .Global = True
                    .MultiLine = True
                    .IgnoreCase = False
                    .Pattern = strPattern
                End With
    
                If regEx.Test(strInput) Then
                    MsgBox (regEx.Replace(strInput, strReplace))
                Else
                    MsgBox ("Not matched")
                End If
            End If
        Next
    End Sub
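To write the results next to the data instead of showing a message box per cell, the same loop can target the adjacent column. This is a sketch under the same assumptions as Example 3 (reference from Step 1, data in A1:A5); regexToAdjacentColumn is a hypothetical name.

```vba
Private Sub regexToAdjacentColumn()   ' hypothetical name
    Dim regEx As New RegExp
    Dim strInput As String
    Dim cell As Range

    ' Configure the object once, outside the loop
    With regEx
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
        .Pattern = "^[0-9]{1,2}"
    End With

    For Each cell In ActiveSheet.Range("A1:A5")
        strInput = cell.Value

        If regEx.Test(strInput) Then
            ' Column B receives the text with the leading digits removed
            cell.Offset(0, 1).Value = regEx.Replace(strInput, "")
        Else
            cell.Offset(0, 1).Value = "Not matched"
        End If
    Next cell
End Sub
```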
    


    Example 4: Splitting apart different patterns

    This example loops through a range (A1, A2 & A3) and looks for a string starting with three digits followed by a single alpha character and then 4 numeric digits. The output splits apart the pattern matches into adjacent cells by using the (). $1 represents the first pattern matched within the first set of ().

    Private Sub splitUpRegexPattern()
        Dim regEx As New RegExp
        Dim strPattern As String
        Dim strInput As String
        Dim strReplace As String
        Dim Myrange As Range
        Dim C As Range   ' loop variable; required when Option Explicit is on
    
        Set Myrange = ActiveSheet.Range("A1:A3")
    
        For Each C In Myrange
            strPattern = "(^[0-9]{3})([a-zA-Z])([0-9]{4})"
    
            If strPattern <> "" Then
                strInput = C.Value
                strReplace = "$1"
    
                With regEx
                    .Global = True
                    .MultiLine = True
                    .IgnoreCase = False
                    .Pattern = strPattern
                End With
    
                If regEx.test(strInput) Then
                    C.Offset(0, 1) = regEx.Replace(strInput, "$1")
                    C.Offset(0, 2) = regEx.Replace(strInput, "$2")
                    C.Offset(0, 3) = regEx.Replace(strInput, "$3")
                Else
                    C.Offset(0, 1) = "(Not matched)"
                End If
            End If
        Next
    End Sub
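An alternative way to read the groups is the Execute method, whose Match objects expose the captured text through SubMatches. The sketch below assumes the reference from Step 1 and the same A1:A3 layout; splitWithSubMatches is a hypothetical name.

```vba
Private Sub splitWithSubMatches()   ' hypothetical name
    Dim regEx As New RegExp
    Dim matches As MatchCollection
    Dim C As Range
    Dim i As Long

    regEx.Pattern = "^([0-9]{3})([a-zA-Z])([0-9]{4})"

    For Each C In ActiveSheet.Range("A1:A3")
        Set matches = regEx.Execute(C.Value)

        If matches.Count > 0 Then
            ' SubMatches is zero-based: index 0 holds the text that $1 refers to
            For i = 0 To matches(0).SubMatches.Count - 1
                C.Offset(0, i + 1).Value = matches(0).SubMatches(i)
            Next i
        Else
            C.Offset(0, 1).Value = "(Not matched)"
        End If
    Next C
End Sub
```

One difference from the Replace approach is that Replace keeps any text outside the match (for instance, characters after the four trailing digits stay attached to the output), whereas SubMatches returns only the captured groups.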
    



    Additional Pattern Examples

    String   Regex Pattern                  Explanation
    a1aaa    [a-zA-Z][0-9][a-zA-Z]{3}       Single alpha, single digit, three alpha characters
    a1aaa    [a-zA-Z]?[0-9][a-zA-Z]{3}      May or may not have preceding alpha character
    a1aaa    [a-zA-Z][0-9][a-zA-Z]{0,3}     Single alpha, single digit, 0 to 3 alpha characters
    a1aaa    [a-zA-Z][0-9][a-zA-Z]*         Single alpha, single digit, followed by any number of alpha characters
    
    </i8>    \<\/[a-zA-Z][0-9]\>            Literal </, then any single alpha followed by any single digit, then literal >
    
