这是RegEx,用于匹配Excel公式中的任何单元格引用? [英] Is this the RegEx for matching any cell reference in an Excel formula?

查看:126
本文介绍了这是RegEx,用于匹配Excel公式中的任何单元格引用?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我一直在尝试创建一个正则表达式模式,匹配任何Excel公式中的任何引用,包括绝对,相对和外部引用。我需要返回整个引用,包括工作表和工作簿名称。

I have been trying to create a regular expressions pattern that matches any reference in any Excel formula, including absolute, relative, and external references. I need to return the entire reference, including the worksheet and workbook name.

我无法找到关于Excel A1表示法的详尽文档,但有很多的测试我确定了以下内容:

I haven't been able to find exhaustive documentation about Excel A1-notation, but with a lot of testing I have determined the following:


  • 公式前面加上等号=

  • = A1&A1将会破坏正则表达式
    工作表名称最多可包含31个字符,不包括\ /? * []:
  • 外部引用中的工作表名称必须成功与bang = Sheet1!A1

  • 外部引用中的工作簿名称必须括在方括号 = [Book1.xlsx] Sheet1!A1

  • 工作簿路径,如果引用是在一个封闭的工作簿中的一个范围,Excel将添加,它们始终以单引号括起来,并在工作簿名称'C:\ [Book1.xlsx] Sheet1' !A1

  • 某些字符(例如,不间断的空格)会导致Excel将工作簿和工作表名称以单引号括起来, >但我不知道哪些角色 ='[Book 1.xlsx] Sheet 1'!A1

  • 即使启用了R1C1表示法, Range.Formula 仍然以A1表示法返回引用。 Range.FormulaR1C1 返回R1C1表示法中的引用。
  • 3D参考样式允许在一个工作簿上的工作表名称范围 = SUM([Book5] Sheet1:Sheet3!A1)

  • 可以在公式中指定命名范围:


    • 名称的第一个字符必须是字母,下划线字符(_)或一个反斜杠(\)。名称中的剩余字符可以是字母,数字,句点和下划线字符。

    • 您不能使用大写和小写字符C,c,R或 r作为定义的名称,因为当您在名称或转到文本框中输入时,它们都用作选择当前所选单元格的行或列的缩写。

    • 名称不能与单元格引用相同,例如Z $ 100或R1C1。

    • 不允许空格作为名称的一部分。

    • A名称最多可以包含255个字符。

    • 名称可以包含大写和小写字母。 Excel不区分名称中的大小写字符。

    • Formulas are preceded with an equal sign "="
    • Strings within formulas are enclosed in double quotes and need to be removed before looking for real references, otherwise =A1&"A1" would break regex
    • Worksheet names can be up to 31 characters long, excluding \ / ? * [ ] :
    • Worksheet names in external references must be succeeded with bang =Sheet1!A1
    • Workbook names in external references must be enclosed in square brackets =[Book1.xlsx]Sheet1!A1
    • Workbook paths, which Excel adds if a reference is to a range in a closed workbook, are always enclosed in single quotes and to the left of the brackets for the workbook name 'C:\[Book1.xlsx]Sheet1'!A1
    • Some characters (non-breaking space, for example) cause Excel to enclose the workbook and worksheet name in an external reference in single quotes, but I don't know specifically which characters ='[Book 1.xlsx]Sheet 1'!A1
    • Even if R1C1-notation is enabled, Range.Formula still returns references in A1-notation. Range.FormulaR1C1 returns references in R1C1 notation.
    • 3D reference style allows a range of sheet names on one workbook =SUM([Book5]Sheet1:Sheet3!A1)
    • Named ranges can be specified in formulas:
      • The first character of a name must be a letter, an underscore character (_), or a backslash (\). Remaining characters in the name can be letters, numbers, periods, and underscore characters.
      • You cannot use the uppercase and lowercase characters "C", "c", "R", or "r" as a defined name, because they are all used as a shorthand for selecting a row or column for the currently selected cell when you enter them in a Name or Go To text box.
      • Names cannot be the same as a cell reference, such as Z$100 or R1C1.
      • Spaces are not allowed as part of a name.
      • A name can be up to 255 characters in length.
      • Names can contain uppercase and lowercase letters. Excel does not distinguish between uppercase and lowercase characters in names.

      这是我想出的包装在VBA程序中进行测试。我更新了代码以处理名字:

      Here is what I came up with wrapped in a VBA procedure for testing. I updated the code to handle names as well:

      Sub ReturnFormulaReferences()
      
          Dim objRegExp As New VBScript_RegExp_55.RegExp
          Dim objCell As Range
          Dim objStringMatches As Object
          Dim objReferenceMatches As Object
          Dim objMatch As Object
          Dim intReferenceCount As Integer
          Dim intIndex As Integer
          Dim booIsReference As Boolean
          Dim objName As Name
          Dim booNameFound As Boolean
      
          With objRegExp
              .MultiLine = True
              .Global = True
              .IgnoreCase = True
          End With
      
          For Each objCell In Selection.Cells
              If Left(objCell.Formula, 1) = "=" Then
      
                  objRegExp.Pattern = "\"".*\"""
                  Set objStringMatches = objRegExp.Execute(objCell.Formula)
      
                  objRegExp.Pattern = "(\'.*(\[.*\])?([^\:\\\/\?\*\[\]]{1,31}\:)?[^\:\\\/\?\*\[\]]{1,31}\'\!" _
                  & "|(\[.*\])?([^\:\\\/\?\*\[\]]{1,31}\:)?[^\:\\\/\?\*\[\]]{1,31}\!)?" _
                  & "(\$?[a-z]{1,3}\$?[0-9]{1,7}(\:\$?[a-z]{1,3}\$?[0-9]{1,7})?" _
                  & "|\$[a-z]{1,3}\:\$[a-z]{1,3}" _
                  & "|[a-z]{1,3}\:[a-z]{1,3}" _
                  & "|\$[0-9]{1,7}\:\$[0-9]{1,7}" _
                  & "|[0-9]{1,7}\:[0-9]{1,7}" _
                  & "|[a-z_\\][a-z0-9_\.]{0,254})"
                  Set objReferenceMatches = objRegExp.Execute(objCell.Formula)
      
                  intReferenceCount = 0
                  For Each objMatch In objReferenceMatches
                      intReferenceCount = intReferenceCount + 1
                  Next
      
                  Debug.Print objCell.Formula
                  For intIndex = intReferenceCount - 1 To 0 Step -1
                      booIsReference = True
                      For Each objMatch In objStringMatches
                          If objReferenceMatches(intIndex).FirstIndex > objMatch.FirstIndex _
                          And objReferenceMatches(intIndex).FirstIndex < objMatch.FirstIndex + objMatch.Length Then
                              booIsReference = False
                              Exit For
                          End If
                      Next
      
                      If booIsReference Then
                          objRegExp.Pattern = "(\'.*(\[.*\])?([^\:\\\/\?\*\[\]]{1,31}\:)?[^\:\\\/\?\*\[\]]{1,31}\'\!" _
                          & "|(\[.*\])?([^\:\\\/\?\*\[\]]{1,31}\:)?[^\:\\\/\?\*\[\]]{1,31}\!)?" _
                          & "(\$?[a-z]{1,3}\$?[0-9]{1,7}(\:\$?[a-z]{1,3}\$?[0-9]{1,7})?" _
                          & "|\$[a-z]{1,3}\:\$[a-z]{1,3}" _
                          & "|[a-z]{1,3}\:[a-z]{1,3}" _
                          & "|\$[0-9]{1,7}\:\$[0-9]{1,7}" _
                          & "|[0-9]{1,7}\:[0-9]{1,7})"
                          If Not objRegExp.Test(objReferenceMatches(intIndex).Value) Then 'reference is not A1
                              objRegExp.Pattern = "^(\'.*(\[.*\])?([^\:\\\/\?\*\[\]]{1,31}\:)?[^\:\\\/\?\*\[\]]{1,31}\'\!" _
                              & "|(\[.*\])?([^\:\\\/\?\*\[\]]{1,31}\:)?[^\:\\\/\?\*\[\]]{1,31}\!)" _
                              & "[a-z_\\][a-z0-9_\.]{0,254}$"
                              If Not objRegExp.Test(objReferenceMatches(intIndex).Value) Then 'name is not external
                                  booNameFound = False
                                  For Each objName In objCell.Worksheet.Parent.Names
                                      If objReferenceMatches(intIndex).Value = objName.Name Then
                                          booNameFound = True
                                          Exit For
                                      End If
                                  Next
                                  If Not booNameFound Then
                                      objRegExp.Pattern = "^(\'.*(\[.*\])?([^\:\\\/\?\*\[\]]{1,31}\:)?[^\:\\\/\?\*\[\]]{1,31}\'\!" _
                                      & "|(\[.*\])?([^\:\\\/\?\*\[\]]{1,31}\:)?[^\:\\\/\?\*\[\]]{1,31}\!)"
                                      For Each objName In objCell.Worksheet.Names
                                          If objReferenceMatches(intIndex).Value = objRegExp.Replace(objName.Name, "") Then
                                              booNameFound = True
                                              Exit For
                                          End If
                                      Next
                                  End If
                                  booIsReference = booNameFound
                              End If
                          End If
                      End If
      
                      If booIsReference Then
                          Debug.Print "  " & objReferenceMatches(intIndex).Value _
                          & " (" & objReferenceMatches(intIndex).FirstIndex & ", " _
                          & objReferenceMatches(intIndex).Length & ")"
                      End If
                  Next intIndex
                  Debug.Print
      
              End If
          Next
      
          Set objRegExp = Nothing
          Set objStringMatches = Nothing
          Set objReferenceMatches = Nothing
          Set objMatch = Nothing
          Set objCell = Nothing
          Set objName = Nothing
      
      End Sub
      

      任何人都可以打破或改善这个?没有关于Excel公式语法的详尽文档,很难知道这是否正确。

      Can anyone break or improve this? Without exhaustive documentation on Excel's formula syntax it is difficult to know if this is correct.

      谢谢!

      推荐答案

      jtolle引导我走向正确的方向。据我所知,这正是我想要做的。我一直在测试,似乎有效。

      jtolle steered me in the right direction. As far as I can tell, this is what I was trying to do. I've been testing and it seems to work.

      stringOriginFormula = rangeOrigin.Formula
      rangeOrigin.Cut rangeDestination
      rangeOrigin.Formula = stringOriginFormula
      

      感谢jtolle!

      这篇关于这是RegEx,用于匹配Excel公式中的任何单元格引用?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆