使用 VBA 从 XML 获取属性名称 [英] Obtain attribute names from XML using VBA

查看:40
本文介绍了使用 VBA 从 XML 获取属性名称的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我需要使用 VBA 从 XML 中获取不同的属性名称.

I need to get the distinct attributes names from the XML using VBA.

这是我的代码.

 sub test() 
 Dim XMLFile As Object
Dim XMLFileName As String
Set XMLFile = CreateObject("Microsoft.XMLDOM")

XMLFileName = "C:UsersInput.xml"
XMLFile.async = False
XMLFile.Load (XMLFileName)
XMLFile.validateOnParse = False

Dim mainnode As Object
Dim node As Object

Set mainnode = XMLFile.SelectNodes("//Elements")

For Each node In mainnode
    For Each child In node.ChildNodes
    Debug.Print child.BaseName
    Dim kiddo As Object
    For Each kiddo In child.ChildNodes
        Debug.Print kiddo.BaseName
    Next kiddo
Next child
Next node
End sub

这是示例 XML.我需要从 XML 中获取属性名称 num.

Here is the sample XML. I need to get the attribute name num from the XML.

<Elements>
<Details>
    <Name>ABC</Name>
    <Address>123ABC</Address>
    <College>
        <collname>safasf</collname>
         <collnumber/>
    </College>
</Details>  
<Dept num="123">
    <Deptname>IT</Deptname>
    <ID>A123</ID>
 </Dept>            
</Elements>

预期结果:

 Elements
 Details
 Name 
 Address
 College
 collname
 collnumber
 Dept
 num
 Deptname
 ID

以上代码的实际结果:

 Elements
 Details
 Name 
 Address
 College
 collname
 Dept
 Deptname
 ID

我的代码未获取num"属性和 <collnumber/> 标记.有人可以让我知道如何使用 VBA 从 XML 中获取属性名称和标签名称

The "num" attribute and <collnumber/> tag is not fetched by my code. Could someone let me know how to fetch the attribute names along with the tag names from XML using VBA

推荐答案

通过递归函数调用显示包含属性的 XML 结构

我的示例代码演示了一种方法

My example code demonstrates a way to

  • [1] 使用 XMLDOM 方法和
  • 将整个 XML 结构分配给一个二维数组
  • [2] 可选择将其写回工作表.
  • [1] assign the entire XML structure to a 2-dim array using XMLDOM methods and
  • [2] optionally write it back to a sheet.

放大提示:

我添加了这些 ► 结构化提示以提供更多帮助,而不是仅显示代码,正如我所说,其中许多要点也会导致其他用户重复提问:

I added these ► structured hints to offer more help than by displaying code only, as I remarked that many of these points lead to repeated questions by other users, too:

  • 试图列出 XML 结构,随着节点元素层次深度的增加,你失去了良好的视野(类型常量 1 NODE_ELEMENT),所以我紧急推荐使用 ► <本示例代码中使用的强>递归调用.
  • 此外,您可能没有考虑过节点文本(类型常量 3 NODE_TEXT)的特殊结构是命名的第一个子父元素 - cf主函数 listChildNodes 中的 A. 和 B. 部分.您通过子节点的循环不会区分提到的类型.只需研究引用函数中的注释即可了解详细信息.
  • 我想您的 XML 文件以所需的处理指令开头,例如<?xml version="1.0" encoding="utf-8"?>,这样才能真正识别为XML文件.
  • 调用过程 DisplayXML() 使用 后期绑定 而不是像您的帖子那样对 MS XML 的早期绑定引用,但使用推荐的 MSXML2 6.0 版.它通过它的 DocumentElement (BTW 单个节点元素)和一个引用预定义的 2-暗淡数组 v.
  • 版本控制:如果您使用 Set XDoc = CreateObject("MSXML2.DOMDocument")XMLFILE 对象设置为内存,通常您是获取较旧的默认版本 (3.0),因此在大多数情况下,最好使用显式 Set XDoc = CreateObject("MSXML2.DOMDocument.6.0") 代替(自动包括 XPath).
  • 如果您不使用Load 函数来获取True(文件加载成功)或False(加载错误)返回,不需要将文件名设置在方括号()中.
  • 搜索字符串中的 XPath 运算符 // 将返回 任何 级别的任何出现(参见 XMLFile.SelectNodes("//Elements") 在你的 OP 中).
  • 还要考虑使用 XSLT,这是一种专门用于将 XML 文件转换为各种最终使用格式的语言.
  • Trying to list XML structures you lose good view with increasing hierarchy depth of your node elements (type constant 1 NODE_ELEMENT), so I urgently recommend the use of ► recursive calls as used in this example code.
  • Furthermore you might have not considered the special construction of node text (type constant 3 NODE_TEXT) being the first child of a name giving parent element - c.f. sections A. and B. in main function listChildNodes. Your loops through child nodes would not distinguish between the mentioned types. Just study the comments in the cited function for details.
  • I suppose your XML file starts with a needed processing instruction like e.g. <?xml version="1.0" encoding="utf-8"?>, so that it can be actually identified as XML file.
  • The calling procedure DisplayXML() uses late binding instead of early bound reference to MS XML similar to your post, but uses the recommended MSXML2 version 6.0. It calls the main function via its DocumentElement <Elements> (BTW a single node element) and a second argument referring to a predefined 2-dim array v.
  • Versioning: If you would set your XMLFILE object to memory with Set XDoc = CreateObject("MSXML2.DOMDocument") generally you are getting the older default Version (3.0), so in most cases it's preferrable to use explicitly Set XDoc = CreateObject("MSXML2.DOMDocument.6.0") instead (including XPath automatically).
  • If you don't use the Load function to get a True (file loaded successfully) or False (load error) back, it is not necessary to set the file name into brackets ().
  • The XPath operator // in search strings would return any occurences at any level (c.f. XMLFile.SelectNodes("//Elements") in your OP).
  • Consider also the use of XSLT, a special-purpose language designed to tranform XML files into all kind of end-use formats.

调用过程DisplayXML

提示: 仅使用调用过程中的估计项数(例如 1000)来确定数组的行数就足够了,因为主函数执行 ReDim(包括双换位)在需要时自动进行.尽管如此,我从一开始就通过 XPath/XMLDOM 表达式 XMLFile.SelectNodes("//*").Length 计算整个文件中的任何项目,在这里添加了确切的项目计数.

Hint: It would sufficient to dimension the array 's row count only with an estimated number of items in the calling procedure (e.g. 1000), as the main function executes a ReDim (including a double transposition) automatically if needed. Nevertheless I added the exact items count here from the start via XPath/XMLDOM expression XMLFile.SelectNodes("//*").Length counting any item in the entire file.

Option Explicit          ' declaration head of your code module

Sub DisplayXML()
Dim XMLFile As Object
Dim XMLFileName As String
'Set XMLFile = CreateObject("Microsoft.XMLDOM")   ' former style not recommended
Set XMLFile = CreateObject("MSXML2.DOMDocument.6.0")

XMLFileName = "C:UsersInput.xml"                             ' << change to your xml file name
XMLFile.Async = False
XMLFile.ValidateOnParse = False
Debug.Print XMLFile.XML

If XMLFile.Load(XMLFileName) Then
' [1] write xml info to array with exact or assumed items count
  Dim v As Variant: ReDim v(1 To XMLFile.SelectNodes("//*").Length, 1 To 2)
  listChildNodes XMLFile.DocumentElement, v                 ' call helper function

' [2] write results to sheet "Dump"                         ' change to your sheet name
  With ThisWorkbook.Worksheets("Dump")
       .Range("A:B") = ""                                   ' clear result range
       .Range("A1:B1") = Split("XML Tag,Node Value", ",")   ' titles
       .Range("A2").Resize(UBound(v), UBound(v, 2)) = v     ' get  2-dim info array
  End With
Else
       MsgBox "Load Error " & XMLFileName
End If
Set XMLFile = Nothing
End Sub

在工作表中显示的结构化结果

提示:如果不想要层次缩进或枚举层次层次结构,可以很容易地适配下面的主函数listChildNodes().

Hint: If you don't want the level indentation or enumerated Level hierarchy, you can easily adapt the main function listChildNodes() below.

+----+---------------------+-----------------+
|    |         A           |       B         |
+----+---------------------+-----------------+
|1   | XML Tag             | Node Value      |
+----+---------------------+-----------------+
|2   | 0 Elements          |                 |
+----+---------------------+-----------------+
|3   |   1 Details         |                 |
+----+---------------------+-----------------+
|4   |     2 Name          | ABC             |
+----+---------------------+-----------------+
|5   |     2 Address       | 123ABC          |
+----+---------------------+-----------------+
|6   |     2 College       |                 |
+----+---------------------+-----------------+
|7   |       3 collname    | safasf          |
+----+---------------------+-----------------+
|8   |       3 collnumber  |                 |
+----+---------------------+-----------------+
|9   |   1 Dept[@num="123"]|                 |
+----+---------------------+-----------------+
|10  |     2 Deptname      | IT              |
+----+---------------------+-----------------+
|11  |     2 ID            | A123            |
+----+---------------------+-----------------+

也可以引用一个精确的节点元素,例如通过

It is also possible to refer to a precise node element, e.g. via

listChildNodes XMLFile.DocumentElement.SelectSingleNode("Dept[@num=""123""]"),v, 1, 1       ' starting from item no 1 and Level no 1

这将单独列出指定的节点集:

This would list the indicated node set alone:

+----+---------------------+-----------------+
|    |         A           |       B         |
+----+---------------------+-----------------+
|1   | XML Tag             | Node Value      |
+----+---------------------+-----------------+
|2   |   1 Dept[@num="123"]|                 |
+----+---------------------+-----------------+
|3   |     2 Deptname      | IT              |
+----+---------------------+-----------------+
|4   |     2 ID            | A123            |
+----+---------------------+-----------------+

递归主函数listChildNodes()

循环遍历子节点集合,此函数重复(递归")调用自身(即当前节点对象)并将整个 XML 结构分配给给定的二维数组(第二个参数).此外,它允许缩进并指示层次结构级别.注意这个例子中的数组必须是从 1 开始的.

Looping through childnode collections this function calls itself (i.e. the current node object) repeatedly ("recursively") and assigns the entire XML structure to a given 2-dim array (2nd argument). Furthermore it allows indendation and indicates the hierarchy levels. Note that the array in this example has to be 1-based.

Edit 20/8 2018 包括 数组大小自动增加,如果项目计数器 i 超过当前数组的上限(UBound(v),即在它的第一个维度 = 项目数).技术说明:由于这样的 ReDim 在次要(这里是第一个)维度中是不可能的,将行"(dim 1)更改为列"(dim 2)的中间换位是必要的.

Edit 20/8 2018 includes an automatic increase of array size if the items counter i exceeds the current array's upper boundary (UBound(v), i.e. in its first dimension = items count). Technical note: As such a ReDim isn't possible in a minor (here 1st) dimension, an intermediate transposition changing 'rows' (dim 1) to 'columns' (dim 2) is necessary.

Function listChildNodes(oCurrNode As Object, _
                        ByRef v As Variant, _
                        Optional ByRef i As Long = 1, _
                        Optional iLvl As Integer = 0 _
                        ) As Boolean
' Purpose: assign the complete node structure with contents to a 1-based 2-dim array
' Author:  T.M.
' Note: Late binding XML doesn't allow the use of IXMLDOMNodeType enumeration constants
'       (1 ... NODE_ELEMENT, 2 ... NODE_ATTRIBUTE, 3 ... NODE_TEXT etc.)
' Escape
  If oCurrNode Is Nothing Then Exit Function
  If i < 1 Then i = 1                                       ' one based items Counter
' Edit 20/8 2018 - Automatic increase of array size if needed 
  If i >= UBound(v) Then                                    ' change array size if needed
     Dim tmp As Variant
     tmp = Application.Transpose(v)                         ' change rows to columns
     ReDim Preserve tmp(1 To 2, 1 To UBound(v) + 1000)      ' increase row numbers
     v = Application.Transpose(tmp)                         ' transpose back
     Erase tmp
  End If
  Const NAMEColumn& = 1, VALUEColumn& = 2                   ' constants for column 1 and 2
' Declare variables
  Dim oChildNode As Object                                  ' late bound node object
  Dim bDisplay   As Boolean
' ---------------------------------------------------------------------
' A. It's nothing but a TextNode (i.e. a parent node's firstChild!)
' ---------------------------------------------------------------------
If (oCurrNode.NodeType = 3) Then                                 ' 3 ... NODE_TEXT
  ' display pure text content (NODE_TEXT) of parent elements
    v(i, VALUEColumn) = oCurrNode.Text                           ' nodeValue of text node
  ' return
    listChildNodes = True
ElseIf oCurrNode.NodeType = 1 Then                                ' 1 ... NODE_ELEMENT
   ' --------------------------------------------------------------
   ' B.1 NODE_ELEMENT WITHOUT text node immediately below,
   '     a) e.g. <Details> followed by node element <NAME>,
   '        (i.e. FirstChild.NodeType must not be of type NODE_TEXT = 3)
   '     b) e.g. <College> node element without any child node
   '     Note: a text content (NODE_TEXT) actually is a child node(!) to an element node
   '           (see section A. getting the FirstChild of a NODE_ELEMENT)
   ' --------------------------------------------------------------
   ' a) display parent elements of other element nodes
     If oCurrNode.HasChildNodes Then
         If Not oCurrNode.FirstChild.NodeType = 3 Then             ' <>3 ... not a NODE_TEXT
            bDisplay = True
         End If
   ' b) always display empty node elements
     Else                                                           ' empty NODE_ELEMENT
            bDisplay = True
     End If
     If bDisplay Then
            v(i, NAMEColumn) = String(iLvl * 2, " ") & _
                               iLvl & " " & _
                               oCurrNode.nodename & getAtts(oCurrNode)
            i = i + 1
     End If

   ' --------------------------------------------------------------
   ' B.2 check child nodes
   ' --------------------------------------------------------------
     For Each oChildNode In oCurrNode.ChildNodes
      ' ~~~~~~~~~~~~~~~~~
      ' recursive call <<
      ' ~~~~~~~~~~~~~~~~~
        bDisplay = listChildNodes(oChildNode, v, i, iLvl + 1)

        If bDisplay Then
            v(i, NAMEColumn) = String(iLvl * 2, " ") & _
                               iLvl & " " & _
                               oCurrNode.nodename & getAtts(oCurrNode)
            i = i + 1
        End If
     Next oChildNode
   ' return
     listChildNodes = False

Else    ' just to demonstrate the use of other xml types as e.g. <!-- comments -->
     If oCurrNode.NodeType = 8 Then   ' 8 ... NODE_COMMENT
        v(i, VALUEColumn) = "<!-- " & oCurrNode.NodeValue & "-->"
        i = i + 1
     End If
   ' return
     listChildNodes = False
End If

End Function

'辅助函数getAtts()

由上述函数调用的这个辅助函数返回一个字符串,该字符串枚举给定节点的所有属性名称和值,类似于XPath符号;代码可以很容易地适应您的需要.

This helper function called by the above function returns a string enumerating all attribute names and values of a given node similar to XPath notation; the code can be easily adapted to your needs.

Function getAtts(ByRef node As Object) As String
' Purpose: return attribute(s) string in brackets, e.g. '[@num="123"]'
' Note:    called by above function listChildNodes()
' Author:  T.M.
  Dim sAtts$, ii&
  If node.Attributes.Length > 0 Then
      ii = 0: sAtts = ""
      For ii = 0 To node.Attributes.Length - 1
        sAtts = sAtts & "[@" & node.Attributes.Item(ii).nodename & "=""" & node.Attributes.Item(ii).NodeValue & """]"
      Next ii
  End If
' return
  getAtts = sAtts
End Function

这篇关于使用 VBA 从 XML 获取属性名称的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆