使用VBA从XML获取属性名称 [英] Obtain attribute names from XML using VBA

查看:149
本文介绍了使用VBA从XML获取属性名称的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我需要使用VBA从XML中获得不同的属性名称.

I need to get the distinct attributes names from the XML using VBA.

这是我的代码.

 sub test() 
 Dim XMLFile As Object
Dim XMLFileName As String
Set XMLFile = CreateObject("Microsoft.XMLDOM")

XMLFileName = "C:\Users\Input.xml"
XMLFile.async = False
XMLFile.Load (XMLFileName)
XMLFile.validateOnParse = False

Dim mainnode As Object
Dim node As Object

Set mainnode = XMLFile.SelectNodes("//Elements")

For Each node In mainnode
    For Each child In node.ChildNodes
    Debug.Print child.BaseName
    Dim kiddo As Object
    For Each kiddo In child.ChildNodes
        Debug.Print kiddo.BaseName
    Next kiddo
Next child
Next node
End sub

这是示例XML.我需要从XML中获取属性名称num.

Here is the sample XML. I need to get the attribute name num from the XML.

<Elements>
<Details>
    <Name>ABC</Name>
    <Address>123ABC</Address>
    <College>
        <collname>safasf</collname>
         <collnumber/>
    </College>
</Details>  
<Dept num="123">
    <Deptname>IT</Deptname>
    <ID>A123</ID>
 </Dept>            
</Elements>

预期结果:

 Elements
 Details
 Name 
 Address
 College
 collname
 collnumber
 Dept
 num
 Deptname
 ID

以上代码的实际结果:

 Elements
 Details
 Name 
 Address
 College
 collname
 Dept
 Deptname
 ID

我的代码未获取"num"属性和<collnumber/>标记.有人可以让我知道如何使用VBA从XML中获取属性名称和标记名称

The "num" attribute and <collnumber/> tag is not fetched by my code. Could someone let me know how to fetch the attribute names along with the tag names from XML using VBA

推荐答案

通过递归函数调用显示包含属性的XML结构

我的示例代码演示了一种方法

My example code demonstrates a way to

  • [1]使用XMLDOM方法和
  • 将整个XML结构分配给2维数组
  • [2](可选)将其写回到工作表中.
  • [1] assign the entire XML structure to a 2-dim array using XMLDOM methods and
  • [2] optionally write it back to a sheet.

放大提示:

我添加了这些►结构化提示,以提供比仅显示代码更多的帮助,因为我说过,这些要点中的很多也导致其他用户重复提问:

I added these ► structured hints to offer more help than by displaying code only, as I remarked that many of these points lead to repeated questions by other users, too:

  • 尝试列出XML个结构,但随着节点元素层次结构深度的增加(类型常数1 NODE_ELEMENT),您会失去良好的视野,因此,我紧急建议使用►递归调用在此示例代码中.
  • 此外,您可能还没有考虑到节点文本(类型常量3 NODE_TEXT)的特殊构造是给定父元素名称的第一个孩子 -cf主功能listChildNodes中A.和B.部分. 您遍历子节点的循环不会区分所提到的类型.只需研究引用函数中的注释以了解详细信息即可.
  • 我想您的XML文件以所需的处理指令开头,例如<?xml version="1.0" encoding="utf-8"?>,以便可以将其实际识别为XML文件.
  • 调用过程DisplayXML()使用 late绑定,而不是类似于您的帖子的早期绑定对MS XML的引用,但是使用推荐的 MSXML2版本6.0 .它通过DocumentElement <Elements>( BTW是单个节点元素)和引用预定义2维数组v的第二个参数来调用main函数.
  • 版本::如果您要使用Set XDoc = CreateObject("MSXML2.DOMDocument")XMLFILE对象设置为内存,则通常会获得较旧的默认版本(3.0),因此在大多数情况下,最好显式使用代替(自动包含XPath).
  • 如果不使用 Load 函数取回True(文件成功加载)或False(加载错误),则无需设置文件放在方括号()中.
  • 搜索字符串中的XPath运算符//将返回 any 级别的所有匹配项(例如,OP中的XMLFile.SelectNodes("//Elements")).
  • 还考虑使用 XSLT ,这是一种专用语言,旨在将XML文件转换为各种最终用途格式.
  • Trying to list XML structures you lose good view with increasing hierarchy depth of your node elements (type constant 1 NODE_ELEMENT), so I urgently recommend the use of ► recursive calls as used in this example code.
  • Furthermore you might have not considered the special construction of node text (type constant 3 NODE_TEXT) being the first child of a name giving parent element - c.f. sections A. and B. in main function listChildNodes. Your loops through child nodes would not distinguish between the mentioned types. Just study the comments in the cited function for details.
  • I suppose your XML file starts with a needed processing instruction like e.g. <?xml version="1.0" encoding="utf-8"?>, so that it can be actually identified as XML file.
  • The calling procedure DisplayXML() uses late binding instead of early bound reference to MS XML similar to your post, but uses the recommended MSXML2 version 6.0. It calls the main function via its DocumentElement <Elements> (BTW a single node element) and a second argument referring to a predefined 2-dim array v.
  • Versioning: If you would set your XMLFILE object to memory with Set XDoc = CreateObject("MSXML2.DOMDocument") generally you are getting the older default Version (3.0), so in most cases it's preferrable to use explicitly Set XDoc = CreateObject("MSXML2.DOMDocument.6.0") instead (including XPath automatically).
  • If you don't use the Load function to get a True (file loaded successfully) or False (load error) back, it is not necessary to set the file name into brackets ().
  • The XPath operator // in search strings would return any occurences at any level (c.f. XMLFile.SelectNodes("//Elements") in your OP).
  • Consider also the use of XSLT, a special-purpose language designed to tranform XML files into all kind of end-use formats.

调用过程DisplayXML

Calling procedure DisplayXML

提示:只需在调用过程中用估计的项数(例如1000)来估计数组的行数就足够了,因为主函数执行ReDim(包括双重换位)(如果需要).尽管如此,我还是从一开始就通过XPath/XMLDOM表达式XMLFile.SelectNodes("//*").Length在整个文件中对任何项目进行计数来添加确切的项目计数.

Hint: It would sufficient to dimension the array 's row count only with an estimated number of items in the calling procedure (e.g. 1000), as the main function executes a ReDim (including a double transposition) automatically if needed. Nevertheless I added the exact items count here from the start via XPath/XMLDOM expression XMLFile.SelectNodes("//*").Length counting any item in the entire file.

Option Explicit          ' declaration head of your code module

Sub DisplayXML()
Dim XMLFile As Object
Dim XMLFileName As String
'Set XMLFile = CreateObject("Microsoft.XMLDOM")   ' former style not recommended
Set XMLFile = CreateObject("MSXML2.DOMDocument.6.0")

XMLFileName = "C:\Users\Input.xml"                             ' << change to your xml file name
XMLFile.Async = False
XMLFile.ValidateOnParse = False
Debug.Print XMLFile.XML

If XMLFile.Load(XMLFileName) Then
' [1] write xml info to array with exact or assumed items count
  Dim v As Variant: ReDim v(1 To XMLFile.SelectNodes("//*").Length, 1 To 2)
  listChildNodes XMLFile.DocumentElement, v                 ' call helper function

' [2] write results to sheet "Dump"                         ' change to your sheet name
  With ThisWorkbook.Worksheets("Dump")
       .Range("A:B") = ""                                   ' clear result range
       .Range("A1:B1") = Split("XML Tag,Node Value", ",")   ' titles
       .Range("A2").Resize(UBound(v), UBound(v, 2)) = v     ' get  2-dim info array
  End With
Else
       MsgBox "Load Error " & XMLFileName
End If
Set XMLFile = Nothing
End Sub

结构化结果显示在工作表中

提示:如果您不希望级别缩进或枚举级别层次结构,则可以轻松地调整下面的主要功能listChildNodes().

Hint: If you don't want the level indentation or enumerated Level hierarchy, you can easily adapt the main function listChildNodes() below.

+----+---------------------+-----------------+
|    |         A           |       B         |
+----+---------------------+-----------------+
|1   | XML Tag             | Node Value      |
+----+---------------------+-----------------+
|2   | 0 Elements          |                 |
+----+---------------------+-----------------+
|3   |   1 Details         |                 |
+----+---------------------+-----------------+
|4   |     2 Name          | ABC             |
+----+---------------------+-----------------+
|5   |     2 Address       | 123ABC          |
+----+---------------------+-----------------+
|6   |     2 College       |                 |
+----+---------------------+-----------------+
|7   |       3 collname    | safasf          |
+----+---------------------+-----------------+
|8   |       3 collnumber  |                 |
+----+---------------------+-----------------+
|9   |   1 Dept[@num="123"]|                 |
+----+---------------------+-----------------+
|10  |     2 Deptname      | IT              |
+----+---------------------+-----------------+
|11  |     2 ID            | A123            |
+----+---------------------+-----------------+

还可以引用精确的节点元素,例如通过

It is also possible to refer to a precise node element, e.g. via

listChildNodes XMLFile.DocumentElement.SelectSingleNode("Dept[@num=""123""]"),v, 1, 1       ' starting from item no 1 and Level no 1

这将单独列出指示的节点集:

This would list the indicated node set alone:

+----+---------------------+-----------------+
|    |         A           |       B         |
+----+---------------------+-----------------+
|1   | XML Tag             | Node Value      |
+----+---------------------+-----------------+
|2   |   1 Dept[@num="123"]|                 |
+----+---------------------+-----------------+
|3   |     2 Deptname      | IT              |
+----+---------------------+-----------------+
|4   |     2 ID            | A123            |
+----+---------------------+-----------------+

递归主要功能listChildNodes()

Recursive main function listChildNodes()

遍历子节点集合,此函数反复(递归")调用自身(即当前节点对象),并将整个XML结构分配给给定的2-dim数组(第二个参数).此外,它还允许使用并指示层次结构级别. 请注意,此示例中的数组必须基于1.

Looping through childnode collections this function calls itself (i.e. the current node object) repeatedly ("recursively") and assigns the entire XML structure to a given 2-dim array (2nd argument). Furthermore it allows indendation and indicates the hierarchy levels. Note that the array in this example has to be 1-based.

Edit 20/8 2018包括自动增加阵列大小. 技术说明:由于这样的ReDim在较小(此处为第1个)维度上是不可能的,因此需要将行"(第1维)更改为列"(第2维)的中间换位.

Edit 20/8 2018 includes an automatic increase of array size if the items counter i exceeds the current array's upper boundary (UBound(v), i.e. in its first dimension = items count). Technical note: As such a ReDim isn't possible in a minor (here 1st) dimension, an intermediate transposition changing 'rows' (dim 1) to 'columns' (dim 2) is necessary.

Function listChildNodes(oCurrNode As Object, _
                        ByRef v As Variant, _
                        Optional ByRef i As Long = 1, _
                        Optional iLvl As Integer = 0 _
                        ) As Boolean
' Purpose: assign the complete node structure with contents to a 1-based 2-dim array
' Author:  T.M.
' Note: Late binding XML doesn't allow the use of IXMLDOMNodeType enumeration constants
'       (1 ... NODE_ELEMENT, 2 ... NODE_ATTRIBUTE, 3 ... NODE_TEXT etc.)
' Escape
  If oCurrNode Is Nothing Then Exit Function
  If i < 1 Then i = 1                                       ' one based items Counter
' Edit 20/8 2018 - Automatic increase of array size if needed 
  If i >= UBound(v) Then                                    ' change array size if needed
     Dim tmp As Variant
     tmp = Application.Transpose(v)                         ' change rows to columns
     ReDim Preserve tmp(1 To 2, 1 To UBound(v) + 1000)      ' increase row numbers
     v = Application.Transpose(tmp)                         ' transpose back
     Erase tmp
  End If
  Const NAMEColumn& = 1, VALUEColumn& = 2                   ' constants for column 1 and 2
' Declare variables
  Dim oChildNode As Object                                  ' late bound node object
  Dim bDisplay   As Boolean
' ---------------------------------------------------------------------
' A. It's nothing but a TextNode (i.e. a parent node's firstChild!)
' ---------------------------------------------------------------------
If (oCurrNode.NodeType = 3) Then                                 ' 3 ... NODE_TEXT
  ' display pure text content (NODE_TEXT) of parent elements
    v(i, VALUEColumn) = oCurrNode.Text                           ' nodeValue of text node
  ' return
    listChildNodes = True
ElseIf oCurrNode.NodeType = 1 Then                                ' 1 ... NODE_ELEMENT
   ' --------------------------------------------------------------
   ' B.1 NODE_ELEMENT WITHOUT text node immediately below,
   '     a) e.g. <Details> followed by node element <NAME>,
   '        (i.e. FirstChild.NodeType must not be of type NODE_TEXT = 3)
   '     b) e.g. <College> node element without any child node
   '     Note: a text content (NODE_TEXT) actually is a child node(!) to an element node
   '           (see section A. getting the FirstChild of a NODE_ELEMENT)
   ' --------------------------------------------------------------
   ' a) display parent elements of other element nodes
     If oCurrNode.HasChildNodes Then
         If Not oCurrNode.FirstChild.NodeType = 3 Then             ' <>3 ... not a NODE_TEXT
            bDisplay = True
         End If
   ' b) always display empty node elements
     Else                                                           ' empty NODE_ELEMENT
            bDisplay = True
     End If
     If bDisplay Then
            v(i, NAMEColumn) = String(iLvl * 2, " ") & _
                               iLvl & " " & _
                               oCurrNode.nodename & getAtts(oCurrNode)
            i = i + 1
     End If

   ' --------------------------------------------------------------
   ' B.2 check child nodes
   ' --------------------------------------------------------------
     For Each oChildNode In oCurrNode.ChildNodes
      ' ~~~~~~~~~~~~~~~~~
      ' recursive call <<
      ' ~~~~~~~~~~~~~~~~~
        bDisplay = listChildNodes(oChildNode, v, i, iLvl + 1)

        If bDisplay Then
            v(i, NAMEColumn) = String(iLvl * 2, " ") & _
                               iLvl & " " & _
                               oCurrNode.nodename & getAtts(oCurrNode)
            i = i + 1
        End If
     Next oChildNode
   ' return
     listChildNodes = False

Else    ' just to demonstrate the use of other xml types as e.g. <!-- comments -->
     If oCurrNode.NodeType = 8 Then   ' 8 ... NODE_COMMENT
        v(i, VALUEColumn) = "<!-- " & oCurrNode.NodeValue & "-->"
        i = i + 1
     End If
   ' return
     listChildNodes = False
End If

End Function

'助手功能getAtts()

'Helper function getAtts()

由上述函数调用的此辅助函数返回一个字符串,该字符串枚举给定节点的所有属性名称和值,类似于XPath表示法;该代码可以轻松地适应您的需求.

This helper function called by the above function returns a string enumerating all attribute names and values of a given node similar to XPath notation; the code can be easily adapted to your needs.

Function getAtts(ByRef node As Object) As String
' Purpose: return attribute(s) string in brackets, e.g. '[@num="123"]'
' Note:    called by above function listChildNodes()
' Author:  T.M.
  Dim sAtts$, ii&
  If node.Attributes.Length > 0 Then
      ii = 0: sAtts = ""
      For ii = 0 To node.Attributes.Length - 1
        sAtts = sAtts & "[@" & node.Attributes.Item(ii).nodename & "=""" & node.Attributes.Item(ii).NodeValue & """]"
      Next ii
  End If
' return
  getAtts = sAtts
End Function

这篇关于使用VBA从XML获取属性名称的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆