Excel vba xml解析性能 [英] Excel vba xml parsing performance

查看:182
本文介绍了Excel vba xml解析性能的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在使用excel中的一些输入数据,将其解析为xml并使用它来运行SQL存储过程,但是我在xml解析中遇到性能问题。输入表如下所示:

  Dates_ | _Name1_Name2_Name3 _..._ NameX 
Date1 |
Date2 |
。 。 。 |
Date1Y |

我有一些代码循环遍历每个单元格,并将数据解析成一个xml字符串,但即使是大约300×300的网格,执行时间也是五分钟左右,而我正在寻找可能需要数千列长的数据集。我已经尝试了一些事情来帮助加快速度,就像将数据读入Variant,然后重复执行或排除DoEvents,但是我无法获得加速。这是一些代码的问题:

  Dim lastRow As Long 
lRows =(oWorkSheet.Cells(Rows。 Count,1).End(xlUp).Row)
Dim lastColumn As Long
lCols =(oWorkSheet.Cells(1,Columns.Count).End(xlToLeft).Column)
Dim sheet as Variant
带表格(sName)
sheet = .Range(.Cells(1,1),.Cells(lRows,lCols))
结束
ReDim nameCols lCols)As String

...

  resultxml =< DataSet> 
对于i = 2到行
resultxml = resultxml& < DateRow> 中

对于j = 1到cols
如果Trim(sheet(i,j))<> 然后
lResult =< &安培; nameCols(j)& > 中
rResult =< /& nameCols(j)& > 中
tmpValue = Trim(sheet(i,j))
如果IsDate(tmpValue)而不是IsNumeric(tmpValue)然后
如果Len(tmpValue)> = 8则
tmpValue =格式(tmpValue,yyyy-mm-dd)
如果
结束If If
resultxml = resultxml& lResult& tmpValue& rResult
DoEvents
End If
Next j
resultxml = resultxml& < / DateRow> 中
Next i

resultxml = resultxml& < /数据集> 中

对于获得运行时间的建议将不胜感激。

解决方案

考虑使用



VBA < (当然符合实际数据)

  Sub xmlExport()
错误GoTo ErrHandle
'VBA参考MSXML,v6.0'
Dim doc As New MSXML2.DOMDocument60,xslDoc As New MSXML2.DOMDocument60,newDoc As New MSXML2.DOMDocument60
Dim root为IXMLDOMElement,dataNode为IXMLDOMElement,datesNode为IXMLDOMElement,namesNode为IXML DOMElement
Dim i As Long,j As Long
Dim tmpValue As Variant

'DECLARE XML DOC OBJECT'
设置root = doc.createElement(DataSet)
doc.appendChild root

'ITERATE THROUGH ROWS'
For i = 2 To Sheets(1).UsedRange.Rows.Count

'DATA ROW NODE'
设置dataNode = doc.createElement(DataRow)
root.appendChild dataNode

'DATES NODE'
设置datesNode = doc.createElement(日期)
datesNode.Text = Sheets(1).Range(A& i)
dataNode.appendChild datesNode

'NAMES NODE'
对于j = 1至12
tmpValue =表(1).Cells(i,j + 1 )
如果IsDate(tmpValue)而不是IsNumeric(tmpValue)然后
设置namesNode = doc.createElement(Name& j)
namesNode.Text = Format(tmpValue,yyyy- mm-dd)
dataNode.appendChild namesNode
结束如果
下一个j

下一个i

'PRETTY PRINT RAW OUTPUT'
xslDoc.LoadXML<?xml version =& Chr(34)& 1.0& Chr(34)& >? _
& < xsl:stylesheet version =& Chr(34)& 1.0& Chr(34)_
& xmlns:xsl =& Chr(34)& http://www.w3.org/1999/XSL/Transform& Chr(34)& > 中_
& < xsl:strip-space elements =& Chr(34)& *& Chr(34)& /> _
& < xsl:output method =& Chr(34)& xml& Chr(34)& indent =& Chr(34)& 是& Chr(34)& _
& encoding =& Chr(34)& UTF-8& Chr(34)& /> 中_
& < xsl:template match =& Chr(34)& node()| @ *& Chr(34)& > 中_
& < xsl:copy> _
& < xsl:apply-templates select =& Chr(34)& node()| @ *& Chr(34)& /> _
& < / xsl:copy> _
& < / xsl:template> _
& < / XSL:样式表> 中

xslDoc.async = False
doc.transformNodeToObject xslDoc,newDoc
newDoc.Save ActiveWorkbook.Path& \Output.xml

MsgBox成功导出Excel数据到XML!,vbInformation
退出子

ErrHandle:
MsgBox Err。数字& - & Err.Description,vbCritical
退出Sub

End Sub

输出

 <?xml version =1.0encoding =UTF-8?> ; 
< DataSet>
< DataRow>
< Dates> Date1< / Dates>
< Name1> ;-05-4-23< / Name1>
< Name2> ________< / Name2>
< Name3> 2016-09-23< / Name3>
< Name4> ;-05-9-24< / Name4>
< Name5> 2016-10-31< / Name5>
< Name6> 2016-09-26< / Name6>
< Name7> ;/20169-27< / Name7>
< Name8> ;-05-9-28< / Name8>
< Name9> ;/20169-29< / Name9>
< Name10> 00469-30< / Name10>
< Name11> 2016-10-01< / Name11>
< Name12> 2016-10-02< / Name12>
< / DataRow>
< DataRow>
<日期> Date2< / Dates>
< Name1> ;-06-6-27< / Name1>
< Name2> 2016-08-8-14< / Name2>
< Name3> ;teenth7-08< / Name3>
< Name4> 08/22< / Name4>
< Name5> 2016-11-03< / Name5>
< Name6> 2016-07-28< / Name6>
< Name7> ;teen8-23< / Name7>
< Name8> 2016-11-01< / Name8>
< Name9> 2016-11-01< / Name9>
< Name10> ;teen8-11< / Name10>
< Name11> ;teenth8-18< / Name11>
< Name12> ;-05-9-23< / Name12>
< / DataRow>
...


I'm working on taking some input data in excel, parsing it to xml and using that to run a SQL stored procedure, but I'm running into performance issue on the xml parsing. The input sheet looks something like this:

Dates_|_Name1_Name2_Name3_..._NameX
Date1 |
Date2 |
. . . |
Date1Y|

I've got some code to loop though each cell and parse out the data into an xml string but even for about a 300 by 300 grid the execution takes something on the order of five minutes and I'm looking to use data sets that could be several thousand columns long. I've tries a couple things to help speed it up like reading the data into a Variant then iterating though that or excluding DoEvents but I haven't been able to get the speed up. Here's the bit of code that's the issue:

Dim lastRow As Long
lRows = (oWorkSheet.Cells(Rows.Count, 1).End(xlUp).Row)
Dim lastColumn As Long
lCols = (oWorkSheet.Cells(1, Columns.Count).End(xlToLeft).Column)
Dim sheet As Variant
With Sheets(sName)
  sheet = .Range(.Cells(1, 1), .Cells(lRows, lCols))
End With
ReDim nameCols(lCols) As String

...

resultxml = "<DataSet>"
For i = 2 To rows
    resultxml = resultxml & "<DateRow>"

    For j = 1 To cols
        If Trim(sheet(i, j)) <> "" Then
            lResult = "<" & nameCols(j) & ">"
            rResult = "</" & nameCols(j) & ">"
            tmpValue = Trim(sheet(i, j))
            If IsDate(tmpValue) And Not IsNumeric(tmpValue) Then
                If Len(tmpValue) >= 8 Then
                    tmpValue = Format(tmpValue, "yyyy-mm-dd")
                End If
            End If
            resultxml = resultxml & lResult & tmpValue & rResult
            DoEvents
        End If
    Next j
    resultxml = resultxml & "</DateRow>"
Next i

resultxml = resultxml & "</DataSet>"

Any advice for getting the run time down would be greatly appreciated.

解决方案

Consider using MSXML, a comprehensive W3C compliant library of XML APIs which you can use to build your XML with DOM methods (createElement, appendChild, setAttribute) instead of concatenating text strings. XML is not quite a text file but a markup file with encoding and tree structure. Excel comes equipped with the MSXML COM object by reference or late-binding, and can iteratively build a tree from Excel data as shown below.

With 300 rows by 12 cols of random dates, below didn't even take a minute (literally seconds after clicking macro) AND it even pretty printed raw output with line breaks and indentation using an embedded XSLT stylesheet (if you do not pretty print, the MSXML outputs document as one long, continuous line).

Input

VBA (of course align to actual data)

Sub xmlExport()
On Error GoTo ErrHandle
    ' VBA REFERENCE MSXML, v6.0 '
    Dim doc As New MSXML2.DOMDocument60, xslDoc As New MSXML2.DOMDocument60, newDoc As New MSXML2.DOMDocument60
    Dim root As IXMLDOMElement, dataNode As IXMLDOMElement, datesNode As IXMLDOMElement, namesNode As IXMLDOMElement
    Dim i As Long, j As Long
    Dim tmpValue As Variant

    ' DECLARE XML DOC OBJECT '
    Set root = doc.createElement("DataSet")
    doc.appendChild root

    ' ITERATE THROUGH ROWS '
    For i = 2 To Sheets(1).UsedRange.Rows.Count

        ' DATA ROW NODE '
        Set dataNode = doc.createElement("DataRow")
        root.appendChild dataNode

        ' DATES NODE '
        Set datesNode = doc.createElement("Dates")
        datesNode.Text = Sheets(1).Range("A" & i)
        dataNode.appendChild datesNode

        ' NAMES NODE '
        For j = 1 To 12
            tmpValue = Sheets(1).Cells(i, j + 1)
            If IsDate(tmpValue) And Not IsNumeric(tmpValue) Then
                Set namesNode = doc.createElement("Name" & j)
                namesNode.Text = Format(tmpValue, "yyyy-mm-dd")
                dataNode.appendChild namesNode
            End If
        Next j

    Next i

    ' PRETTY PRINT RAW OUTPUT '
    xslDoc.LoadXML "<?xml version=" & Chr(34) & "1.0" & Chr(34) & "?>" _
            & "<xsl:stylesheet version=" & Chr(34) & "1.0" & Chr(34) _
            & "                xmlns:xsl=" & Chr(34) & "http://www.w3.org/1999/XSL/Transform" & Chr(34) & ">" _
            & "<xsl:strip-space elements=" & Chr(34) & "*" & Chr(34) & " />" _
            & "<xsl:output method=" & Chr(34) & "xml" & Chr(34) & " indent=" & Chr(34) & "yes" & Chr(34) & "" _
            & "            encoding=" & Chr(34) & "UTF-8" & Chr(34) & "/>" _
            & " <xsl:template match=" & Chr(34) & "node() | @*" & Chr(34) & ">" _
            & "  <xsl:copy>" _
            & "   <xsl:apply-templates select=" & Chr(34) & "node() | @*" & Chr(34) & " />" _
            & "  </xsl:copy>" _
            & " </xsl:template>" _
            & "</xsl:stylesheet>"

    xslDoc.async = False
    doc.transformNodeToObject xslDoc, newDoc
    newDoc.Save ActiveWorkbook.Path & "\Output.xml"

    MsgBox "Successfully exported Excel data to XML!", vbInformation
    Exit Sub

ErrHandle:
    MsgBox Err.Number & " - " & Err.Description, vbCritical
    Exit Sub

End Sub

Output

<?xml version="1.0" encoding="UTF-8"?>
<DataSet>
    <DataRow>
        <Dates>Date1</Dates>
        <Name1>2016-04-23</Name1>
        <Name2>2016-09-22</Name2>
        <Name3>2016-09-23</Name3>
        <Name4>2016-09-24</Name4>
        <Name5>2016-10-31</Name5>
        <Name6>2016-09-26</Name6>
        <Name7>2016-09-27</Name7>
        <Name8>2016-09-28</Name8>
        <Name9>2016-09-29</Name9>
        <Name10>2016-09-30</Name10>
        <Name11>2016-10-01</Name11>
        <Name12>2016-10-02</Name12>
    </DataRow>
    <DataRow>
        <Dates>Date2</Dates>
        <Name1>2016-06-27</Name1>
        <Name2>2016-08-14</Name2>
        <Name3>2016-07-08</Name3>
        <Name4>2016-08-22</Name4>
        <Name5>2016-11-03</Name5>
        <Name6>2016-07-28</Name6>
        <Name7>2016-08-23</Name7>
        <Name8>2016-11-01</Name8>
        <Name9>2016-11-01</Name9>
        <Name10>2016-08-11</Name10>
        <Name11>2016-08-18</Name11>
        <Name12>2016-09-23</Name12>
    </DataRow>
    ...

这篇关于Excel vba xml解析性能的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆