Excel vba xml解析性能 [英] Excel vba xml parsing performance
问题描述
Dates_ | _Name1_Name2_Name3 _..._ NameX
Date1 |
Date2 |
。 。 。 |
Date1Y |
我有一些代码循环遍历每个单元格,并将数据解析成一个xml字符串,但即使是大约300×300的网格,执行时间也是五分钟左右,而我正在寻找可能需要数千列长的数据集。我已经尝试了一些事情来帮助加快速度,就像将数据读入Variant,然后重复执行或排除DoEvents,但是我无法获得加速。这是一些代码的问题:
Dim lastRow As Long
lRows =(oWorkSheet.Cells(Rows。 Count,1).End(xlUp).Row)
Dim lastColumn As Long
lCols =(oWorkSheet.Cells(1,Columns.Count).End(xlToLeft).Column)
Dim sheet as Variant
带表格(sName)
sheet = .Range(.Cells(1,1),.Cells(lRows,lCols))
结束
ReDim nameCols lCols)As String
...
resultxml =< DataSet>
对于i = 2到行
resultxml = resultxml& < DateRow> 中
对于j = 1到cols
如果Trim(sheet(i,j))<> 然后
lResult =< &安培; nameCols(j)& > 中
rResult =< /& nameCols(j)& > 中
tmpValue = Trim(sheet(i,j))
如果IsDate(tmpValue)而不是IsNumeric(tmpValue)然后
如果Len(tmpValue)> = 8则
tmpValue =格式(tmpValue,yyyy-mm-dd)
如果
结束If If
resultxml = resultxml& lResult& tmpValue& rResult
DoEvents
End If
Next j
resultxml = resultxml& < / DateRow> 中
Next i
resultxml = resultxml& < /数据集> 中
对于获得运行时间的建议将不胜感激。
考虑使用
VBA < (当然符合实际数据)
Sub xmlExport()
错误GoTo ErrHandle
'VBA参考MSXML,v6.0'
Dim doc As New MSXML2.DOMDocument60,xslDoc As New MSXML2.DOMDocument60,newDoc As New MSXML2.DOMDocument60
Dim root为IXMLDOMElement,dataNode为IXMLDOMElement,datesNode为IXMLDOMElement,namesNode为IXML DOMElement
Dim i As Long,j As Long
Dim tmpValue As Variant
'DECLARE XML DOC OBJECT'
设置root = doc.createElement(DataSet)
doc.appendChild root
'ITERATE THROUGH ROWS'
For i = 2 To Sheets(1).UsedRange.Rows.Count
'DATA ROW NODE'
设置dataNode = doc.createElement(DataRow)
root.appendChild dataNode
'DATES NODE'
设置datesNode = doc.createElement(日期)
datesNode.Text = Sheets(1).Range(A& i)
dataNode.appendChild datesNode
'NAMES NODE'
对于j = 1至12
tmpValue =表(1).Cells(i,j + 1 )
如果IsDate(tmpValue)而不是IsNumeric(tmpValue)然后
设置namesNode = doc.createElement(Name& j)
namesNode.Text = Format(tmpValue,yyyy- mm-dd)
dataNode.appendChild namesNode
结束如果
下一个j
下一个i
'PRETTY PRINT RAW OUTPUT'
xslDoc.LoadXML<?xml version =& Chr(34)& 1.0& Chr(34)& >? _
& < xsl:stylesheet version =& Chr(34)& 1.0& Chr(34)_
& xmlns:xsl =& Chr(34)& http://www.w3.org/1999/XSL/Transform& Chr(34)& > 中_
& < xsl:strip-space elements =& Chr(34)& *& Chr(34)& /> _
& < xsl:output method =& Chr(34)& xml& Chr(34)& indent =& Chr(34)& 是& Chr(34)& _
& encoding =& Chr(34)& UTF-8& Chr(34)& /> 中_
& < xsl:template match =& Chr(34)& node()| @ *& Chr(34)& > 中_
& < xsl:copy> _
& < xsl:apply-templates select =& Chr(34)& node()| @ *& Chr(34)& /> _
& < / xsl:copy> _
& < / xsl:template> _
& < / XSL:样式表> 中
xslDoc.async = False
doc.transformNodeToObject xslDoc,newDoc
newDoc.Save ActiveWorkbook.Path& \Output.xml
MsgBox成功导出Excel数据到XML!,vbInformation
退出子
ErrHandle:
MsgBox Err。数字& - & Err.Description,vbCritical
退出Sub
End Sub
输出
<?xml version =1.0encoding =UTF-8?> ;
< DataSet>
< DataRow>
< Dates> Date1< / Dates>
< Name1> ;-05-4-23< / Name1>
< Name2> ________< / Name2>
< Name3> 2016-09-23< / Name3>
< Name4> ;-05-9-24< / Name4>
< Name5> 2016-10-31< / Name5>
< Name6> 2016-09-26< / Name6>
< Name7> ;/20169-27< / Name7>
< Name8> ;-05-9-28< / Name8>
< Name9> ;/20169-29< / Name9>
< Name10> 00469-30< / Name10>
< Name11> 2016-10-01< / Name11>
< Name12> 2016-10-02< / Name12>
< / DataRow>
< DataRow>
<日期> Date2< / Dates>
< Name1> ;-06-6-27< / Name1>
< Name2> 2016-08-8-14< / Name2>
< Name3> ;teenth7-08< / Name3>
< Name4> 08/22< / Name4>
< Name5> 2016-11-03< / Name5>
< Name6> 2016-07-28< / Name6>
< Name7> ;teen8-23< / Name7>
< Name8> 2016-11-01< / Name8>
< Name9> 2016-11-01< / Name9>
< Name10> ;teen8-11< / Name10>
< Name11> ;teenth8-18< / Name11>
< Name12> ;-05-9-23< / Name12>
< / DataRow>
...
I'm working on taking some input data in excel, parsing it to xml and using that to run a SQL stored procedure, but I'm running into performance issue on the xml parsing. The input sheet looks something like this:
Dates_|_Name1_Name2_Name3_..._NameX
Date1 |
Date2 |
. . . |
Date1Y|
I've got some code to loop though each cell and parse out the data into an xml string but even for about a 300 by 300 grid the execution takes something on the order of five minutes and I'm looking to use data sets that could be several thousand columns long. I've tries a couple things to help speed it up like reading the data into a Variant then iterating though that or excluding DoEvents but I haven't been able to get the speed up. Here's the bit of code that's the issue:
Dim lastRow As Long
lRows = (oWorkSheet.Cells(Rows.Count, 1).End(xlUp).Row)
Dim lastColumn As Long
lCols = (oWorkSheet.Cells(1, Columns.Count).End(xlToLeft).Column)
Dim sheet As Variant
With Sheets(sName)
sheet = .Range(.Cells(1, 1), .Cells(lRows, lCols))
End With
ReDim nameCols(lCols) As String
...
resultxml = "<DataSet>"
For i = 2 To rows
resultxml = resultxml & "<DateRow>"
For j = 1 To cols
If Trim(sheet(i, j)) <> "" Then
lResult = "<" & nameCols(j) & ">"
rResult = "</" & nameCols(j) & ">"
tmpValue = Trim(sheet(i, j))
If IsDate(tmpValue) And Not IsNumeric(tmpValue) Then
If Len(tmpValue) >= 8 Then
tmpValue = Format(tmpValue, "yyyy-mm-dd")
End If
End If
resultxml = resultxml & lResult & tmpValue & rResult
DoEvents
End If
Next j
resultxml = resultxml & "</DateRow>"
Next i
resultxml = resultxml & "</DataSet>"
Any advice for getting the run time down would be greatly appreciated.
Consider using MSXML, a comprehensive W3C compliant library of XML APIs which you can use to build your XML with DOM methods (createElement
, appendChild
, setAttribute
) instead of concatenating text strings. XML is not quite a text file but a markup file with encoding and tree structure. Excel comes equipped with the MSXML COM object by reference or late-binding, and can iteratively build a tree from Excel data as shown below.
With 300 rows by 12 cols of random dates, below didn't even take a minute (literally seconds after clicking macro) AND it even pretty printed raw output with line breaks and indentation using an embedded XSLT stylesheet (if you do not pretty print, the MSXML outputs document as one long, continuous line).
Input
VBA (of course align to actual data)
Sub xmlExport()
On Error GoTo ErrHandle
' VBA REFERENCE MSXML, v6.0 '
Dim doc As New MSXML2.DOMDocument60, xslDoc As New MSXML2.DOMDocument60, newDoc As New MSXML2.DOMDocument60
Dim root As IXMLDOMElement, dataNode As IXMLDOMElement, datesNode As IXMLDOMElement, namesNode As IXMLDOMElement
Dim i As Long, j As Long
Dim tmpValue As Variant
' DECLARE XML DOC OBJECT '
Set root = doc.createElement("DataSet")
doc.appendChild root
' ITERATE THROUGH ROWS '
For i = 2 To Sheets(1).UsedRange.Rows.Count
' DATA ROW NODE '
Set dataNode = doc.createElement("DataRow")
root.appendChild dataNode
' DATES NODE '
Set datesNode = doc.createElement("Dates")
datesNode.Text = Sheets(1).Range("A" & i)
dataNode.appendChild datesNode
' NAMES NODE '
For j = 1 To 12
tmpValue = Sheets(1).Cells(i, j + 1)
If IsDate(tmpValue) And Not IsNumeric(tmpValue) Then
Set namesNode = doc.createElement("Name" & j)
namesNode.Text = Format(tmpValue, "yyyy-mm-dd")
dataNode.appendChild namesNode
End If
Next j
Next i
' PRETTY PRINT RAW OUTPUT '
xslDoc.LoadXML "<?xml version=" & Chr(34) & "1.0" & Chr(34) & "?>" _
& "<xsl:stylesheet version=" & Chr(34) & "1.0" & Chr(34) _
& " xmlns:xsl=" & Chr(34) & "http://www.w3.org/1999/XSL/Transform" & Chr(34) & ">" _
& "<xsl:strip-space elements=" & Chr(34) & "*" & Chr(34) & " />" _
& "<xsl:output method=" & Chr(34) & "xml" & Chr(34) & " indent=" & Chr(34) & "yes" & Chr(34) & "" _
& " encoding=" & Chr(34) & "UTF-8" & Chr(34) & "/>" _
& " <xsl:template match=" & Chr(34) & "node() | @*" & Chr(34) & ">" _
& " <xsl:copy>" _
& " <xsl:apply-templates select=" & Chr(34) & "node() | @*" & Chr(34) & " />" _
& " </xsl:copy>" _
& " </xsl:template>" _
& "</xsl:stylesheet>"
xslDoc.async = False
doc.transformNodeToObject xslDoc, newDoc
newDoc.Save ActiveWorkbook.Path & "\Output.xml"
MsgBox "Successfully exported Excel data to XML!", vbInformation
Exit Sub
ErrHandle:
MsgBox Err.Number & " - " & Err.Description, vbCritical
Exit Sub
End Sub
Output
<?xml version="1.0" encoding="UTF-8"?>
<DataSet>
<DataRow>
<Dates>Date1</Dates>
<Name1>2016-04-23</Name1>
<Name2>2016-09-22</Name2>
<Name3>2016-09-23</Name3>
<Name4>2016-09-24</Name4>
<Name5>2016-10-31</Name5>
<Name6>2016-09-26</Name6>
<Name7>2016-09-27</Name7>
<Name8>2016-09-28</Name8>
<Name9>2016-09-29</Name9>
<Name10>2016-09-30</Name10>
<Name11>2016-10-01</Name11>
<Name12>2016-10-02</Name12>
</DataRow>
<DataRow>
<Dates>Date2</Dates>
<Name1>2016-06-27</Name1>
<Name2>2016-08-14</Name2>
<Name3>2016-07-08</Name3>
<Name4>2016-08-22</Name4>
<Name5>2016-11-03</Name5>
<Name6>2016-07-28</Name6>
<Name7>2016-08-23</Name7>
<Name8>2016-11-01</Name8>
<Name9>2016-11-01</Name9>
<Name10>2016-08-11</Name10>
<Name11>2016-08-18</Name11>
<Name12>2016-09-23</Name12>
</DataRow>
...
这篇关于Excel vba xml解析性能的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!