VB.NET For Loop too slow when grabbing XML


Problem description

I am reading values from an XML file (a resources file) and inserting them into a DataTable. There are 679 keys to fetch from the resource file, and the loop takes 3.41 seconds. Is there any way to make this loop faster?

I have tried a Parallel.For loop, but it was unstable: it starts inserting a row before the previous insert has finished. Wrapping the insert in a SyncLock block fixed that, but the run time went back to 3.41 seconds.

For idx As Integer = 0 To KeyNames.Length - 1
    With KeyManagerResource.Instance
        DataTableManager.Instance.InsertRow(KeyNames(idx), .GetKeyValue(KeyNames(idx), DynamicProperties.Instance.EnglishResourcePath), _
                                            .GetKeyValue(KeyNames(idx), DynamicProperties.Instance.FrenchResourcePath))
    End With
Next

''' <summary>
''' Gets the value of the key.
''' </summary>
''' <param name="ID">ID of the key.</param>
''' <param name="File">Path of the resource file to read.</param>
''' <returns>Value of the key, or an empty string if the key is not found.</returns>
''' <remarks></remarks>
Overrides Function GetKeyValue(ID As String, File As String) As String

    'Points the XML reader at the requested resource file (English or French).
    XMLManager.Instance.SetReaderPath(File)

    Dim returnedNode As Xml.XmlNode = XMLManager.Instance.GetNode(String.Format("//data" & Helper.CaseInsensitiveSearch("name"), "'" & ID.ToLower & "'"))

    If returnedNode IsNot Nothing Then
        Return returnedNode.ChildNodes(1).InnerText
    Else
        Return ""
    End If

End Function

''' <summary>
''' Adds a row to the target table.
''' </summary>
''' <param name="RowValues">The row values to insert, in column order: the first value in the array goes into the first column
''' of the target data table.</param>
''' <remarks></remarks>
Public Sub InsertRow(ByVal ParamArray RowValues() As String)

    'The number of row values must match the number of columns; otherwise the insert is invalid and an exception is thrown.
    If RowValues.Length = dtTargetTable.Columns.Count Then

        'Creates a new row.
        Dim drNewRow As DataRow
        drNewRow = dtTargetTable.NewRow

        'Goes through the row values.
        For idx As Integer = 0 To RowValues.Length - 1

            'Store the value for the column.
            drNewRow(dtTargetTable.Columns(idx)) = RowValues(idx)

        Next

        'Only adds the key if the primary key doesn't already exist.
        If dtTargetTable.Rows.Find(RowValues(0)) Is Nothing Then
            'Adds the row to the table.
            dtTargetTable.Rows.InsertAt(drNewRow, 0)
        End If

    Else
        Throw New Exception(String.Format("Invalid insert. The number of row values passed is not equal to the number of columns of the target dataTable. " & _
                                          "The number of columns of the target dataTable is {0}.", dtTargetTable.Columns.Count))
    End If

End Sub

Solution

I have several suggestions that should help:

Don't retrieve the key name by index several times per iteration; a For Each loop avoids the repeated lookups.

Don't swap XML files on every iteration; instead, initialize one instance per file outside the loop and pass it to the appropriate method (I'm not sure what the instance type is, so I made one up called XMLManagerInstance).

Don't check the datatable for the primary key on every iteration. Instead, keep a set of the primary keys already used in the outer loop, and skip the work entirely if the key is already in it.

These should help improve your performance quite a bit, especially the last two.
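
As a side note on the third suggestion, `HashSet(Of String).Add` returns `False` when the value is already in the set, so the membership test and the insert can be a single call. The sketch below is only an illustration with made-up key names; `FirstOccurrences` is a hypothetical helper, not part of the original code.

```vbnet
Imports System
Imports System.Collections.Generic

Module DuplicateKeyFilter

    ' Hypothetical helper: returns each key once, in first-seen order.
    Function FirstOccurrences(keys As IEnumerable(Of String)) As List(Of String)
        Dim seen As New HashSet(Of String)()
        Dim result As New List(Of String)()

        For Each key As String In keys
            ' Add returns False if the key is already in the set,
            ' so no separate Contains call is needed.
            If seen.Add(key) Then
                result.Add(key)
            End If
        Next

        Return result
    End Function

    Sub Main()
        ' Made-up key names for illustration only.
        Dim keys As String() = {"Title", "Save", "Title", "Cancel", "Save"}
        Console.WriteLine(String.Join(", ", FirstOccurrences(keys)))
    End Sub

End Module
```

If keys that differ only by case should count as duplicates (the XPath lookup in the question lower-cases the ID), passing `StringComparer.OrdinalIgnoreCase` to the `HashSet` constructor would be one way to get that behaviour.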

Here is a suggested reworking of the code:

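    'Assumed to be populated elsewhere with the key names to load.
    Dim KeyNames As List(Of String)

    Dim cPrimaryKeys As New System.Collections.Generic.HashSet(Of String)
    'XMLManagerInstance is a made-up type; New assumes it has a parameterless constructor.
    Dim oEnglishFile As New XMLManagerInstance
    Dim oFrenchFile As New XMLManagerInstance

    'Point each instance at its file once, before the loop.
    oEnglishFile.SetReaderPath(DynamicProperties.Instance.EnglishResourcePath)
    oFrenchFile.SetReaderPath(DynamicProperties.Instance.FrenchResourcePath)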

    For Each KeyName As String In KeyNames
        If Not cPrimaryKeys.Contains(KeyName) Then
            cPrimaryKeys.Add(KeyName)
            With KeyManagerResource.Instance
                DataTableManager.Instance.InsertRow(KeyName, .GetKeyValue(KeyName, oEnglishFile), .GetKeyValue(KeyName, oFrenchFile))
            End With
        End If
    Next

''' <summary>
''' Gets the value of the key.
''' </summary>
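''' <param name="ID">ID of the key.</param>
''' <param name="FileInstance">Reader instance already pointed at the resource file to search.</param>
''' <returns>Value of the key, or an empty string if the key is not found.</returns>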
''' <remarks></remarks>
Public Function GetKeyValue(ID As String, FileInstance As XMLManagerInstance) As String

    Dim returnedNode As Xml.XmlNode = FileInstance.GetNode(String.Format("//data" & Helper.CaseInsensitiveSearch("name"), "'" & ID.ToLower & "'"))

    If returnedNode IsNot Nothing Then
        Return returnedNode.ChildNodes(1).InnerText
    Else
        Return ""
    End If

End Function

''' <summary>
''' Adds a row to the target table.
''' </summary>
''' <param name="RowValues">The row values we want to insert. These are in order, so it is presumed the first row value in the array is for the first column 
''' of the target data table.</param>
''' <remarks></remarks>
Public Sub InsertRow(ByVal ParamArray RowValues() As String)

    'The number of row values must match the number of columns; otherwise the insert is invalid and an exception is thrown.
    If RowValues.Length = dtTargetTable.Columns.Count Then
        'Creates a new row.
        Dim drNewRow As DataRow

        drNewRow = dtTargetTable.NewRow

        'Goes through the row values.
        For idx As Integer = 0 To RowValues.Length - 1
            'Store the value for the column.
            drNewRow(dtTargetTable.Columns(idx)) = RowValues(idx)
        Next

        'Adds the row to the table. No primary key lookup here: the calling loop has already skipped duplicate keys.
        dtTargetTable.Rows.InsertAt(drNewRow, 0)
    Else
        Throw New Exception(String.Format("Invalid insert. The number of row values passed is not equal to the number of columns of the target dataTable. " & _
                                          "The number of columns of the target dataTable is {0}.", dtTargetTable.Columns.Count))
    End If

End Sub
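
A further step, which goes beyond the answer above, would be to parse each resource file once and index its entries in a dictionary, so that each key costs a hash lookup rather than an XPath search over the whole document. The sketch below is only an illustration under stated assumptions: it assumes the .resx-style layout `<data name="..."><value>...</value></data>`, uses a made-up inline sample in place of the real files, and `BuildIndex` is a hypothetical helper that is not part of the original code.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Xml

Module ResourceIndex

    ' Hypothetical helper: parses the resource XML once and indexes
    ' every <data name="..."> element by its name attribute.
    Function BuildIndex(xml As String) As Dictionary(Of String, String)
        Dim doc As New XmlDocument()
        doc.LoadXml(xml)

        ' Case-insensitive keys, mirroring the case-insensitive search in the question.
        Dim index As New Dictionary(Of String, String)(StringComparer.OrdinalIgnoreCase)

        For Each node As XmlNode In doc.SelectNodes("//data")
            Dim nameAttr As XmlAttribute = node.Attributes("name")
            Dim valueNode As XmlNode = node.SelectSingleNode("value")
            If nameAttr IsNot Nothing AndAlso valueNode IsNot Nothing Then
                ' If a name is repeated, the last entry wins.
                index(nameAttr.Value) = valueNode.InnerText
            End If
        Next

        Return index
    End Function

    Sub Main()
        ' Made-up sample data; the element layout is an assumption about the file format.
        Dim english As String =
            "<root><data name=""Save""><value>Save</value></data>" &
            "<data name=""Cancel""><value>Cancel</value></data></root>"

        Dim index As Dictionary(Of String, String) = BuildIndex(english)

        ' Fall back to an empty string for a missing key, as GetKeyValue does.
        Dim saveLabel As String = ""
        If Not index.TryGetValue("save", saveLabel) Then saveLabel = ""
        Console.WriteLine(saveLabel)
    End Sub

End Module
```

For real files, `XmlDocument.Load` with the file path would replace `LoadXml`. With one index per language built before the loop, each key then needs two dictionary lookups instead of two document searches.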
