Reading a large XML file using XmlReader in VB.NET


Problem description

I have a 35 GB XML file. I tried to load it with XmlDocument and got an out-of-memory exception, so I am now using XmlReader to parse the XML data and load it into a database. However, I am not able to read the child nodes within a parent node.

Example XML file content:

File name: wcproduction.xml

<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<xsd:schema targetNamespace="urn:schemas-microsoft-com:sql:SqlRowSet1" xmlns:schema="urn:schemas-microsoft-com:sql:SqlRowSet1" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:sqltypes="http://schemas.microsoft.com/sqlserver/2004/sqltypes" elementFormDefault="qualified">
    <xsd:import namespace="http://schemas.microsoft.com/sqlserver/2004/sqltypes" schemaLocation="http://schemas.microsoft.com/sqlserver/2004/sqltypes/sqltypes.xsd"/>
    <xsd:element name="wcproduction">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element name="api_st_cde" type="sqltypes:smallint" nillable="1"/>
                <xsd:element name="api_cnty_cde" type="sqltypes:smallint" nillable="1"/>
                <xsd:element name="api_well_idn" type="sqltypes:int" nillable="1"/>
                <xsd:element name="pool_idn" type="sqltypes:int" nillable="1"/>
                <xsd:element name="prodn_mth" type="sqltypes:smallint" nillable="1"/>
                <xsd:element name="prodn_yr" type="sqltypes:int" nillable="1"/>
                <xsd:element name="ogrid_cde" type="sqltypes:int" nillable="1"/>
                <xsd:element name="prd_knd_cde" nillable="1">
                    <xsd:simpleType>
                        <xsd:restriction base="sqltypes:char" sqltypes:localeId="1033" sqltypes:sqlCompareOptions="IgnoreCase IgnoreKanaType IgnoreWidth" sqltypes:sqlSortId="52">
                            <xsd:maxLength value="2"/>
                        </xsd:restriction>
                    </xsd:simpleType>
                </xsd:element>
                <xsd:element name="eff_dte" type="sqltypes:datetime" nillable="1"/>
                <xsd:element name="amend_ind" nillable="1">
                    <xsd:simpleType>
                        <xsd:restriction base="sqltypes:char" sqltypes:localeId="1033" sqltypes:sqlCompareOptions="IgnoreCase IgnoreKanaType IgnoreWidth" sqltypes:sqlSortId="52">
                            <xsd:maxLength value="1"/>
                        </xsd:restriction>
                    </xsd:simpleType>
                </xsd:element>
                <xsd:element name="c115_wc_stat_cde" nillable="1">
                    <xsd:simpleType>
                        <xsd:restriction base="sqltypes:char" sqltypes:localeId="1033" sqltypes:sqlCompareOptions="IgnoreCase IgnoreKanaType IgnoreWidth" sqltypes:sqlSortId="52">
                            <xsd:maxLength value="1"/>
                        </xsd:restriction>
                    </xsd:simpleType>
                </xsd:element>
                <xsd:element name="prod_amt" type="sqltypes:int" nillable="1"/>
                <xsd:element name="prodn_day_num" type="sqltypes:smallint" nillable="1"/>
                <xsd:element name="mod_dte" type="sqltypes:datetime" nillable="1"/>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
</xsd:schema>
<wcproduction xmlns="urn:schemas-microsoft-com:sql:SqlRowSet1">
    <api_st_cde>30</api_st_cde>
    <api_cnty_cde>5</api_cnty_cde>
    <api_well_idn>20178</api_well_idn>
    <pool_idn>10540</pool_idn>
    <prodn_mth>7</prodn_mth>
    <prodn_yr>1973</prodn_yr>
    <ogrid_cde>12437</ogrid_cde>
    <prd_knd_cde>G </prd_knd_cde>
    <eff_dte>1973-07-31T00:00:00</eff_dte>
    <amend_ind>N</amend_ind>
    <c115_wc_stat_cde>F</c115_wc_stat_cde>
    <prod_amt>53612</prod_amt>
    <prodn_day_num>99</prodn_day_num>
    <mod_dte>2015-04-07T07:31:00.173</mod_dte>
</wcproduction>
</root>

VB.NET code that tries to read the parent node wcproduction and its child nodes (api_st_cde, api_cnty_cde, ...):

Dim settings As XmlReaderSettings = New XmlReaderSettings()
        settings.IgnoreWhitespace = True

        Using reader As XmlReader = XmlReader.Create("D:\\wcproduction.xml", settings)

            reader.ReadToFollowing("wcproduction")
            Do

                Dim inner As XmlReader = reader.ReadSubtree()
                Dim str As String = ""

                inner.ReadToDescendant("api_st_cde")
                str = inner.ReadInnerXml
                inner.ReadToDescendant("api_cnty_cde")
                str = str & ", " & inner.ReadInnerXml
                inner.ReadToDescendant("api_well_idn")
                str = str & ", " & inner.ReadInnerXml
                inner.ReadToDescendant("pool_idn")
                str = str & ", " & inner.ReadInnerXml
                inner.ReadToDescendant("prodn_mth")
                str = str & ", " & inner.ReadInnerXml
                inner.ReadToDescendant("prodn_yr")
                str = str & ", " & inner.ReadInnerXml
                inner.ReadToDescendant("ogrid_cde")
                str = str & ", " & inner.ReadInnerXml
                inner.ReadToDescendant("prd_knd_cde")
                str = str & ", " & inner.ReadInnerXml
                inner.ReadToDescendant("eff_dte")
                str = str & ", " & inner.ReadInnerXml
                inner.ReadToDescendant("amend_ind")
                str = str & ", " & inner.ReadInnerXml
                inner.ReadToDescendant("c115_wc_stat_cde")
                str = str & ", " & inner.ReadInnerXml
                inner.ReadToDescendant("prod_amt")
                str = str & ", " & inner.ReadInnerXml
                inner.ReadToDescendant("prodn_day_num")
                str = str & ", " & inner.ReadInnerXml
                inner.ReadToDescendant("mod_dte")
                str = str & ", " & inner.ReadInnerXml
                MsgBox(str)
                inner.Close()
            Loop While (reader.ReadToNextSibling("wcproduction"))

        End Using

I want to read every wcproduction node together with its child nodes and upload them to SQL Server.

Recommended answer

Try the following code, which combines XmlReader with LINQ to XML:

Imports System.Collections.Generic
Imports System.Xml
Imports System.Xml.Linq
Module Module1
    Const FILENAME As String = "c:\temp\test.xml"
    Sub Main()
        Dim wcProductions As New List(Of WCProduction)
        ' Using disposes the reader (and the underlying file handle) when done
        Using reader As XmlReader = XmlReader.Create(FILENAME)
            While (Not reader.EOF)
                ' Skip ahead unless the reader already sits on a <wcproduction> element
                If reader.Name <> "wcproduction" Then
                    reader.ReadToFollowing("wcproduction")
                End If
                If Not reader.EOF Then
                    ' ReadFrom loads only this one element and leaves the reader
                    ' on the node that follows its end tag
                    Dim xWcproduction As XElement = DirectCast(XNode.ReadFrom(reader), XElement)
                    ' The child elements inherit the default namespace of <wcproduction>
                    Dim ns As XNamespace = xWcproduction.GetDefaultNamespace()
                    Dim wcproduction As New WCProduction
                    wcProductions.Add(wcproduction)

                    wcproduction.api_st_cde = CType(xWcproduction.Element(ns + "api_st_cde"), Integer)
                    wcproduction.api_cnty_cde = CType(xWcproduction.Element(ns + "api_cnty_cde"), Integer)
                    wcproduction.api_well_idn = CType(xWcproduction.Element(ns + "api_well_idn"), Integer)
                    wcproduction.pool_idn = CType(xWcproduction.Element(ns + "pool_idn"), Integer)
                    wcproduction.prodn_mth = CType(xWcproduction.Element(ns + "prodn_mth"), Integer)
                    wcproduction.prodn_yr = CType(xWcproduction.Element(ns + "prodn_yr"), Integer)
                    wcproduction.ogrid_cde = CType(xWcproduction.Element(ns + "ogrid_cde"), Integer)
                    wcproduction.prd_knd_cde = CType(xWcproduction.Element(ns + "prd_knd_cde"), String)
                    wcproduction.eff_dte = CType(xWcproduction.Element(ns + "eff_dte"), DateTime)
                    wcproduction.amend_ind = CType(xWcproduction.Element(ns + "amend_ind"), String)
                    wcproduction.c115_wc_stat_cde = CType(xWcproduction.Element(ns + "c115_wc_stat_cde"), String)
                    wcproduction.prod_amt = CType(xWcproduction.Element(ns + "prod_amt"), Integer)
                    wcproduction.prodn_day_num = CType(xWcproduction.Element(ns + "prodn_day_num"), Integer)
                    wcproduction.mod_dte = CType(xWcproduction.Element(ns + "mod_dte"), DateTime)

                End If

            End While
        End Using
    End Sub

End Module
Public Class WCProduction
    Public api_st_cde As Integer
    Public api_cnty_cde As Integer
    Public api_well_idn As Integer
    Public pool_idn As Integer
    Public prodn_mth As Integer
    Public prodn_yr As Integer
    Public ogrid_cde As Integer
    Public prd_knd_cde As String
    Public eff_dte As DateTime
    Public amend_ind As String
    Public c115_wc_stat_cde As String
    Public prod_amt As Integer
    Public prodn_day_num As Integer
    Public mod_dte As DateTime
End Class
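
Note that the loop above keeps every parsed record in a List, which may itself run out of memory on a 35 GB file. One possible variation, shown here only as a sketch, is to yield each element lazily so that just the current record is held in memory while it is written to the database. StreamElements and StreamingSketch are hypothetical names, not part of the original answer; the sketch assumes a compiler that supports Iterator functions (Visual Basic 11 or later) and uses a small in-memory XML string in place of the real file.

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Xml
Imports System.Xml.Linq

Module StreamingSketch
    ' Hypothetical helper: lazily yields each element whose local name matches,
    ' so only one record at a time is materialized as an XElement.
    Iterator Function StreamElements(reader As XmlReader, localName As String) As IEnumerable(Of XElement)
        reader.MoveToContent()
        While Not reader.EOF
            If reader.NodeType = XmlNodeType.Element AndAlso reader.LocalName = localName Then
                ' ReadFrom consumes the whole element and leaves the reader
                ' on the node after its end tag, so no extra Read() is needed here
                Yield DirectCast(XNode.ReadFrom(reader), XElement)
            Else
                reader.Read()
            End If
        End While
    End Function

    Sub Main()
        ' Small stand-in for the real file (assumption: same element names and namespace)
        Dim xml As String =
            "<root>" &
            "<wcproduction xmlns=""urn:schemas-microsoft-com:sql:SqlRowSet1"">" &
            "<api_st_cde>30</api_st_cde><prod_amt>53612</prod_amt></wcproduction>" &
            "<wcproduction xmlns=""urn:schemas-microsoft-com:sql:SqlRowSet1"">" &
            "<api_st_cde>30</api_st_cde><prod_amt>100</prod_amt></wcproduction>" &
            "</root>"

        Using reader As XmlReader = XmlReader.Create(New StringReader(xml))
            For Each record As XElement In StreamElements(reader, "wcproduction")
                Dim ns As XNamespace = record.GetDefaultNamespace()
                Dim stateCode As Integer = CType(record.Element(ns + "api_st_cde"), Integer)
                Dim amount As Integer = CType(record.Element(ns + "prod_amt"), Integer)
                ' In a real import, the row would be sent to the database here
                ' (for example in batches) instead of being printed
                Console.WriteLine(stateCode & ", " & amount)
            Next
        End Using
    End Sub
End Module
```

For the real file, the StringReader would be replaced by the file path passed to XmlReader.Create. Because the iterator reads from the open reader on demand, the For Each loop has to finish before the Using block ends.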
