What is the best way to parse (big) XML in C# code?


Question

I'm writing a GIS client tool in C# to retrieve "features" in a GML-based XML schema (sample below) from a server. Extracts are limited to 100,000 features.

I guesstimate that the largest extract.xml might get up to around 150 megabytes, so obviously DOM parsers are out. I've been trying to decide between XmlSerializer (http://msdn.microsoft.com/en-us/library/system.xml.serialization.xmlserializer.aspx) with XSD.EXE-generated bindings --OR-- XmlReader and a hand-crafted object graph.

Or maybe there's a better way which I haven't considered yet? Like XLINQ, or ????

Can anybody please guide me? Especially with regard to the memory efficiency of any given approach. If not, I'll have to "prototype" both solutions and profile them side by side.

I'm a bit of a raw prawn in .NET. Any guidance would be greatly appreciated.

Thanking you. Keith.


Sample XML - up to 100,000 of these features, with up to 234,600 coords per feature.

<feature featId="27168306" fType="vegetation" fTypeId="1129" fClass="vegetation" gType="Polygon" ID="0" cLockNr="51598" metadataId="51599" mdFileId="NRM/TIS/VEGETATION/9543_22_v3" dataScale="25000">
  <MultiGeometry>
    <geometryMember>
      <Polygon>
        <outerBoundaryIs>
          <LinearRing>
            <coordinates>153.505004,-27.42196 153.505044,-27.422015 153.503992 .... 172 coordinates omitted to save space ... 153.505004,-27.42196</coordinates>
          </LinearRing>
        </outerBoundaryIs>
      </Polygon>
    </geometryMember>
  </MultiGeometry>
</feature>

Solution

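Use XmlReader to parse large XML documents. XmlReader provides fast, forward-only, non-cached access to XML data. (Forward-only means you can read the XML file from beginning to end but cannot move backwards in the file.) XmlReader uses little memory because it holds only the current node rather than the whole document, and it fills much the same role as a simple SAX reader, although it is a pull parser (your code asks for the next node) rather than a push parser.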

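    // Requires: using System.Xml;
    using (XmlReader myReader = XmlReader.Create(@"c:\data\coords.xml"))
    {
        while (myReader.Read())
        {
            // Process each node here; myReader.NodeType, myReader.Name and
            // myReader.Value describe the node the reader is positioned on.
            // ...
        }
    }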

You can use XmlReader to process files that are up to 2 gigabytes (GB) in size.
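
To make the loop above more concrete for the sample in the question, here is a minimal sketch of one way to build a small hand-crafted object per feature while streaming. The `Feature` class, the `FeatureReader.ReadFeatures` helper, and the assumption that the element names carry no namespace prefix (as in the sample) are all mine for illustration; they are not part of any library or of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Xml;

// Hypothetical container for one <feature>; the names are illustrative only.
public sealed class Feature
{
    public string FeatId;
    public List<(double X, double Y)> Points = new List<(double X, double Y)>();
}

public static class FeatureReader
{
    // Lazily yields one Feature at a time, so only the current feature is in memory.
    public static IEnumerable<Feature> ReadFeatures(XmlReader reader)
    {
        Feature current = null;
        while (!reader.EOF)
        {
            if (reader.NodeType == XmlNodeType.Element && reader.Name == "feature")
            {
                current = new Feature { FeatId = reader.GetAttribute("featId") };
                if (reader.IsEmptyElement)
                {
                    // <feature ... /> has no end tag, so emit it right away.
                    yield return current;
                    current = null;
                }
                reader.Read();
            }
            else if (current != null && reader.NodeType == XmlNodeType.Element
                     && reader.Name == "coordinates")
            {
                // ReadElementContentAsString leaves the reader on the node AFTER
                // </coordinates>, so no extra Read() here (it would skip a node).
                ParseInto(reader.ReadElementContentAsString(), current.Points);
            }
            else if (reader.NodeType == XmlNodeType.EndElement && reader.Name == "feature")
            {
                yield return current;
                current = null;
                reader.Read();
            }
            else
            {
                reader.Read();
            }
        }
    }

    // Assumes "x,y" pairs separated by whitespace, as in the sample XML.
    private static void ParseInto(string text, List<(double X, double Y)> points)
    {
        foreach (string pair in text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries))
        {
            string[] xy = pair.Split(',');
            points.Add((double.Parse(xy[0], CultureInfo.InvariantCulture),
                        double.Parse(xy[1], CultureInfo.InvariantCulture)));
        }
    }
}
```

A caller would wrap this in `using (XmlReader reader = XmlReader.Create(path))` and `foreach` over `FeatureReader.ReadFeatures(reader)`, handling each feature before the next one is read. Note that this sketch still loads one whole `<coordinates>` string at a time; if a single ring is too large for that, `XmlReader.ReadValueChunk` can read the text in pieces instead.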

Ref: How to read XML from a file by using Visual C#
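
On the XLINQ part of the question: one commonly used middle ground is to let XmlReader do the streaming and materialise only one `<feature>` at a time as an `XElement` via `XNode.ReadFrom`. The sketch below assumes un-namespaced element names as in the sample; the `FeatureElements.Stream` helper is a hypothetical name of mine, not a framework API.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

public static class FeatureElements
{
    // Yields each matching element as a standalone XElement, one at a time.
    public static IEnumerable<XElement> Stream(XmlReader reader, string elementName)
    {
        reader.MoveToContent();
        while (!reader.EOF)
        {
            if (reader.NodeType == XmlNodeType.Element && reader.Name == elementName)
            {
                // XNode.ReadFrom consumes the whole element and leaves the reader
                // on the following node, so Read() must not be called here.
                yield return (XElement)XNode.ReadFrom(reader);
            }
            else
            {
                reader.Read();
            }
        }
    }
}
```

Each yielded element can then be queried with ordinary LINQ to XML, for example `(string)feature.Attribute("featId")` or `feature.Descendants("coordinates")`. Memory use is bounded by the largest single feature rather than by the whole file, at the cost of building a small tree per feature.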
