Deciding on when to use XmlDocument vs XmlReader

Question

I'm optimizing a custom object -> XML serialization utility, and it's all done and working and that's not the issue.

It worked by loading a file into an XmlDocument object, then recursively going through all the child nodes.

I figured that perhaps using XmlReader instead of having XmlDocument loading/parsing the entire thing would be faster, so I implemented that version as well.

The algorithms are exactly the same; I use a wrapper class to abstract the functionality of dealing with an XmlNode vs. an XmlReader. For instance, the GetChildren method yield-returns either a child XmlNode or a subtree XmlReader.

So I wrote a test driver to test both versions, using a non-trivial data set (a 900 KB XML file with around 1,350 elements).
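For readers who want to reproduce this kind of comparison, here is a minimal sketch of such a test driver. It is not the asker's actual driver: the class name XmlTimingSketch, the synthetic document built by BuildSampleXml, and the choice of "count the elements" as the workload are all assumptions made for illustration.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Text;
using System.Xml;

public static class XmlTimingSketch
{
    // Builds a synthetic wide-and-shallow document (hypothetical stand-in for the real file).
    public static string BuildSampleXml(int elementCount)
    {
        var sb = new StringBuilder("<root>");
        for (int i = 0; i < elementCount; i++)
            sb.Append("<item id=\"").Append(i).Append("\">value ").Append(i).Append("</item>");
        return sb.Append("</root>").ToString();
    }

    // DOM approach: parse the whole document, then walk the tree recursively.
    public static int CountWithXmlDocument(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return CountElements(doc.DocumentElement);
    }

    private static int CountElements(XmlNode node)
    {
        int count = node.NodeType == XmlNodeType.Element ? 1 : 0;
        foreach (XmlNode child in node.ChildNodes)
            count += CountElements(child);
        return count;
    }

    // Streaming approach: one forward pass, no tree is built.
    public static int CountWithXmlReader(string xml)
    {
        int count = 0;
        using (var reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
                if (reader.NodeType == XmlNodeType.Element) count++;
        }
        return count;
    }

    // Times repeated runs of a workload after one warm-up call (so JIT cost is excluded).
    public static TimeSpan Time(Func<int> work, int iterations)
    {
        work();
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) work();
        sw.Stop();
        return sw.Elapsed;
    }
}
```

A call such as XmlTimingSketch.Time(() => XmlTimingSketch.CountWithXmlReader(xml), 100) gives a rough wall-clock figure for each version. Results depend heavily on the shape of the document and on what the per-node work is, so treat the numbers as indicative only.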

However, using JetBrains dotTrace, I see that the XmlReader version is actually slower than the XmlDocument version! It seems that there is some significant processing involved in the XmlReader Read calls when I'm iterating over child nodes.

So I say all that to ask this:

What are the advantages/disadvantages of XmlDocument and XmlReader, and in what circumstances should you use either?

My guess is that there is a file size threshold at which XmlReader becomes more economical in performance, as well as less memory-intensive. However, that threshold seems to be above 1MB.

I'm calling ReadSubtree every time to process child nodes:

public override IEnumerable<IXmlSourceProvider> GetChildren ()
{
    XmlReader xr = myXmlSource.ReadSubtree ();
    // skip past the current element
    xr.Read ();

    while (xr.Read ())
    {
        if (xr.NodeType != XmlNodeType.Element) continue;
        yield return new XmlReaderXmlSourceProvider (xr);
    }
}
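
As a point of comparison only, direct children can also be enumerated on a single reader by tracking Depth and calling Skip, without creating a subtree reader per node. This is a sketch, not the asker's code and not a measured fix: the class name ChildElementSketch and the method ReadChildNames are hypothetical, and whether this is faster for a given document would need profiling.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class ChildElementSketch
{
    // Returns the names of the direct child elements of the element
    // the reader is currently positioned on.
    public static List<string> ReadChildNames(XmlReader reader)
    {
        var names = new List<string>();
        if (reader.NodeType != XmlNodeType.Element || reader.IsEmptyElement)
            return names;

        int parentDepth = reader.Depth;
        reader.Read(); // step inside the parent element

        // Stop once the reader climbs back to the parent's end tag (or hits EOF).
        while (!reader.EOF && reader.Depth > parentDepth)
        {
            if (reader.NodeType == XmlNodeType.Element && reader.Depth == parentDepth + 1)
            {
                names.Add(reader.Name);
                reader.Skip(); // jump past this child's whole subtree to the next sibling
            }
            else
            {
                reader.Read(); // text, whitespace, comments and so on
            }
        }
        return names;
    }
}
```

Position the reader first (for example with MoveToContent) and then call ReadChildNames(reader). Because Skip already advances to the next node, the loop must not call Read again after it.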

That test applies to a lot of objects at a single level (i.e. wide & shallow) - but I wonder how well XmlReader fares when the XML is deep & wide? I.e. the XML I'm dealing with is much like a data object model, 1 parent object to many child objects, etc: 1..M..M..M

I also don't know beforehand the structure of the XML I'm parsing, so I can't optimize for it.

Recommended answer

I've generally looked at it not from a speed perspective, but rather from a memory utilization perspective. All of the implementations have been fast enough for the usage scenarios I've used them in (typical enterprise integration).

However, where I've fallen down, and sometimes spectacularly, is not taking into account the general size of the XML I'm working with. If you think about it up front you can save yourself some grief.

XML tends to bloat when loaded into memory, at least with a DOM reader like XmlDocument or XPathDocument. Something like 10:1? The exact amount is hard to quantify, but if it's 1MB on disk it will be 10MB in memory, or more, for example.
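
If you want a rough number for your own documents rather than relying on a rule of thumb, something like the following sketch can help. The class name DomFootprintSketch is hypothetical, and GC.GetTotalMemory only gives an approximation of managed heap usage, so read the result as an estimate.

```csharp
using System;
using System.Xml;

public static class DomFootprintSketch
{
    // Approximates how many managed bytes an XmlDocument holds after loading the given XML.
    public static long MeasureLoadedBytes(string xml)
    {
        long before = GC.GetTotalMemory(true); // true = force a collection first
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        long after = GC.GetTotalMemory(true);
        GC.KeepAlive(doc); // make sure the document is still reachable while measuring
        return after - before;
    }
}
```

Dividing the result by the file's size on disk gives an expansion ratio for that particular document. Keep in mind that .NET strings are UTF-16, so the comparison against a UTF-8 file already starts at roughly 2:1 for mostly ASCII content.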

A process using any reader that loads the whole document into memory in its entirety (XmlDocument/XPathDocument) can suffer from large object heap fragmentation, which can ultimately lead to OutOfMemoryExceptions (even with available memory) resulting in an unavailable service/process.

Since objects of roughly 85,000 bytes or more end up on the large object heap, and you've got a 10:1 size explosion with a DOM reader, you can see it doesn't take much before your XML documents are being allocated from the large object heap.
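
The threshold itself is easy to observe. The sketch below relies on the fact that GC.GetGeneration reports large-object-heap allocations as the highest generation; the class name LargeObjectHeapSketch is hypothetical, and the string stands in for XML text read into memory in one piece, which is an assumption about how the data is held, not a claim about XmlDocument internals.

```csharp
using System;

public static class LargeObjectHeapSketch
{
    // A freshly allocated object that is already in the highest generation
    // was placed on the large object heap rather than in generation 0.
    public static bool IsOnLargeObjectHeap(object freshlyAllocated)
    {
        return GC.GetGeneration(freshlyAllocated) == GC.MaxGeneration;
    }

    // 50,000 UTF-16 characters is about 100,000 bytes, above the default threshold.
    public static string MakeLargeText()
    {
        return new string('x', 50000);
    }

    // 500 characters is about 1,000 bytes, well below the threshold.
    public static string MakeSmallText()
    {
        return new string('x', 500);
    }
}
```

Checking IsOnLargeObjectHeap right after allocation is what makes this meaningful; a small object that has survived a couple of collections would also report the highest generation.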

XmlDocument (http://msdn.microsoft.com/en-us/library/system.xml.xmldocument.aspx) is very easy to use. Its only real drawback is that it loads the whole XML document into memory to process. It's seductively simple to use.
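
To give a feel for that simplicity, here is a minimal sketch. The class name XmlDocumentSketch and the root/item/id document shape are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class XmlDocumentSketch
{
    // Loads the whole document into memory, then queries it with XPath.
    public static List<string> ReadItemIds(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        var ids = new List<string>();
        foreach (XmlElement item in doc.SelectNodes("/root/item"))
            ids.Add(item.GetAttribute("id"));
        return ids;
    }
}
```

Random access, XPath queries and in-place edits all come for free once the tree is loaded, which is exactly what you pay for in memory.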

XmlReader is a stream-based reader, so it will keep your process memory utilization generally flatter, but it is more difficult to use.
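
The same kind of extraction with XmlReader looks like the sketch below (again, XmlReaderSketch and the item/id names are hypothetical). Only the current node is held in memory, but you have to track position and state yourself.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class XmlReaderSketch
{
    // Streams through the input once and collects the id attribute of every item element.
    public static List<string> ReadItemIds(TextReader input)
    {
        var ids = new List<string>();
        using (var reader = XmlReader.Create(input))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "item")
                {
                    string id = reader.GetAttribute("id"); // null when the attribute is absent
                    if (id != null) ids.Add(id);
                }
            }
        }
        return ids;
    }
}
```

Note that this matches item elements at any depth; restricting it to a particular path means checking reader.Depth or keeping your own stack of element names, which is where the extra difficulty comes from.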

XPathDocument (http://msdn.microsoft.com/en-us/library/system.xml.xpath.xpathdocument.aspx) tends to be a faster, read-only version of XmlDocument, but still suffers from memory 'bloat'.
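
For completeness, a read-only query with XPathDocument might look like this sketch (XPathDocumentSketch and the document shape are assumptions for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.XPath;

public static class XPathDocumentSketch
{
    // Loads a read-only, XPath-optimized copy of the document and selects attribute values.
    public static List<string> ReadItemIds(TextReader input)
    {
        var doc = new XPathDocument(input);
        XPathNavigator nav = doc.CreateNavigator();

        var ids = new List<string>();
        XPathNodeIterator it = nav.Select("/root/item/@id");
        while (it.MoveNext())
            ids.Add(it.Current.Value);
        return ids;
    }
}
```

The whole document is still held in memory, so this is an alternative to XmlDocument for query-heavy, read-only work rather than a way to avoid the memory cost.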
