Filter Large XML for specific element value in C# Using xmlReader
In this thread: Filter XML for specific element value in C#
I was able to filter XML files for specific elements using XDocument. However, XDocument does not seem to be a feasible solution for huge XML files, as it fails with a System.OutOfMemoryException. Digging around, it looks like XmlReader is more memory-efficient when handling large XML files.
How can I rewrite the accepted answer to use XmlReader and get the same result?
Please try the following solution.
It scales well and can process multi-GB XML files, because only one entry is held in memory at a time.
The XStreamingElement consumes a helper iterator method that streams the source XML with an XmlReader; the query keeps only the entries that contain a <section>Section 1</section> node.
c#
// Written as a LINQPad-style script (an assumption based on the non-static Main).
// Namespaces needed: System, System.Collections.Generic, System.IO, System.Linq,
// System.Xml, System.Xml.Linq.
void Main()
{
    const string inputXMLFile = @"e:\Temp\Sanosi.xml";
    const string outputXMLFile = @"e:\Temp\Sanosi_Streamed.xml";
    const string ROW = "Entry";
    const string FILTER = "Section 1";

    // Stream XML to file system
    System.Diagnostics.Stopwatch timer = new System.Diagnostics.Stopwatch();
    timer.Start();

    // Shape output XML. XStreamingElement evaluates the query lazily, during Save().
    XStreamingElement newXML = new XStreamingElement("root",
        from element in StreamElements(inputXMLFile, ROW)
            // The cast yields null when <section> is missing, so such entries are skipped
            // instead of throwing a NullReferenceException.
            .Where(x => (string)x.Element("section") == FILTER)
        select new XElement(ROW, element.Elements("image")));

    newXML.Save(outputXMLFile, SaveOptions.OmitDuplicateNamespaces);

    FileInfo fileBefore = new FileInfo(inputXMLFile);
    FileInfo fileAfter = new FileInfo(outputXMLFile);
    timer.Stop();

    Console.WriteLine("Streamed XML file '{0}', {1} bytes to file system as: '{2}', {3} bytes{5}Elapsed time: {4}",
        fileBefore.FullName,
        fileBefore.Length,
        fileAfter.FullName,
        fileAfter.Length,
        timer.Elapsed,
        Environment.NewLine);
}
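private static IEnumerable<XElement> StreamElements(string fileName, string elementName)
{
    using (var rdr = XmlReader.Create(fileName))
    {
        rdr.MoveToContent();
        while (!rdr.EOF)
        {
            if (rdr.NodeType == XmlNodeType.Element && rdr.Name == elementName)
            {
                // ReadFrom consumes the whole element and leaves the reader on the node
                // that follows it, so calling Read() here would skip an adjacent sibling.
                yield return (XElement)XNode.ReadFrom(rdr);
            }
            else
            {
                rdr.Read();
            }
        }
    } // the using block disposes the reader; no explicit Close() is needed
}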
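
To try the streaming filter without a multi-GB file, here is a minimal self-contained sketch that runs the same idea against an in-memory string. The class name StreamingFilterDemo, the Filter helper, the TextReader overload, and the sample XML are all hypothetical, introduced only for illustration; the Entry/section/image shape is assumed from the code above. The sample entries are deliberately adjacent with no whitespace between them, which is the case where a loop that calls Read() after ReadFrom would skip every other entry.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

public static class StreamingFilterDemo
{
    // Yields each <elementName> element one at a time from the source.
    // Takes a TextReader (an assumption for this demo) so it works on strings too.
    public static IEnumerable<XElement> StreamElements(TextReader source, string elementName)
    {
        using (var rdr = XmlReader.Create(source))
        {
            rdr.MoveToContent();
            while (!rdr.EOF)
            {
                if (rdr.NodeType == XmlNodeType.Element && rdr.Name == elementName)
                {
                    // ReadFrom already advances past the element; do not Read() again.
                    yield return (XElement)XNode.ReadFrom(rdr);
                }
                else
                {
                    rdr.Read();
                }
            }
        }
    }

    // Hypothetical helper: builds the filtered document and returns it as a string.
    public static string Filter(string xml, string row, string filter)
    {
        var result = new XStreamingElement("root",
            from element in StreamElements(new StringReader(xml), row)
            where (string)element.Element("section") == filter
            select new XElement(row, element.Elements("image")));

        // The query is only enumerated here, when the streaming element is serialized.
        return result.ToString(SaveOptions.DisableFormatting);
    }

    public static void Main()
    {
        // Hypothetical sample data, not taken from the original question.
        const string sample =
            "<root>" +
            "<Entry><section>Section 1</section><image>a.png</image></Entry>" +
            "<Entry><section>Section 2</section><image>b.png</image></Entry>" +
            "<Entry><section>Section 1</section><image>c.png</image></Entry>" +
            "</root>";

        Console.WriteLine(Filter(sample, "Entry", "Section 1"));
    }
}
```

With this sample, the output should keep the first and third entries and drop the second. For real files, swapping the StringReader for a file path or stream passed to XmlReader.Create keeps the same one-entry-at-a-time behaviour.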