How to deserialize a node in a large document using XmlSerializer
Problem description
I have a large XML document that I have loaded into an XmlDocument, and I want to use the XmlSerializer class to deserialize selected elements from it into a .NET class generated using xsd.exe.
Here's an MCVE of what I've tried so far; the xsd and generated classes are at the end of the post. As noted in the comments in the code, I am getting an InvalidOperationException - <Cars xmlns:'http://MyNamespace' /> was not expected:
static string XmlContent = @"
    <RootNode xmlns=""http://MyNamespace"">
        <Cars>
            <Car make=""Volkswagen"" />
            <Car make=""Ford"" />
            <Car make=""Opel"" />
        </Cars>
    </RootNode>";

static void TestMcve()
{
    var doc = new XmlDocument();
    doc.LoadXml(XmlContent);
    var nsMgr = new XmlNamespaceManager(doc.NameTable);
    nsMgr.AddNamespace("myns", "http://MyNamespace");

    var rootSerializer = new XmlSerializer(typeof(RootNode));
    var root = (RootNode) rootSerializer.Deserialize(new XmlNodeReader(doc));
    Console.WriteLine(root.Cars[0].make); // Works fine so far

    var node = doc.DocumentElement.SelectSingleNode("myns:Cars", nsMgr);
    Console.WriteLine(node.OuterXml);

    var carSerializer = new XmlSerializer(typeof(Car));
    using (var reader = new XmlNodeReader(node))
    {
        // What I want is a list of Car instances deserialized from
        // the Car child elements of the Cars element.
        // The following line throws an InvalidOperationException
        // "<Cars xmlns:'http://MyNamespace' /> was not expected"
        // If I change SelectSingleNode above to select "myns:Cars/myns:Car"
        // I get "<Car xmlns:'http://MyNamespace' /> was not expected"
        var result = carSerializer.Deserialize(reader);
    }
}
I also want to subsequently update my Car class instance and insert it back into the document using the XmlSerializer, which is the subject of a follow-up question, How to insert a node in a large document using XmlSerializer.
The xsd and generated classes follow:
<xs:schema xmlns="http://MyNamespace" xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://MyNamespace"
           elementFormDefault="qualified" attributeFormDefault="unqualified"
           version="3.9.0.8">
    <xs:complexType name="Cars">
        <xs:sequence>
            <xs:element name="Car" type="Car" maxOccurs="unbounded"/>
        </xs:sequence>
    </xs:complexType>
    <xs:complexType name="Car">
        <xs:attribute name="make" type="xs:string" use="required"/>
    </xs:complexType>
    <xs:complexType name="RootNode">
        <xs:sequence>
            <xs:element name="Cars" type="Cars" minOccurs="0"/>
        </xs:sequence>
    </xs:complexType>
    <xs:element name="RootNode" type="RootNode" />
</xs:schema>
Code generated by xsd.exe:
using System.Xml.Serialization;

/// <remarks/>
[System.CodeDom.Compiler.GeneratedCodeAttribute("xsd", "4.6.1055.0")]
[System.SerializableAttribute()]
[System.Diagnostics.DebuggerStepThroughAttribute()]
[System.ComponentModel.DesignerCategoryAttribute("code")]
[System.Xml.Serialization.XmlTypeAttribute(Namespace="http://MyNamespace")]
[System.Xml.Serialization.XmlRootAttribute(Namespace="http://MyNamespace", IsNullable=false)]
public partial class RootNode {

    private Car[] carsField;

    /// <remarks/>
    [System.Xml.Serialization.XmlArrayItemAttribute(IsNullable=false)]
    public Car[] Cars {
        get {
            return this.carsField;
        }
        set {
            this.carsField = value;
        }
    }
}

/// <remarks/>
[System.CodeDom.Compiler.GeneratedCodeAttribute("xsd", "4.6.1055.0")]
[System.SerializableAttribute()]
[System.Diagnostics.DebuggerStepThroughAttribute()]
[System.ComponentModel.DesignerCategoryAttribute("code")]
[System.Xml.Serialization.XmlTypeAttribute(Namespace="http://MyNamespace")]
public partial class Car {

    private string makeField;

    /// <remarks/>
    [System.Xml.Serialization.XmlAttributeAttribute()]
    public string make {
        get {
            return this.makeField;
        }
        set {
            this.makeField = value;
        }
    }
}
Recommended answer
You have two problems here:
- The node returned by doc.DocumentElement.SelectSingleNode("myns:Cars", nsMgr) is positioned at the <Cars> element, the container element for the repeating sequence of <Car> nodes, but your XmlSerializer is constructed to deserialize a single root element named <Car>. Trying to deserialize a sequence of cars with a serializer constructed to deserialize a single car will not work.
- xsd.exe generated a definition for your Car type without an XmlRoot attribute, presumably because the schema declares a global element only for RootNode:
[System.Xml.Serialization.XmlTypeAttribute(Namespace = "http://MyNamespace")]
// Not included!
//[System.Xml.Serialization.XmlRootAttribute(Namespace = "http://MyNamespace")]
public partial class Car
{
}
Thus if you attempt to serialize or deserialize a single instance of Car as the root XML element of an XML document, XmlSerializer will expect that root element not to be in any namespace. Each <Car> node in your large document is in the "http://MyNamespace" default namespace, so attempting to deserialize each one individually also will not work.
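The namespace mismatch can be reproduced in isolation. The following is a minimal sketch, assuming a trimmed-down stand-in for the generated Car class; the Demo class, the FailsToDeserialize helper, and the inline XML strings are illustrative and not part of the original post:

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Trimmed-down stand-in for the xsd.exe output: XmlType but no XmlRoot.
[XmlType(Namespace = "http://MyNamespace")]
public class Car
{
    [XmlAttribute]
    public string make { get; set; }
}

public static class Demo
{
    // Hypothetical helper: returns true if a plain XmlSerializer(typeof(Car))
    // rejects the given XML with an InvalidOperationException.
    public static bool FailsToDeserialize(string xml)
    {
        var serializer = new XmlSerializer(typeof(Car));
        try
        {
            using (var reader = new StringReader(xml))
                serializer.Deserialize(reader);
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    public static void Main()
    {
        // Root element in the document's namespace: not what the serializer expects.
        Console.WriteLine(FailsToDeserialize(
            "<Car xmlns=\"http://MyNamespace\" make=\"Ford\" />"));

        // Root element in no namespace: matches the serializer's expectation.
        Console.WriteLine(FailsToDeserialize("<Car make=\"Ford\" />"));
    }
}
```

The only difference between the two inputs is the namespace of the root element, which is the part an XmlRoot attribute would otherwise control.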
You could manually add the missing [XmlRoot(Namespace = "http://MyNamespace")] attribute to Car, but having to do this can be a nuisance if the XSD files are subsequently modified and the C# types need to be regenerated.
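Because xsd.exe emits partial classes, one way to keep a manually added attribute out of the generated file might be to declare it on a second partial declaration in a hand-written file; C# combines the attributes from all parts of a partial class. A minimal sketch, assuming the two file names shown in the comments (both hypothetical) and a trimmed-down stand-in for the generated class:

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// --- Car.generated.cs (hypothetical name; stand-in for the xsd.exe output) ---
[XmlType(Namespace = "http://MyNamespace")]
public partial class Car
{
    [XmlAttribute]
    public string make { get; set; }
}

// --- Car.custom.cs (hypothetical hand-written file, not touched by regeneration) ---
[XmlRoot(Namespace = "http://MyNamespace")]
public partial class Car
{
}

public static class Demo
{
    // Hypothetical helper: deserializes a standalone <Car> element.
    public static Car Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(Car));
        using (var reader = new StringReader(xml))
            return (Car)serializer.Deserialize(reader);
    }

    public static void Main()
    {
        var car = Parse("<Car xmlns=\"http://MyNamespace\" make=\"Ford\" />");
        Console.WriteLine(car.make); // Ford
    }
}
```

This only addresses the missing root namespace; it does not help with the <Cars> container. It would also stop compiling if a later schema change made xsd.exe emit its own XmlRoot attribute for Car, since XmlRootAttribute cannot be applied to a class twice.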
To avoid both issues, you can use XmlNode.SelectNodes(String, XmlNamespaceManager) to select every <Car> node inside the <Cars> element, then deserialize each one by constructing an XmlSerializer with an override XmlRootAttribute that carries the element name and namespace of the node being deserialized. First, define the following extension methods:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml;
using System.Xml.Serialization;

public static partial class XmlNodeExtensions
{
    // Deserializes every node in the list to an instance of T.
    public static List<T> DeserializeList<T>(this XmlNodeList nodes)
    {
        return nodes.Cast<XmlNode>().Select(n => n.Deserialize<T>()).ToList();
    }

    // Deserializes a single node, using the node's own local name and
    // namespace as the expected root element.
    public static T Deserialize<T>(this XmlNode node)
    {
        if (node == null)
            return default(T);
        var serializer = XmlSerializerFactory.Create(typeof(T), node.LocalName, node.NamespaceURI);
        using (var reader = new XmlNodeReader(node))
        {
            return (T)serializer.Deserialize(reader);
        }
    }
}

public static class XmlSerializerFactory
{
    // To avoid a memory leak the serializer must be cached.
    // https://stackoverflow.com/questions/23897145/memory-leak-using-streamreader-and-xmlserializer
    // This factory taken from
    // https://stackoverflow.com/questions/34128757/wrap-properties-with-cdata-section-xml-serialization-c-sharp/34138648#34138648

    readonly static Dictionary<Tuple<Type, string, string>, XmlSerializer> cache;
    readonly static object padlock;

    static XmlSerializerFactory()
    {
        padlock = new object();
        cache = new Dictionary<Tuple<Type, string, string>, XmlSerializer>();
    }

    public static XmlSerializer Create(Type serializedType, string rootName, string rootNamespace)
    {
        if (serializedType == null)
            throw new ArgumentNullException("serializedType");
        // No root override requested: the plain constructor needs no manual caching.
        if (rootName == null && rootNamespace == null)
            return new XmlSerializer(serializedType);
        lock (padlock)
        {
            XmlSerializer serializer;
            var key = Tuple.Create(serializedType, rootName, rootNamespace);
            if (!cache.TryGetValue(key, out serializer))
                cache[key] = serializer = new XmlSerializer(serializedType, new XmlRootAttribute { ElementName = rootName, Namespace = rootNamespace });
            return serializer;
        }
    }
}
Then deserialize as follows:
var nodes = doc.DocumentElement.SelectNodes("myns:Cars/myns:Car", nsMgr);
var cars = nodes.DeserializeList<Car>();
Note that a serializer constructed with an override root element name or namespace must be cached to avoid a memory leak, as explained in this answer by Marc Gravell.
Sample working .Net fiddle.