Unexpected node type element error when deserialising

Question

I'm attempting to parse a large Japanese to English dictionary written in XML. A typical entry looks like this:

<entry>
<ent_seq>1486440</ent_seq>
<k_ele>
<keb>美術</keb>
<ke_pri>ichi1</ke_pri>
<ke_pri>news1</ke_pri>
<ke_pri>nf02</ke_pri>
</k_ele>
<r_ele>
<reb>びじゅつ</reb>
<re_pri>ichi1</re_pri>
<re_pri>news1</re_pri>
<re_pri>nf02</re_pri>
</r_ele>
<sense>
<pos>&n;</pos>
<pos>&adj-no;</pos>
<gloss>art</gloss>
<gloss>fine arts</gloss>
</sense>
<sense>
<gloss xml:lang="dut">kunst</gloss>
<gloss xml:lang="dut">schone kunsten</gloss>
</sense>
<sense>
<gloss xml:lang="fre">art</gloss>
<gloss xml:lang="fre">beaux-arts</gloss>
</sense>
<sense>
<gloss xml:lang="ger">Kunst</gloss>
<gloss xml:lang="ger">die schönen Künste</gloss>
<gloss xml:lang="ger">bildende Kunst</gloss>
</sense>
<sense>
<gloss xml:lang="ger">Produktionsdesign</gloss>
<gloss xml:lang="ger">Szenographie</gloss>
</sense>
<sense>
<gloss xml:lang="hun">művészet</gloss>
<gloss xml:lang="hun">művészeti</gloss>
<gloss xml:lang="hun">művészi</gloss>
<gloss xml:lang="hun">rajzóra</gloss>
<gloss xml:lang="hun">szépművészet</gloss>
</sense>
<sense>
<gloss xml:lang="rus">изящные искусства; искусство</gloss>
<gloss xml:lang="rus">{~{的}} художественный, артистический</gloss>
</sense>
<sense>
<gloss xml:lang="slv">umetnost</gloss>
<gloss xml:lang="slv">likovna umetnost</gloss>
</sense>
<sense>
<gloss xml:lang="spa">bellas artes</gloss>
</sense>
</entry>

I've written a deserialiser based on code provided by djv in this answer, and it does indeed deserialise the entire dictionary into a series of class objects. Here is the code I've got so far:

ReadOnly jmdictpath As String = "JMdict"

<XmlRoot>
Public Class JMdict
    <XmlElement("entry")>
    Public Property entrylist As List(Of entry)
End Class

<Serializable()>
Public Class entry
    Public Property ent_seq As Integer
    Public Property k_ele As k_ele
    Public Property r_ele As r_ele
    <XmlElement("sense")>
    Public Property senselist As List(Of sense)
End Class

<Serializable()>
Public Class k_ele
    Public Property keb As String
    Public Property ke_pri As List(Of String)
    Public Property ke_inf As List(Of String)
End Class

<Serializable()>
Public Class r_ele
    Public Property reb As String
    Public Property re_pri As List(Of String)
    Public Property re_inf As List(Of String)
End Class

<Serializable()>
Public Class sense
    <XmlElement("pos")>
    Public Property pos As List(Of string)
    <XmlElement("gloss")>
    Public Property gloss As List(Of gloss)
End Class

<Serializable()>
Public Class gloss
    <XmlAttribute("xml:lang")>
    Public Property lang As String
    <XmlAttribute("g_type")>
    Public Property g_type As String
    <XmlText>
    Public Property Text As String
    Public Overrides Function ToString() As String
        Return Text
    End Function
End Class

Dim dict As JMdict

Sub Deserialise()
    Dim serialiser As New XmlSerializer(GetType(JMdict))
    Using sr As New StreamReader(jmdictpath)
        dict = CType(serialiser.Deserialize(sr), JMdict)
    End Using
End Sub

When I run the code, however, I get the following error:

System.InvalidOperationException: 'There is an error in XML document (415, 7).'

XmlException: Unexpected node type EntityReference. ReadElementString method can only be called on elements with simple or empty content. Line 415, position 7.

I've checked the XML, and line 415 is this line:

 <pos>&unc;</pos>

So the deserialiser is having trouble reading the <pos> element, and I tried a few things.

First, I tried removing the <XmlElement> attribute from pos in the sense class. That made the error go away, but the deserialiser then read no pos data for any of the entries.

Second, I checked on StackOverflow and found this related question where OP had the same problem. The accepted answer in this question suggested splitting the data into further classes, so I tried that too, and created a new pos class:

<Serializable()>
Public Class sense
    <XmlElement("pos")>
    Public Property pos As List(Of pos)
    <XmlElement("gloss")>
    Public Property gloss As List(Of gloss)
End Class

<Serializable()>
Public Class pos
    <XmlText>
    Public Property Text As String
    Public Overrides Function ToString() As String
        Return Text
    End Function
End Class

And once again, while this caused no errors, the pos element was blank in every entry. Each pos tag only contains one value - although there can be more than one pos tag per sense tag - so I didn't think it should need its own class object. In any case, this answer didn't solve my problem, hence why I'm asking this question.

I am completely new to XML deserialisation, and don't really understand what I'm doing in-depth - I'm trying to figure out the mechanics of it based on this helpful answer, but I'm obviously doing something wrong here. Any advice would be appreciated.

Recommended answer

You just need to deserialise through an XmlReader created with properly configured XmlReaderSettings, rather than passing a StreamReader to the XmlSerializer. The only setting you need to change is the DtdProcessing property: set it to DtdProcessing.Parse. Values such as &unc; are entity references declared in the file's DTD, and a reader that parses the DTD expands them into plain text before the serialiser sees them.

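' Requires: Imports System.IO, System.Xml and System.Xml.Serialization
' Allow the reader to process the DTD so that entity references are expanded.
Dim settings As XmlReaderSettings = New XmlReaderSettings()
settings.DtdProcessing = DtdProcessing.Parse

' Adjust the file name to point at your dictionary file.
Dim xmlPath As String = Path.Combine(Application.StartupPath, "yourfilename.xml")

Dim ser As New XmlSerializer(GetType(JMdict))

' Deserialise through the configured XmlReader instead of a StreamReader.
Dim JMdictInstance As JMdict
Using rdr As XmlReader = XmlReader.Create(xmlPath, settings)
    JMdictInstance = CType(ser.Deserialize(rdr), JMdict)
End Using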
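
To try the idea without the full dictionary file, here is a minimal, self-contained sketch. The MiniDict and MiniEntry classes, the EntityDemo module and the in-memory XML string (including the replacement text "unclassified") are made up for illustration and are not part of the original code. Only the XmlReaderSettings configuration mirrors the approach above.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Xml
Imports System.Xml.Serialization

' Hypothetical miniature of the dictionary classes, for illustration only.
<XmlRoot("JMdict")>
Public Class MiniDict
    <XmlElement("entry")>
    Public Property Entries As List(Of MiniEntry)
End Class

Public Class MiniEntry
    <XmlElement("pos")>
    Public Property Pos As List(Of String)
End Class

Module EntityDemo
    Sub Main()
        ' Made-up sample: an internal DTD subset declares the unc entity,
        ' and a <pos> element refers to it.
        Dim xml As String =
            "<!DOCTYPE JMdict [<!ENTITY unc ""unclassified"">]>" &
            "<JMdict><entry><pos>&unc;</pos></entry></JMdict>"

        ' DtdProcessing defaults to Prohibit, so it has to be enabled explicitly.
        Dim settings As New XmlReaderSettings()
        settings.DtdProcessing = DtdProcessing.Parse

        Dim serialiser As New XmlSerializer(GetType(MiniDict))
        Using sr As New StringReader(xml)
            Using rdr As XmlReader = XmlReader.Create(sr, settings)
                Dim dict = CType(serialiser.Deserialize(rdr), MiniDict)
                ' Print the text the serialiser received for the first <pos>.
                Console.WriteLine(dict.Entries(0).Pos(0))
            End Using
        End Using
    End Sub
End Module
```

If the entity is expanded as expected, this prints the entity's replacement text instead of raising the EntityReference error. Only enable DTD processing for files you trust, since it lets the document define its own entities.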
