Provide XmlReader with a list of EntityDeclarations?

Question

I am currently trying to load XML files as merely well-formed XML (without validation or anything else) so I can run a few XPath queries. Most if not all of those files reference either DTDs through a DOCTYPE declaration or XML Schemas through a namespace (or noNamespaceSchemaLocation). It is not possible to include every DTD/XSD that any of the files might reference, so I simply disabled validation and set the XmlResolver to null to keep XmlReader from trying to find them.
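A minimal sketch of the setup described above, for reference. It assumes .NET 4.0 or later (where XmlReaderSettings.DtdProcessing is available); the class and method names WellFormedLoader and Load are hypothetical and not taken from the original post.

```csharp
using System.IO;
using System.Xml;
using System.Xml.XPath;

// Hypothetical helper: loads XML as well-formed only, for XPath queries.
static class WellFormedLoader
{
    public static XPathDocument Load(TextReader input)
    {
        var settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Parse,   // accept a DOCTYPE and read its internal subset
            ValidationType = ValidationType.None,  // no DTD or schema validation
            XmlResolver = null                     // never fetch external DTDs or schemas
        };

        using (XmlReader reader = XmlReader.Create(input, settings))
        {
            return new XPathDocument(reader);
        }
    }
}
```

With these settings a document whose DOCTYPE points at an unavailable DTD should still load, as long as it does not reference an entity that only the external DTD declares.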

However, by doing so, I run into specific problems when the document contains named character entities (such as &trade; for the trademark symbol), since they would typically be defined in the referenced DTD or Schema. As a result, parsing fails (obviously), because a document with an unresolved entity reference cannot even be shown to be well-formed.

I have all the possible entities (at least the ones I know of that might appear in those documents) as DTD fragments (and possibly as XSD fragments, in case DTD is not enough) and could include them. Luckily, the ones referenced from the DTDs/Schemas are usually the standard ISO entities, and thus the same for all of the documents.

I tried quite a few things to see whether I could at least find out how XmlReader resolves the entities and how I could influence the result:
- I tried looking at a Reflector'd version of XmlReader (and XmlTextReader, and XmlTextReaderImpl and what-not), the open-source version from Mono and the MindTouch implementation of SgmlReader. None of them really helped me (mostly due to complexity).
- I tried implementing a custom XmlReader (and corresponding XmlWriter) to catch XmlNodeType.EntityReference and replace each reference with &amp;entity; (and vice versa in the writer). This works to some degree, but breaks entities that are defined inline, directly in the internal subset of the DOCTYPE, which makes the solution rather pointless for the entities I would actually want resolved because they can be.
- I tried messing around with the rest of XmlReader (especially ResolveEntity) to see if I could change its outcome, but everything seems to be handled internally or delegated to non-public classes, and ResolveEntity simply returns void, doing its black magic hidden from plain sight.
- I tried implementing a custom XmlResolver that attempts to feed all my known ISO entity files to the XmlReader when something is requested. This failed because the XmlResolver was never called (for reasons I don't know; maybe it would have worked?).
- I tried overriding the SchemaInfo behavior by passing an XmlParserContext with my custom InternalSubset, which defines all entities. This fails as soon as the original document contains its own DOCTYPE declaration ("Cannot have document with multiple DOCTYPE definitions" or something like that).
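For illustration, the custom XmlResolver attempt from the list above might look roughly like the sketch below. It is not the poster's code: EntityOnlyResolver and EntityDemo are hypothetical names, and the sketch assumes that DTD processing is enabled and that the document has a DOCTYPE with an external identifier. As far as I know, the reader only consults the resolver in that case, so a resolver set to null, prohibited DTD processing, or a document that references only an XSD would leave it uncalled.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;
using System.Xml;

// Hypothetical resolver: answers every external DTD request with a fixed
// block of entity declarations instead of fetching the real DTD.
sealed class EntityOnlyResolver : XmlResolver
{
    private readonly byte[] _declarations;

    public EntityOnlyResolver(string entityDeclarations)
    {
        _declarations = Encoding.UTF8.GetBytes(entityDeclarations);
    }

    // Credentials are irrelevant here because nothing is fetched.
    public override ICredentials Credentials { set { } }

    public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
    {
        // Ignore the requested URI and hand back the known declarations.
        return new MemoryStream(_declarations);
    }
}

static class EntityDemo
{
    // Reads the text content of the root element, expanding entities
    // that are declared in the substituted external subset.
    public static string ReadRootText(string xml, string entityDeclarations)
    {
        var settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Parse,
            ValidationType = ValidationType.None,
            XmlResolver = new EntityOnlyResolver(entityDeclarations)
        };

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            reader.MoveToContent();
            return reader.ReadElementContentAsString();
        }
    }
}
```

A call such as EntityDemo.ReadRootText("<!DOCTYPE r SYSTEM \"missing.dtd\"><r>&trade;</r>", "<!ENTITY trade \"&#8482;\">") is intended to return the trademark character. Because the declarations stand in for the external subset, entities defined inline in the document's own internal subset should still be honored, but documents without any DOCTYPE are not helped by this approach.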

All that messing around basically gets me back to zero. Searching the internets leaves me with the impression that I'm the first trying to do something like that; or at least the first one that dares to ask.

Could anyone shed some light on what happens, how I could affect it, or general hints on how to achieve this?

Regards, BhaaL

Recommended answer

Hi Bhaal,

Based on your description, I think your issue is more closely related to XML, so it is more appropriate for the XML, System.Xml, MSXML and XmlLite forum. That forum is for questions and discussion about processing XML, MSXML, XSLT and/or XSD using the .NET Framework, XML Lite, LINQ to XML, and the XML tools in Visual Studio. I will move the thread to the XML, System.Xml, MSXML and XmlLite forum. Thanks for understanding.

