Is it possible to get the type of an XML node as it was defined in XSD?


Question

I'm parsing an XML document in Python, and I have an XSD schema that I use to validate it. Can I get the type of a particular node of my XML as it was defined in the XSD?

For example, a small part of my XML is:

<deviceDescription>
  <wakeupNote>
    <lang xml:lang="ru">Русский</lang>
    <lang xml:lang="en">English</lang>
  </wakeupNote> 
</deviceDescription>

My XSD is (again, only a small part of it):

<xsd:element name="deviceDescription" type="zwv:deviceDescription" minOccurs="0"/>

<xsd:complexType name="deviceDescription">
  <xsd:sequence>
    <xsd:element name="wakeupNote" type="zwv:description" minOccurs="0">
      <xsd:unique name="langDescrUnique">
        <xsd:selector xpath="zwv:lang"/> 
        <xsd:field xpath="@xml:lang"/>  
      </xsd:unique>
    </xsd:element> 
  </xsd:sequence>
</xsd:complexType>

<xsd:complexType name="description">
  <xsd:sequence>
    <xsd:element name="lang" maxOccurs="unbounded">
      <xsd:complexType>
        <xsd:simpleContent>
          <xsd:extension base="xsd:string">
            <xsd:attribute ref="xml:lang" use="required"/>
          </xsd:extension>
        </xsd:simpleContent>
      </xsd:complexType>
    </xsd:element>
  </xsd:sequence> 
</xsd:complexType>

During parsing, I want to know that my wakeupNote tag is defined in the XSD as complexType zwv:description. How can I do this in Python?

What do I need this for? Suppose I have a lot of these XML files and I want to check that all of them have the English-language fields filled in. It would be easy to check whether <lang xml:lang="en"></lang> is empty, but the schema allows this tag to be omitted entirely.

So the idea is to get all tags that may have language descriptions and check that a <lang> tag for en is present and has non-empty content.
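The check itself, for one element that is already known to carry language descriptions, could be sketched as follows. This is only an illustration using the standard library's xml.etree.ElementTree; the helper name has_english is hypothetical, and the sample document omits the namespace that the real files presumably declare.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the fixed XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def has_english(elem):
    """Return True if elem has a <lang xml:lang="en"> child with non-blank text."""
    for child in elem:
        # Compare local names so the check works with or without a namespace.
        if child.tag.split("}")[-1] == "lang" and child.get(XML_LANG) == "en":
            if (child.text or "").strip():
                return True
    return False


doc = ET.fromstring(
    "<deviceDescription>"
    "<wakeupNote>"
    '<lang xml:lang="ru">Русский</lang>'
    '<lang xml:lang="en">English</lang>'
    "</wakeupNote>"
    "</deviceDescription>"
)
print(has_english(doc.find("wakeupNote")))
```

A missing <lang> tag and an empty one are treated the same way here: both make the function return False. What this sketch does not solve is the actual question, namely finding out which elements should be checked in the first place.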

Update

Since my XML is checked against the XSD during validation, the validation engine knows the types of all nodes. I asked a similar question 7 months ago that still has no answer, and I think the two are related: "Validating and filling default values in XML based on XSD in Python".

Recommended answer

You're right that the validator must know the type associations of all the elements and attributes it validates, and that the validator is thus in a position to provide access to that information.

For better or worse, however, both the API between caller and validator and the selection of validation-related information available to the caller are entirely implementation-defined. Some validators (Xerces J is a notable example) make a very full range of validation information available; others don't.

Without knowing which validator you are using, no one can tell you with certainty whether the type information you're seeking is available. Since you're calling the validator, there must be an API; if type associations are available through the API, presumably the documentation will say so. If the API doesn't provide access to them, it may be because the underlying schema validator doesn't expose the information, or it may be because the creator of the API didn't see the point. Your job (if you want to pursue this further) will be to find out which of those is the case and then try to persuade the relevant parties that it would be useful to make the information available.

If you have no luck getting access to the information through the API, you can help yourself with a more sophisticated version of the approach mentioned in another answer by David W. It is a property of XSD schemas that the governing type of any element is strictly a function of the path to that element from the validation root. So it is straightforward in principle (if more than a bit tedious in practice) to identify, for any element in a document instance, what its governing type will be if the document instance is validated against a particular schema. For the case you mention, for example, it is straightforward to tell whether a given wakeupNote has deviceDescription or otherElement as an ancestor, or which is the nearer ancestor if the wakeupNote has both, and to infer the appropriate governing type definition based on that knowledge.
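The path-based approach described above can be sketched in Python with nothing but xml.etree.ElementTree. This is a rough illustration under several assumptions, not a general tool: the schema is a simplified stand-in for the fragment in the question, the target namespace urn:example:zwv is invented, type names are matched by local name only, and the function names (governing_type, child_declarations, walk) are hypothetical. It ignores element references, model groups, type derivation, substitution groups, xsi:type, wildcards and imports, which is where the tedious part of a real implementation lies.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Simplified stand-in for the schema in the question. The target namespace
# "urn:example:zwv" is an assumption; the question does not show the real one.
XSD_TEXT = """\
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:zwv="urn:example:zwv"
            targetNamespace="urn:example:zwv"
            elementFormDefault="qualified">
  <xsd:element name="deviceDescription" type="zwv:deviceDescription"/>
  <xsd:complexType name="deviceDescription">
    <xsd:sequence>
      <xsd:element name="wakeupNote" type="zwv:description" minOccurs="0"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="description">
    <xsd:sequence>
      <xsd:element name="lang" maxOccurs="unbounded">
        <xsd:complexType>
          <xsd:simpleContent>
            <xsd:extension base="xsd:string"/>
          </xsd:simpleContent>
        </xsd:complexType>
      </xsd:element>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>
"""


def local(name):
    """Drop a namespace prefix ('zwv:description') or URI ('{uri}lang')."""
    return name.split("}")[-1].split(":")[-1]


def child_declarations(node):
    """Yield the element declarations in a content model, without descending
    into the inline types of those declarations."""
    for child in node:
        if child.tag == XS + "element":
            yield child
        else:
            yield from child_declarations(child)


def governing_type(schema_root, path):
    """Return the type declared for the element at `path` (local names from
    the validation root), "(anonymous)" for an inline type, or None."""
    named_types = {t.get("name"): t
                   for t in schema_root.findall(XS + "complexType")}
    candidates = schema_root.findall(XS + "element")  # global declarations
    decl = None
    for name in path:
        decl = next((e for e in candidates if e.get("name") == name), None)
        if decl is None:
            return None  # the path does not match the schema
        type_name = decl.get("type")
        if type_name is None:
            scope = decl  # inline (anonymous) type definition
        else:
            scope = named_types.get(local(type_name))  # None for built-ins
        candidates = list(child_declarations(scope)) if scope is not None else []
    if decl is None:
        return None
    return decl.get("type") or "(anonymous)"


def walk(elem, path=()):
    """Yield every element of an instance document with its path from the root."""
    path = path + (local(elem.tag),)
    yield elem, path
    for child in elem:
        yield from walk(child, path)


schema_root = ET.fromstring(XSD_TEXT)
doc = ET.fromstring(
    '<deviceDescription xmlns="urn:example:zwv">'
    "<wakeupNote><lang>English</lang></wakeupNote>"
    "</deviceDescription>"
)
for elem, path in walk(doc):
    print("/".join(path), "->", governing_type(schema_root, path))
```

For the sample document this reports zwv:deviceDescription for the root, zwv:description for wakeupNote, and an anonymous type for lang. Collecting every element whose reported type is zwv:description would give the list of places where language descriptions may appear.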

Helping yourself in this way is likely to require a non-trivial amount of work. It would help if there were general-purpose tools to calculate this information and make it accessible in various forms, but if any exist, I don't know about them. (I do know people who could build such a tool for a fee.) So if I were you, I'd try to get the information through the API first.
