Validate with three XML schemas as one combined schema in lxml?


Question

I am generating an XML document for which different XSDs have been provided for different parts (which is to say, definitions for some elements are in certain files, definitions for others are in others).

The XSD files do not refer to each other. The schemas are:

  1. http://xmlgw.companieshouse.gov.uk/v2-1/schema/Egov_ch-v2-0.xsd
  2. http://xmlgw.companieshouse.gov.uk/v1-1/schema/forms/FormSubmission-v1-1.xsd
  3. http://xmlgw.companieshouse.gov.uk/v1-1/schema/forms/CompanyIncorporation-v1-2.xsd

Is there a way to validate the document against all of the schemas using lxml?

The solution here is not simply to validate individually against each schema, because the problem I am having is that validation fails because of elements not specified in the XSD. For example, when validating against http://xmlgw.companieshouse.gov.uk/v2-1/schema/Egov_ch-v2-0.xsd, I get the error:

  File "lxml.etree.pyx", line 3006, in lxml.etree._Validator.assertValid (src/lxml/lxml.etree.c:125415)
DocumentInvalid: Element '{http://xmlgw.companieshouse.gov.uk}CompanyIncorporation': No matching global element declaration available, but demanded by the strict wildcard., line 9

This happens because the document contains a {http://xmlgw.companieshouse.gov.uk}CompanyIncorporation element, which is declared not in the XSD being validated against but in one of the other XSD files.

Recommended answer

I believe you should only be validating against Egov_ch-v2-0.xsd, which appears to define an envelope document. (This is the document you are creating, right? You haven't shown your XML.)

This schema uses <xs:any namespace="##any" minOccurs="0"/> to define body contents of the envelope. However, xsd:any does not mean "ignore all contents." Rather it means "accept anything here." Whether to validate or ignore the contents is controlled by the processContents attribute, which defaults to strict. This means that any elements discovered here must validate against types available to the schema. However, Egov_ch-v2-0.xsd does not import CompanyIncorporation-v1-2.xsd, so it doesn't know about the CompanyIncorporation element, so the document does not validate.
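The effect of processContents can be reproduced in isolation. The following is a minimal sketch, not the Companies House schemas: the namespaces urn:example:envelope and urn:example:body, the element names, and the make_validator helper are all made up for illustration. It builds the same envelope schema twice, once with a strict wildcard and once with a lax one, and validates a document whose body element is declared nowhere.

```python
from lxml import etree

# Hypothetical envelope schema; {mode} is filled in with "strict" or "lax".
ENVELOPE_XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="urn:example:envelope"
           elementFormDefault="qualified">
  <xs:element name="Envelope">
    <xs:complexType>
      <xs:sequence>
        <xs:any namespace="##any" minOccurs="0" processContents="{mode}"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

def make_validator(mode):
    """Compile the envelope schema with the given processContents mode."""
    schema_root = etree.fromstring(ENVELOPE_XSD.format(mode=mode))
    return etree.XMLSchema(etree.ElementTree(schema_root))

# The body element lives in a namespace the envelope schema knows nothing about.
DOC = etree.fromstring(
    '<env:Envelope xmlns:env="urn:example:envelope">'
    '<body:Form xmlns:body="urn:example:body"/>'
    '</env:Envelope>'
)

# strict: a declaration for body:Form is demanded, none exists, so this fails.
strict_ok = make_validator("strict").validate(DOC)
# lax: undeclared elements are accepted without being validated.
lax_ok = make_validator("lax").validate(DOC)

print("strict:", strict_ok, "lax:", lax_ok)
```

Switching to lax only hides the problem, because the body would then go unchecked. To have the body validated as well, the schema needs to know the body's declarations.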

You need to add xsd:import elements to your main schema (Egov_ch-v2-0.xsd) to import all other schemas that may be used in the document. You can either do this in the xsd file itself, or you can add the elements programmatically after parsing:

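import lxml.etree

# Parse the envelope schema, then add an xsd:import for the body schema.
xsd = lxml.etree.parse('http://xmlgw.companieshouse.gov.uk/v2-1/schema/Egov_ch-v2-0.xsd')
newimport = lxml.etree.Element('{http://www.w3.org/2001/XMLSchema}import',
    namespace="http://xmlgw.companieshouse.gov.uk",
    schemaLocation="http://xmlgw.companieshouse.gov.uk/v1-1/schema/forms/CompanyIncorporation-v1-2.xsd")
# xsd:import must precede the schema's own declarations, so insert it at the
# top of the root element rather than appending it at the end.
xsd.getroot().insert(0, newimport)

validator = lxml.etree.XMLSchema(xsd)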

You can even do this in a generic way with a function that takes a list of schema paths and returns a list of xsd:import statements with namespace and schemaLocation set by parsing targetNamespace.
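One possible shape for such a function is sketched below. The names build_imports and combined_schema are hypothetical, and the sketch assumes the schemas are local files, that each extra schema's targetNamespace differs from the main schema's, and that paths can be made absolute so schemaLocation does not depend on the working directory.

```python
import os
from lxml import etree

XS_NS = "http://www.w3.org/2001/XMLSchema"

def build_imports(schema_paths):
    """Return one xs:import element per schema file (hypothetical helper).

    The namespace attribute is read from each schema's targetNamespace and
    schemaLocation is set to the file's absolute path.
    """
    imports = []
    for path in schema_paths:
        abs_path = os.path.abspath(path)
        target_ns = etree.parse(abs_path).getroot().get("targetNamespace")
        attrs = {"schemaLocation": abs_path}
        if target_ns is not None:
            # A schema without targetNamespace is imported without the attribute.
            attrs["namespace"] = target_ns
        imports.append(etree.Element("{%s}import" % XS_NS, attrib=attrs))
    return imports

def combined_schema(main_path, extra_paths):
    """Compile main_path with xs:import elements added for extra_paths."""
    tree = etree.parse(os.path.abspath(main_path))
    root = tree.getroot()
    # Imports must come before the schema's own declarations, so they are
    # inserted at the top of the root element, in the order given.
    for position, import_element in enumerate(build_imports(extra_paths)):
        root.insert(position, import_element)
    return etree.XMLSchema(tree)
```

With local copies of the three files, usage would look something like combined_schema("Egov_ch-v2-0.xsd", ["FormSubmission-v1-1.xsd", "CompanyIncorporation-v1-2.xsd"]). If two of the extra schemas share a targetNamespace, this sketch emits two imports for the same namespace, and the validator may ignore the second, so that case needs separate handling.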

(As an aside, you should probably download these schema documents and reference them with filesystem paths rather than load them over the network.)
