Automatic XSD validation

Question

According to the lxml documentation, "The DTD is retrieved automatically based on the DOCTYPE of the parsed document. All you have to do is use a parser that has DTD validation enabled."

http://lxml.de/validation.html#validation-at-parse-time

However, if you want to validate against an XML schema, you need to explicitly reference one.
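The contrast can be sketched as follows. This is a minimal illustration, not taken from the lxml documentation: the inline schema and the `parses` helper are assumptions made up for the example, while `dtd_validation` and `schema` are existing `XMLParser` options.

```python
from lxml import etree

# Hypothetical schema used only for this illustration.
XSD = b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://www.example.org" elementFormDefault="qualified">
  <xs:element name="name" type="xs:string"/>
</xs:schema>"""

# DTD validation: the parser locates the DTD through the document's DOCTYPE.
dtd_parser = etree.XMLParser(dtd_validation=True)

# XSD validation: the schema has to be loaded and handed to the parser explicitly.
schema = etree.XMLSchema(etree.fromstring(XSD))
xsd_parser = etree.XMLParser(schema=schema)

# A document that matches the schema parses normally.
root = etree.fromstring(b'<name xmlns="http://www.example.org">foo</name>', xsd_parser)


def parses(data):
    """Return True if data parses and validates with xsd_parser (hypothetical helper)."""
    try:
        etree.fromstring(data, xsd_parser)
        return True
    except etree.XMLSyntaxError:
        # A validating parser reports schema violations as parse errors.
        return False
```

Nothing in the second parser looks at the document to decide which schema applies; that choice is made by whoever constructs it.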

I am wondering why this is, and would like to know if there is a library or function that can do this, or even an explanation of how to make this happen myself. The problem is that there seem to be many ways to reference an XSD, and I need to support all of them.

Validation is not the issue. The issue is how to determine the schemas to validate against. Ideally this would handle inline schemas as well.

Update:

Here is an example.

simpletest.xsd:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="name" type="xs:string"/>
</xs:schema>

simpletest.xml:

<?xml version="1.0" encoding="UTF-8" ?>
<name xmlns="http://www.example.org"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.example.org simpletest.xsd">foo</name>

I would like to do the following:

>>> parser = etree.XMLParser(xsd_validation=True)
>>> tree = etree.parse("simpletest.xml", parser)

Recommended answer

I have a project with over 100 different schemas and XML trees. To manage and validate all of them, I did a few things.

1) I created a file (e.g. xmlTrees.py) containing a dictionary for every XML document, holding its path and the schema associated with it. This gave me a single place to get both the XML and the schema used to validate it.

MY_XML = {'url':'/pathToTree/myTree.xml', 'schema':'myXSD.xsd'}

2) The project has just as many namespaces (very hard to manage). So again I created a single file containing all the namespaces in the format lxml likes. Then in my tests and scripts I always pass the superset of namespaces.

ALL_NAMESPACES = {
    'namespace1':  'http://www.example.org',
    'namespace2':  'http://www.example2.org'
}
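As a sketch of how such a superset mapping can be used: the prefixes in a query are resolved through the dictionary, not through whatever prefixes the document happens to declare. The sample document is hypothetical, and the fallback import is only there so the snippet also runs without lxml installed, since the standard library's `findall` accepts the same mapping.

```python
try:
    from lxml import etree  # the library used in this answer
except ImportError:
    import xml.etree.ElementTree as etree  # same findall/namespaces interface

ALL_NAMESPACES = {
    'namespace1': 'http://www.example.org',
    'namespace2': 'http://www.example2.org',
}

# Hypothetical document that mixes both namespaces under its own prefixes.
DOC = b"""<root xmlns="http://www.example.org" xmlns:b="http://www.example2.org">
  <name>foo</name>
  <b:name>bar</b:name>
</root>"""

root = etree.fromstring(DOC)

# 'namespace2:name' matches the element the document itself writes as 'b:name',
# because only the namespace URI is compared.
first = root.findall('namespace1:name', namespaces=ALL_NAMESPACES)
second = root.findall('namespace2:name', namespaces=ALL_NAMESPACES)
names = [element.text for element in first + second]
```

Entries in the mapping that a given query does not use are simply ignored, which is what makes passing the whole superset everywhere workable. With lxml, the same dictionary can also be given to `xpath()` through its `namespaces` keyword.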

3) For basic/generic validation I ended up creating a basic function I could call:

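import logging
from io import BytesIO

from lxml import etree


def validateXML(content, schemaContent):
    """Validate XML content (str or bytes) against the schema at schemaContent."""
    response = {}
    try:
        # etree.parse takes a file path or file-like object for the schema.
        xmlSchema = etree.XMLSchema(etree.parse(schemaContent))
        # Parse bytes: lxml rejects str input that carries an encoding declaration.
        if isinstance(content, str):
            content = content.encode("utf-8")
        xml = etree.parse(BytesIO(content))
    except (etree.LxmlError, OSError):
        logging.critical("Could not parse schema or content to validate xml")
        response['valid'] = False
        response['errorlog'] = "Could not parse schema or content to validate xml"
        return response

    # Validate the content against the schema.
    try:
        xmlSchema.assertValid(xml)
        response['valid'] = True
        response['errorlog'] = None
    except etree.DocumentInvalid:
        response['valid'] = False
        response['errorlog'] = xmlSchema.error_log

    return response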

Basically, any function that wants to use this needs to pass in the XML content as a string and the XSD as something etree.parse can read (a file path or file-like object). This gave me the most flexibility. I then placed this function in a file with all my XML helper functions.
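The helper above still has to be told which schema to use. For documents that carry an xsi:schemaLocation or xsi:noNamespaceSchemaLocation hint, like simpletest.xml in the question, one way to find candidate schemas is to read those attributes directly. The following is a minimal sketch under stated assumptions: `schema_locations` is a hypothetical helper (not an lxml API), relative locations are resolved with `urljoin` against a POSIX-style path or URL, and inline schemas are not handled. The fallback import is only there so the snippet also runs without lxml installed.

```python
from urllib.parse import urljoin

try:
    from lxml import etree  # the library used in this answer
except ImportError:
    import xml.etree.ElementTree as etree  # exposes the same iter()/get() calls

XSI = "http://www.w3.org/2001/XMLSchema-instance"


def schema_locations(root, base_url=""):
    """Return (namespace, location) pairs hinted at anywhere in the tree.

    namespace is None for xsi:noNamespaceSchemaLocation hints.
    """
    pairs = []
    for element in root.iter():
        # xsi:schemaLocation holds whitespace-separated namespace/location pairs.
        tokens = (element.get("{%s}schemaLocation" % XSI) or "").split()
        for namespace, location in zip(tokens[0::2], tokens[1::2]):
            pairs.append((namespace, urljoin(base_url, location)))
        # xsi:noNamespaceSchemaLocation holds a single location.
        no_namespace = element.get("{%s}noNamespaceSchemaLocation" % XSI)
        if no_namespace:
            pairs.append((None, urljoin(base_url, no_namespace)))
    return pairs


DOC = b"""<name xmlns="http://www.example.org"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.example.org simpletest.xsd">foo</name>"""

# base_url is the (hypothetical) location the document was loaded from.
found = schema_locations(etree.fromstring(DOC), base_url="/docs/simpletest.xml")
```

Each location found this way could then be handed to a validation helper such as validateXML. These attributes are only hints supplied by the document author, so whether to trust them, especially when they point at remote URLs, is a separate decision.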
