在Python中使用lxml进行Schematron验证:如何检索验证错误? [英] Schematron validation with lxml in Python: how to retrieve validation errors?

查看:66
本文介绍了在Python中使用lxml进行Schematron验证:如何检索验证错误?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试使用lxml进行一些Schematron验证.对于我正在处理的特定应用程序,重要的是要报告任何未通过验证的测试. lxml文档提到了validation_report属性对象的存在.我认为应该包含我要查找的信息,但我只是想不通如何使用它.这是一些演示我的问题的示例代码(改编自 http://lxml.de/validation.html#id2 ;已在Python 2.7.4上进行了测试):

I'm trying to do some Schematron validation with lxml. For the specific application I'm working at, it's important that any tests that failed the validation are reported back. The lxml documentation mentions the presence of the validation_report property object. I think this should contain the info I'm looking for, but I just can't figure out how work with it. Here's some example code that demonstrates my problem (adapted from http://lxml.de/validation.html#id2; tested with Python 2.7.4):

import StringIO
from lxml import isoschematron
from lxml import etree

def main():

    # Schema
    f = StringIO.StringIO('''\
    <schema xmlns="http://purl.oclc.org/dsdl/schematron" >
    <pattern id="sum_equals_100_percent">
    <title>Sum equals 100%.</title>
    <rule context="Total">
    <assert test="sum(//Percent)=100">Sum is not 100%.</assert>
    </rule>
    </pattern>
    </schema>
    ''')

    # Parse schema
    sct_doc = etree.parse(f)
    schematron = isoschematron.Schematron(sct_doc, store_report = True)

    # XML to validate - validation will fail because sum of numbers
    # not equal to 100 
    notValid = StringIO.StringIO('''\
        <Total>
        <Percent>30</Percent>
        <Percent>30</Percent>
        <Percent>50</Percent>
        </Total>
        ''')
    # Parse xml
    doc = etree.parse(notValid)

    # Validate against schema
    validationResult = schematron.validate(doc)

    # Validation report (assuming here this is where reason 
    # for validation failure is stored, but perhaps I'm wrong?)
    report = isoschematron.Schematron.validation_report

    print("is valid: " + str(validationResult))
    print(dir(report.__doc__))

main()

现在,从validationResult的值中,我可以看到验证失败(按预期),所以接下来我想知道为什么.第二个打印语句的结果是:

Now, from the value of validationResult I can see that the validation failed (as expected), so next I would like to know why. The result of the second print statement gives me:

['__add__', '__class__', '__contains__', '__delattr__', '__doc__', '__eq__', '__
format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__get
slice__', '__gt__', '__hash__', '__init__', '__le__', '__len__', '__lt__', '__mo
d__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__',
 '__rmod__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook
__', '_formatter_field_name_split', '_formatter_parser', 'capitalize', 'center',
 'count', 'decode', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'index
', 'isalnum', 'isalpha', 'isdigit', 'islower', 'isspace', 'istitle', 'isupper',
'join', 'ljust', 'lower', 'lstrip', 'partition', 'replace', 'rfind', 'rindex', '
rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', '
strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']

根据文档和,据我所知相关问题.可能是我所忽略的真正明显的东西吗?

Which is about as far as I'm getting, based on the documentation and this related question. Could well be something really obvious I'm overlooking?

推荐答案

好,因此Twitter上的某人给了我一个建议,使我意识到我错误地错误地引用了schematron类.由于似乎没有明确的示例,因此我将在下面分享我的工作解决方案:

OK, so someone on Twitter gave me a suggestion which made me realise that I mistakenly got the reference to the schematron class all wrong. Since there don't seem to be any clear examples, I'll share my working solution below:

import StringIO
from lxml import isoschematron
from lxml import etree

def main():
    # Example adapted from http://lxml.de/validation.html#id2

    # Schema
    f = StringIO.StringIO('''\
    <schema xmlns="http://purl.oclc.org/dsdl/schematron" >
    <pattern id="sum_equals_100_percent">
    <title>Sum equals 100%.</title>
    <rule context="Total">
    <assert test="sum(//Percent)=100">Sum is not 100%.</assert>
    </rule>
    </pattern>
    </schema>
    ''')

    # Parse schema
    sct_doc = etree.parse(f)
    schematron = isoschematron.Schematron(sct_doc, store_report = True)

    # XML to validate - validation will fail because sum of numbers 
    # not equal to 100 
    notValid = StringIO.StringIO('''\
        <Total>
        <Percent>30</Percent>
        <Percent>30</Percent>
        <Percent>50</Percent>
        </Total>
        ''')
    # Parse xml
    doc = etree.parse(notValid)

    # Validate against schema
    validationResult = schematron.validate(doc)

    # Validation report 
    report = schematron.validation_report

    print("is valid: " + str(validationResult))
    print(type(report))
    print(report)

main()

报表上的 print 语句现在产生以下输出:

The print statement on the report now results in the following output:

 <?xml version="1.0" standalone="yes"?>
<svrl:schematron-output xmlns:svrl="http://purl.oclc.org/dsdl/svrl" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:schold="http://www.ascc.net/xml/schematron" xmlns:sch="http://www.ascc.net/xml/schematron" xmlns:iso="http://purl.oclc.org/dsdl/schematron" title="" schemaVersion="">
  <!--   
           
           
         -->
  <svrl:active-pattern id="sum_equals_100_percent" name="Sum equals 100%."/>
  <svrl:fired-rule context="Total"/>
  <svrl:failed-assert test="sum(//Percent)=100" location="/Total">
    <svrl:text>Sum is not 100%.</svrl:text>
  </svrl:failed-assert>
</svrl:schematron-output>

这正是我想要的!

这篇关于在Python中使用lxml进行Schematron验证:如何检索验证错误?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆