Content model ambiguity in a schema

Question

Maybe I've been staring at this problem for too long, maybe there isn't an answer; either way I'm here now.

I'm trying to permit a set of possible combinations in an XSD, but I can't seem to find an approach that doesn't result in ambiguity.

Quick regex:

foo+ ( bar baz* | bar? baz+ qux* )

  • foo is required (one-or-more)
  • If bar exists, baz is optional (zero-or-more)
  • If baz exists, bar is optional (zero-or-one) and qux is optional (zero-or-more)
  • qux cannot exist if baz does not exist

The ambiguity comes from foo bar baz, which both branches of the choice accept.
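The rules above can be tried out with a small Python sketch. It is only an illustration: the one-letter encoding and the names `CODES`, `BRANCHES`, `matching_branches` and `is_valid` are assumptions made here, not part of the schema. It reports which branch of the choice accepts a given child sequence, which makes the overlap on foo bar baz visible.

```python
import re

# Hypothetical one-letter encoding so a child sequence becomes a plain string.
CODES = {"foo": "f", "bar": "b", "baz": "z", "qux": "q"}

# The two branches of the choice, each prefixed with the shared foo+.
BRANCHES = {
    "bar baz*": re.compile(r"f+bz*"),
    "bar? baz+ qux*": re.compile(r"f+b?z+q*"),
}

def matching_branches(children):
    """Return the names of the choice branches that accept the whole sequence."""
    encoded = "".join(CODES[name] for name in children)
    return [name for name, pattern in BRANCHES.items() if pattern.fullmatch(encoded)]

def is_valid(children):
    """A sequence is valid if at least one branch accepts it."""
    return bool(matching_branches(children))

samples = [
    ["foo", "bar"],                # only the first branch
    ["foo", "baz", "qux"],         # only the second branch
    ["foo", "bar", "baz"],         # both branches: the overlap
    ["foo", "bar", "qux"],         # qux without baz: no branch
]
for sample in samples:
    print(" ".join(sample), "->", matching_branches(sample))
```

A sequence accepted by both branches is still valid as far as the regex goes. The trouble for an XSD processor is that, on reading bar, it cannot tell which of the two bar particles it has matched without looking ahead.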

The ambiguous XSD:

      <xs:element name="parent">
          <xs:complexType>
              <xs:sequence>
                  <xs:element name="foo" minOccurs="1" maxOccurs="unbounded" />
                  <xs:choice>
                      <xs:sequence>
                          <xs:element name="bar" minOccurs="1" maxOccurs="1" />
                          <xs:element name="baz" minOccurs="0" maxOccurs="unbounded" />
                      </xs:sequence>
                      <xs:sequence>
                          <xs:element name="bar" minOccurs="0" maxOccurs="1" />
                          <xs:element name="baz" minOccurs="1" maxOccurs="unbounded" />
                          <xs:element name="qux" minOccurs="0" maxOccurs="unbounded" />
                      </xs:sequence>
                  </xs:choice>
              </xs:sequence>
          </xs:complexType>
      </xs:element>
      

Screenshot for good measure: [image not available]

Now, I'm beginning to realize that perhaps this is simply a constraint of the XSD content model. The reason for the ambiguity is obvious; the solution is not.

Can anyone see a way to permit this, either by re-ordering the elements or by using some schema design pattern that alleviates ambiguous scenarios like this?

The conditional dependency between bar and baz is clearly the problem, but I can't think of any other way to do this.

Thanks in advance, all.

Currently reading "Schema Component Constraint: Unique Particle Attribution" in an attempt to find a loophole. Any other suggested reading is welcome.

Recommended answer

IIRC there is a theorem in computer science that says every ambiguous grammar can be rewritten as an unambiguous grammar, so start with the hypothesis that it's possible. However, the unambiguous grammar can sometimes be hideously complex.

I think a good approach to handling this is to draw the "railroad diagram" of the grammar, that is, the finite state machine with its transitions. Then, when you find a state in this machine that has two transitions labelled with the same symbol, you need to construct a new state that accepts both those transitions, and so on. In the CS literature this algorithm is called "determinization".
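As a rough illustration of that procedure, here is a minimal Python sketch of the subset construction applied to the content model from the question. The machine below is an assumption made for this example: it has one state per element particle in the schema (with made-up names such as `bar1` and `bar2` for the two bar particles) plus a start state. The names `NFA`, `ACCEPTING`, `determinize` and `dfa_accepts` are likewise hypothetical.

```python
from collections import deque

# Hypothetical NFA for foo+ ( bar baz* | bar? baz+ qux* ).
# Note the two "bar" transitions leaving state "foo": that is the ambiguity.
NFA = {
    "start": {"foo": {"foo"}},
    "foo":   {"foo": {"foo"}, "bar": {"bar1", "bar2"}, "baz": {"baz2"}},
    "bar1":  {"baz": {"baz1"}},
    "baz1":  {"baz": {"baz1"}},
    "bar2":  {"baz": {"baz2"}},
    "baz2":  {"baz": {"baz2"}, "qux": {"qux"}},
    "qux":   {"qux": {"qux"}},
}
ACCEPTING = {"bar1", "baz1", "baz2", "qux"}

def determinize(nfa, start):
    """Subset construction: each DFA state is a frozenset of NFA states."""
    dfa = {}
    queue = deque([frozenset([start])])
    while queue:
        current = queue.popleft()
        if current in dfa:
            continue
        # Merge the targets of all same-symbol transitions into one new state.
        moves = {}
        for state in current:
            for symbol, targets in nfa.get(state, {}).items():
                moves.setdefault(symbol, set()).update(targets)
        dfa[current] = {symbol: frozenset(targets) for symbol, targets in moves.items()}
        queue.extend(dfa[current].values())
    return dfa

def dfa_accepts(dfa, start, symbols):
    """Run the deterministic machine over a sequence of element names."""
    state = frozenset([start])
    for symbol in symbols:
        if symbol not in dfa[state]:
            return False
        state = dfa[state][symbol]
    return bool(state & ACCEPTING)

dfa = determinize(NFA, "start")
for state, moves in dfa.items():
    for symbol, target in moves.items():
        print(sorted(state), f"--{symbol}-->", sorted(target))
```

In the resulting machine the two bar particles collapse into a single state, so every state has at most one transition per element name. Turning that machine back into an XSD content model is a separate step that this sketch does not attempt.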

Another approach, which is perhaps easier to explain without a whiteboard, is to start by factoring out what is common between the two branches of your choice. When you hit the first element in the content, it has to be either a bar or a baz. So write two choices, one starting with bar and one with baz.

As far as I can see, your content model is equivalent to the unambiguous model

      (bar, (baz+, qux*)?) | (baz+, qux*)
      

But I would check that carefully...
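One way to run such a check is a bounded brute-force comparison, sketched below in Python. It is not a proof: it only compares the two models on every sequence of bar, baz and qux up to a chosen length. The shared foo+ prefix is left out because it is identical on both sides. The one-letter encoding and the names `ORIGINAL`, `REWRITTEN` and `first_difference` are assumptions made for this example.

```python
import re
from itertools import product

# Hypothetical one-letter encoding: b = bar, z = baz, q = qux.
ORIGINAL = re.compile(r"bz*|b?z+q*")         # bar baz* | bar? baz+ qux*
REWRITTEN = re.compile(r"b(?:z+q*)?|z+q*")   # (bar, (baz+, qux*)?) | (baz+, qux*)

def first_difference(max_length):
    """Return the first sequence (up to max_length) on which the two models
    disagree, or None if they agree on all of them."""
    for length in range(max_length + 1):
        for letters in product("bzq", repeat=length):
            candidate = "".join(letters)
            in_original = ORIGINAL.fullmatch(candidate) is not None
            in_rewritten = REWRITTEN.fullmatch(candidate) is not None
            if in_original != in_rewritten:
                return candidate
    return None

# None means no counterexample was found up to the given length.
print(first_difference(8))
```

A result of `None` supports the claimed equivalence for short sequences. Whether a particular XSD processor accepts the rewritten model is a separate question, best settled by loading the schema into that processor.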
