XSD restriction that negates a matching string


Question

I want my XSD to validate the contents of a string. Specifically, I want to validate that a certain string does not occur.

Consider this rule, which verifies that my string does occur. It requires every Link element to start with this particular string: /site/example.com

<xs:element name="Link" minOccurs="0">
  <!-- type="xs:normalizedString" dropped: an element cannot have both a type attribute and an inline simpleType -->
  <xs:simpleType>
    <xs:restriction base="xs:token">
      <xs:pattern value="(/site/example\.com).*"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>

In other words, the expression above verifies that all Link elements start with /site/example.com. How do you invert it, so that it verifies that no Link element starts with /site/example.com?

I tried the regexp /[^(site/example\.com)].* with no luck. These are the strategies I tried and why they do not work:

Not-working strategy 1 (negation of a single character). I am aware that this would probably work for negating a single character, since this SO question does that: XML schema restriction pattern for not allowing empty strings

The pattern suggested in that question is <xs:pattern value=".*[^\s].*"/>

But negating only a single character does not work in this case. A bracket expression such as [^(site/example\.com)] excludes a set of individual characters, not the string as a whole, so it would correctly fail:

/site/example.com

but it would also incorrectly fail:

/solutions

Not-working strategy 2 (advanced regexp lookahead). According to this SO question (Regular expression to match a line that doesn't contain a word?), you could solve this with negative lookahead (?!expr).

So this will work in ordinary regexp:

^((?!/site/example.com).)*$

Unfortunately, XSD validation supports only a limited regexp dialect. According to this site, lookaheads are not supported: regular-expressions.info (xsd)

This pretty much describes what I have tried so far.

My question is: how do I negate a regular expression in an XSD schema?

Recommended answer

This is simpler to do in XSD 1.1, where you can use assertions to ensure that the value does not begin with the string you specify. But conceptually, it's simple enough even in XSD 1.0 with its simple regular expressions: you want to ensure that the string does not begin with "/site/example.com". If it did begin that way, you'd have a logical conjunction of a series of facts about the string:

  • substring(., 1, 1) = '/'
  • substring(., 2, 1) = 's'
  • substring(., 3, 1) = 'i'
  • ...
  • substring(., 17, 1) = 'm'

You want to negate this conjunction of facts. Now, by De Morgan's Laws, ~(a and b and ... and z) is equivalent to (~a or ~b or ... or ~z). So you can do what you need by writing a disjunction of the following terms:

    [^/].*
    |.([^s].*)?
    |.{2}([^i].*)?
    |.{3}([^t].*)?
    |.{4}([^e].*)?
    |.{5}([^/].*)?
    |.{6}([^e].*)?
    |.{7}([^x].*)?
    |.{8}([^a].*)?
    |.{9}([^m].*)?
    |.{10}([^p].*)?
    |.{11}([^l].*)?
    |.{12}([^e].*)?
    |.{13}([^\.].*)?
    |.{14}([^c].*)?
    |.{15}([^o].*)?
    |.{16}([^m].*)?

In each term above, the subexpression of the form [^s].* has been wrapped in (...)?. For example, the term .{2}([^i].*)? means that any string beginning with two characters is OK if the third character is not an i, or if there is no third character at all. This ensures that strings shorter than 17 characters are not excluded, even if they happen to be prefixes of the forbidden string.

Of course, to use this in an XSD schema document, you will need to remove all the whitespace, which makes the regex harder to read.
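Put together, the declaration might look like the sketch below. It assumes the Link element and the xs:token base from the question, and simply joins the seventeen terms above on one line; the pattern has to stay on a single line, because any whitespace inside the value attribute would become part of the regex.

```xml
<!-- Sketch for XSD 1.0: accepts a Link value unless it starts with /site/example.com -->
<xs:element name="Link" minOccurs="0">
  <xs:simpleType>
    <xs:restriction base="xs:token">
      <!-- One branch per character position of the forbidden prefix (17 branches) -->
      <xs:pattern value="[^/].*|.([^s].*)?|.{2}([^i].*)?|.{3}([^t].*)?|.{4}([^e].*)?|.{5}([^/].*)?|.{6}([^e].*)?|.{7}([^x].*)?|.{8}([^a].*)?|.{9}([^m].*)?|.{10}([^p].*)?|.{11}([^l].*)?|.{12}([^e].*)?|.{13}([^\.].*)?|.{14}([^c].*)?|.{15}([^o].*)?|.{16}([^m].*)?"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
```

With this pattern, /solutions and /site/example.co are accepted, while /site/example.com and /site/example.com/page are rejected. Note that the first term, [^/].*, needs at least one character, so an empty Link value matches no branch; wrapping it as ([^/].*)? would be one way to admit empty values, if that is wanted.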

[Addition, June 2016] See also this related and more general question.
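For the XSD 1.1 route mentioned at the start of this answer, a minimal sketch might look like the following. It assumes a validator that supports XSD 1.1 (an XSD 1.0 processor will reject the xs:assertion facet), and it reuses the Link element and xs:token base from the question. In an xs:assertion facet, $value refers to the value being validated, and starts-with is the standard XPath function.

```xml
<!-- Sketch for XSD 1.1: the assertion must evaluate to true for the value to be valid -->
<xs:element name="Link" minOccurs="0">
  <xs:simpleType>
    <xs:restriction base="xs:token">
      <!-- Valid only when the value does NOT begin with the forbidden prefix -->
      <xs:assertion test="not(starts-with($value, '/site/example.com'))"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
```

Unlike the seventeen-branch pattern, this form states the forbidden prefix once and also accepts an empty value, since an empty string does not start with the prefix.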
