How do I turn any regex into a complement of itself without complex hand editing?


Question

The following are pseudo-examples, not real regexes, but they illustrate what I mean:

.* (anything)

-.* (NOT anything)


[A-Z] (Any letter A to Z, caps only)

-[A-Z] (NOT any letter A to Z, caps only)


Changed "inverse" to "complement" in the question. Here's where the change was made: "turn any regex into a complement of itself".

Recommended answer

First of all, I believe you mean the complement of a regular expression, not its inverse. The inverse of a regular expression doesn't make much sense; but if you view a matcher as a function, I suppose you could say that its inverse is the generator that produces all matching strings - or something like that. The complement of a language, on the other hand, is the set of all strings that are not in the original language.

There are two views to consider here, the theoretical one and the practical one:

The complement of a regular language is regular. That means it's possible to build a DFA that accepts the complement, and doing so is actually very simple: take a complete DFA for the original language (one with a transition for every state and every symbol) and swap the accepting and non-accepting state sets. Any such DFA can be expressed as a regular expression - so in principle you can indeed make such a regex.
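The state-swapping idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two-state DFA below is hand-built for the language [0-9]*, and every name in it (classify, DELTA, accepts and so on) is invented for the example rather than taken from any library.

```python
# Hypothetical hand-built DFA for the language [0-9]* (illustrative names only).
DIGITS = "0123456789"

def classify(ch):
    """Collapse the alphabet into two symbol classes: digit / other."""
    return "digit" if ch in DIGITS else "other"

STATES = {"all_digits", "dead"}
START = "all_digits"
ACCEPTING = {"all_digits"}

# The transition table is total: every state has a move for every symbol class.
# Without that, swapping the accepting states would not give the complement.
DELTA = {
    ("all_digits", "digit"): "all_digits",
    ("all_digits", "other"): "dead",
    ("dead", "digit"): "dead",
    ("dead", "other"): "dead",
}

def accepts(word, accepting):
    """Run the DFA on word and report whether it ends in an accepting state."""
    state = START
    for ch in word:
        state = DELTA[(state, classify(ch))]
    return state in accepting

# Complement: same states and transitions, accepting set swapped.
COMPLEMENT_ACCEPTING = STATES - ACCEPTING

results = {
    w: (accepts(w, ACCEPTING), accepts(w, COMPLEMENT_ACCEPTING))
    for w in ["", "123", "12a", "abc"]
}
```

For every word, exactly one of the two automata accepts. Turning the complemented DFA back into a regex (for example by state elimination) is the step that tends to produce large, unreadable expressions.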

See the Wikipedia article on regular languages as a starting point.

The typical Perl-compatible regex syntax used in most modern languages does not have a complement operator. For a complete regex, you can get something similar with the negative lookahead operator: (?!X) succeeds at a position precisely when X does not match there. However, this is a poor replacement for a complement operator, because you cannot use it as part of a larger regex in the usual fashion: the lookahead doesn't "consume" input, so it behaves differently in conjunction with other operators.
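A small sketch with Python's re module shows what "doesn't consume input" means in practice. The patterns and variable names are made up for illustration; the point is only that the lookahead and whatever follows it are tested at the same position.

```python
import re

# (?!foo) succeeds at position 0 of "bar", but the match itself is empty.
lookahead_only = re.match(r"(?!foo)", "bar")
empty_span = lookahead_only.span()

# Concatenating the lookahead with "bar" does NOT mean
# "some string that is not foo, followed by bar".
# Both parts are checked starting at the same position.
same_position = re.fullmatch(r"(?!foo)bar", "bar")

# With a true complement operator, "xbar" would belong to
# (complement of foo) followed by bar, with "x" as the non-foo part.
# The lookahead version rejects it, because "bar" must start at position 0.
rejected = re.fullmatch(r"(?!foo)bar", "xbar")
```

Here empty_span is (0, 0), same_position is a match object, and rejected is None.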

For example, if you match numeric strings as [0-9]*, to match the entire string you'd prepend ^ and append $. To use this technique to find the complement, you'd need to write ^(?!^[0-9]*$).*$ - and the usual concatenation of such a negated regex is, as far as I can tell, not possible.
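The whole-string version of this trick can be tried out with Python's re module. This is a minimal sketch: the helper names are invented for the example, and the sample strings deliberately contain no newlines, since . and $ treat newlines specially. The last helper is not from the answer; it shows the simpler alternative of negating the match result in the host language when the regex is not embedded in a larger pattern.

```python
import re

ORIGINAL = re.compile(r"^[0-9]*$")
COMPLEMENT = re.compile(r"^(?!^[0-9]*$).*$")

def is_numeric(s):
    """True when the whole string consists of digits (possibly empty)."""
    return ORIGINAL.match(s) is not None

def is_not_numeric(s):
    """Complement via negative lookahead, as described in the answer."""
    return COMPLEMENT.match(s) is not None

def is_not_numeric_simple(s):
    """Alternative (an assumption of this sketch): negate the result in code."""
    return re.fullmatch(r"[0-9]*", s) is None

samples = ["", "123", "12a", "abc"]
table = {s: (is_numeric(s), is_not_numeric(s), is_not_numeric_simple(s))
         for s in samples}
```

On these samples the two complement helpers agree with each other and always disagree with is_numeric.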

Somewhat ironically, the practical incarnation of regexes is theoretically more powerful because of backreferences, but practically less flexible, since the syntax cannot easily express the complement and intersection operations.
