How to tell if one regular expression matches a subset of another regular expression?

Question

I'm just wondering if it's possible to use one regular expression to match another, that is, something like:

['a-z'].match(['b-x'])
True

['m-n'].match(['0-9'])
False

Is this sort of thing possible with regex at all? I'm working in Python, so any advice specific to the re module's implementation would help, but I'll take anything I can get concerning regex.

Ok, some clarification is obviously in order! I definitely know that normal matching syntax would look something like this:

import re

expr = re.compile(r'[a-z]*')
string = "some words"
expr.match(string)
# returns a match object (here matching 'some')

but I'm wondering whether regular expressions have the capability to match other, less specific expressions, as in the syntactically incorrect version I tried to explain above: any letter from b-x would always be a subset (match) of any letter from a-z. I know just from trying that you can't do this by simply calling the match of one compiled expression on another compiled expression, but the question remains: is this at all possible?

Let me know if this still isn't clear.

Recommended answer

I think — in theory — to tell whether regexp A matches a subset of what regexp B matches, an algorithm could:

  1. Compute the minimal Deterministic Finite Automaton of B and also of the "union" A|B.
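  2. Check whether the two DFAs are identical (up to renaming of states). They are if and only if A matches a subset of what B matches, because A|B accepts exactly the union of the two languages, and that union equals B's language precisely when A adds nothing new.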
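
As a rough illustration of the idea, here is a minimal sketch in Python. It does not build minimal DFAs from regex syntax. Instead it assumes the two DFAs are already available and checks inclusion with a product construction, which answers the same question: A matches a subset of B exactly when no reachable pair of states is accepting for A but not for B. The alphabet, the dict layout, and the helpers `char_class_dfa`, `letters` and `is_subset` are all hypothetical names chosen for this example, and the DFAs are hand-built for the single-character classes from the question.

```python
import string
from collections import deque

# Assumed alphabet for this sketch: lowercase letters and digits only.
ALPHABET = string.ascii_lowercase + string.digits


def char_class_dfa(chars):
    """Hypothetical helper: complete DFA accepting exactly one character from `chars`."""
    # States: 0 = start, 1 = one matching character read, 2 = dead state.
    delta = {}
    for c in ALPHABET:
        delta[(0, c)] = 1 if c in chars else 2
        delta[(1, c)] = 2
        delta[(2, c)] = 2
    return {"start": 0, "accept": {1}, "delta": delta}


def is_subset(dfa_a, dfa_b):
    """Return True if every string accepted by dfa_a is also accepted by dfa_b."""
    start = (dfa_a["start"], dfa_b["start"])
    seen = {start}
    queue = deque([start])
    while queue:
        a, b = queue.popleft()
        # A counterexample exists if A accepts here but B does not.
        if a in dfa_a["accept"] and b not in dfa_b["accept"]:
            return False
        for c in ALPHABET:
            nxt = (dfa_a["delta"][(a, c)], dfa_b["delta"][(b, c)])
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True


def letters(lo, hi):
    """Set of characters from lo to hi, inclusive."""
    return {chr(i) for i in range(ord(lo), ord(hi) + 1)}


b_to_x = char_class_dfa(letters("b", "x"))
a_to_z = char_class_dfa(letters("a", "z"))
m_to_n = char_class_dfa(letters("m", "n"))
digits = char_class_dfa(set(string.digits))

print(is_subset(b_to_x, a_to_z))  # True
print(is_subset(m_to_n, digits))  # False
```

The hard part that this sketch skips is turning an arbitrary pattern into a DFA in the first place; the inclusion check itself is short once both automata exist.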

However, doing this in practice would likely be a major project. There are explanations such as "Constructing a minimum-state DFA from a Regular Expression", but they tend to consider only mathematically pure regexps. You would also have to handle the extensions that Python adds for convenience. Moreover, if any of those extensions make the language non-regular (I am not sure whether this is the case), you might not be able to handle them.
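
On the question of non-regular extensions, backreferences appear to be one such case. The small sketch below uses only the standard `re` module; the pattern and the name `mirrored` are chosen purely for illustration. The pattern `(a*)b\1` matches strings of the form a^n b a^n, a language that no finite automaton can recognize, so a DFA-based comparison could not cover patterns like this one.

```python
import re

# (a*)b\1 requires the same run of 'a's on both sides of the 'b'.
# The backreference \1 is what takes this beyond regular languages.
mirrored = re.compile(r'(a*)b\1')

print(bool(mirrored.fullmatch('aabaa')))  # True
print(bool(mirrored.fullmatch('aaba')))   # False
```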

But what are you trying to do? Perhaps there's an easier approach...?
