由(a * + b *)生成的字符串的类型是什么 [英] What are the type of Strings generated by (a*+b*)

查看:156
本文介绍了由(a * + b *)生成的字符串的类型是什么的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

除了正常表达式(a * + b *)包含任何数量的a和b的字符串,如aa ..或bb ..,正则表达式(a * + b *)包含一个字符串

Besides strings of any number of a's and b's like aa.. or bb.. ,Would regular expression (a*+b*) contain a string like


ab

ab

或以b结尾的任何字符串

or any string ending with b ?

(a * + b *)和(a * b *)一样吗?

Is (a*+b*) same as (a* b*) ?

我有点混淆正则表达式生成的字符串

I am a little bit confuse about the strings generated by regular expression (a*+b*) and would really appreciate if someone can help.

推荐答案

除非你使用的regex语言明确将 * + 分类为具有特殊含义的特殊标记,或者保留用于未来的扩展(现在产生已定义的行为或语法错误),自然语法分析(a *)+ :后缀 + / code>应用于表达式 a *

Unless you're working with a regex language which explicitly classifies the *+ as a special token which either has a special meaning, or is reserved for future extension (and produces defined behavior now, or a syntax error), the natural parse of a*+ is that it means (a*)+: the postfix + is applied to the expression a*.

观察(a *)+ 相当于 a * 。因此, a * + b * a * b * 相同。

If that interpretation applies, next we can observe that (a*)+ is equivalent to just a*. Therefore a*+b* is the same as a*b*.

首先,根据定义 R + 表示 RR * 。匹配一个 R ,然后零个或多个。因此,我们可以将(a *)+ 重写为(a *)(a *)*

Firstly, by definition R+ means RR*. Match one R and then zero or more of them. Therefore, we can rewrite (a*)+ as (a*)(a*)*.

其次, * 是等幂的,因此(a *)* is is just (a *)。如果我们匹配零个或多个 a ,零次或多次,没有任何变化;净效果为零或更多 a 证明 R * 表示此无限扩展:(| R | RR | RRR | RRRR | RRRRR | ):匹配一个 R ,或匹配两个 R ...因此,(a *)* dentes这种扩展:(| a * | a * a * | a * a * a * | ...)。这些内部 a * -s表示单个二级扩展:(|(| a | aa | aaa | ... | (| a | aa | aaa | ...)(a | a | aaa | ...))| ...)。通过分支 | 的关联属性,我们可以展开像(a |(b | c))(a | b | c),当我们这样做扩展时,我们注意到有很多相同的术语—空正则表达式,单个 a ,双 aa 等。这些都减少为一个副本,因为(|||)等效于()(a | a | a | a | ...)等效于(a)等。也就是说,当我们通过增加长度来排序这些术语,并且将多个相同的术语压缩成一个副本时,我们最终得到(| a | aa | aaa | aaaa | ...) code>,可以识别为 a * 的扩展。因此(a *)* a *

Secondly, * is idempotent, so (a*)* is is just (a*). If we match "zero or more a", zero or more times, nothing changes; the net effect is zero or more a. Proof: R* denotes this infinite expansion: (|R|RR|RRR|RRRR|RRRRR|...): match nothing, or match one R, or match two R's, ... Therefore, (a*)* dentes this expansion: (|a*|a*a*|a*a*a*|...). These inner a*-s in turn denote individual second-level expansions: (|(|a|aa|aaa|...|)|(|a|aa|aaa|...)(a|a|aaa|...))|...). By the associative property of the branch |, we can flatten a structure like (a|(b|c)) into (a|b|c), and when we do this to the expansion, we note that there are numerous identical terms—the empty regex (), the single a, the double aa and so on. These all reduce to a single copy, because (|||) is equivalent to () and (a|a|a|a|...) is equivalent to just (a) and so on. That is to say, when we sort the terms by increasing length, and squash multiple identical terms to just one copy, we end up with (|a|aa|aaa|aaaa|...), which is recognizable as the expansion of just a*. Thus (a*)* is a*.

最后,(a *)(a *)只是指 a * 证明:与上一个类似,我们扩展到分支:(| a | aa | aaa | ...)(| a | aa | aaa | ...) / code>。接下来我们注意到,分支表达式的链接相当于术语的笛卡尔乘积集。也就是说(a | b | c | ..)(i | j | k | ...)意味着: | aj | ik | ... | bi | bj | bk | ... | ci | cj | ck | ... | ...)。当我们将此产品应用于(| a | aa | aaa | ...)(| a | aa | aaa | ...)当安排长度增加和重复删除时,减少到(| a | aa | aaa | aaaa | ...),这只是 a *

Lastly, (a*)(a*) just means a*. Proof: Similarly to the previous, we expand into branches: (|a|aa|aaa|...)(|a|aa|aaa|...). Next we note that the catenation of branch expressions is equivalent to a Cartesian Product set of the terms. That is to say (a|b|c|..)(i|j|k|...) means, precisely: (ai|aj|ik|...|bi|bj|bk|...|ci|cj|ck|...|...). When we apply this product to (|a|aa|aaa|...)(|a|aa|aaa|...) we get a proliferation of terms, which, when arranged in increasing length and subject to de-duplication, reduce to (|a|aa|aaa|aaaa|...), and that is just a*.

这篇关于由(a * + b *)生成的字符串的类型是什么的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆