将上下文无关的语法转换为正则表达式 [英] converting context free grammar into regular expression

查看:350
本文介绍了将上下文无关的语法转换为正则表达式的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我目前正在浏览CFG,看到了答案,我不确定他们是怎么得到的。

I am currently going over CFG and saw the answer and I am not sure how they got it. How did they get it to convert into Regular Expression from CFG here?

S -> aS|bX|a
X -> aX|bY|a
Y -> aY|a


answer:
R.E -> (a*(a+ba*a+ba*ba*a))


推荐答案

您应该学习我在答案中写的基本规则从正则表达式构造等效的正则语法 ,这些规则将帮助您将正则表达式转换为右或左线性语法或右或左线性语法为正则表达式-都。

You should learn the basic rules that I have written in my answer "constructing an equivalent regular grammar from a regular expression", those rules will help you in converting "a regular expression into right or left liner grammar" or "a right or left liner grammar into regular expression" - both.

尽管,一种语言可以有多个正则表达式(和语法/自动机)。下面,我试图解释如何找到教科书中该问题答案的正则表达式。准确地阅读每个步骤并链接答案,以便您下次可以自己学习解决此类问题的方法。

Though, more than one regular expressions (and grammars/automata) can be possible for a language. Below, I have tried to explain how to find regular expression given in answer for the question in your textbook. Read each step precisely and linked answer(s) so that you can learn approaches to solve such questions yourself next time.

第一步,要回答此类问题,您应该清除此语法生成什么语言? (类似地,如果您有自动机,则尝试理解该自动机所代表的语言)。

At first step, to answering such question you should be clear "what does language this grammar generate?" (similarly, if you have an automata then try to understand language represented by that automata).

正如我在链接答案中所说的那样,语法规则如下: S→ eS | e 对应于 plus clouser,并生成字符串 e + 。同样,您有三对这样的规则可以在语法中生成 a +

As I said in linked answer, grammar rules like: S → eS | e are corresponding to "plus clouser" and generates strings e+. Similarly, you have three pairs of such rules to generate a+ in your grammar.

S → aS | a   
X → aX | a  
Y → aY | a    

(注意: a + 也可以写为 a * a aa * –描述一个或多个'a'。)

(Note: a+ can also be written as a*a or aa* – describes one or more 'a'.)

也请注意语法,您没有任何空生产,例如 A→ ∧ ,因此不是变量 S X Y 是可以为空的,表示空字符串不是语法语言的成员,例如:ε ∉ L(G)。

Also notice in grammar, you do not have any "null production" e.g. A → ∧, so non-of the variable S, X or Y are nullable, that implies empty string is not a member of language of grammar, as: ε ∉ L(G).

如果您注意到起始变量的 S 生产规则:

If you notice start-variable's S productions rules:

S → aS | bX | a

然后很明显,字符串ω语言中的语言可以以符号'a'开头,也可以以'b'开头(因为您有两种选择来应用 S 产生式(1) S→ aS | a 给出'a'作为ω中的第一个符号,或者(2) S→ bX 用于生成以符号开头的字符串b')。

Then it is very clear that strings ω in language can either start with symbol 'a' or with 'b' (as you have two choices to apply S productions either (1) S → aS | a that gives 'a' as the first symbol in ω, or (2) S → bX that use to produce strings those start with symbol 'b').

现在,可能的最小长度字符串是多少?在L(G)? –最小长度字符串为 a ,可以使用生产规则: S→一个

Now, what are the possible minimum length strings ω in L(G)? – minimum length string is "a" that is possible using production rule: S → a.

接下来要注意的是 b ∉ L(G),因为如果您的苹果 S→ bX ,然后您必须替换X PLT / PLT2.1.2e.html rel = nofollow noreferrer>句子格式 bX 使用某些 X 的生产规则,并且我们知道 X 也不能为空,因此在'b'之后总会有一些符号 –换句话说,来自 bX 的情感是∣ω∣ ≥ 2.

Next note that "b" ∉ L(G) because if you apples S → bX then later on you have to replace X in sentential form bX using some of X's production rules, and as we know X is also not nullable hence there would be always some symbol(s) after 'b' – in other words sentimental from bX derives ∣ω∣ ≥ 2.

以上讨论的形式很明显,使用 S 生产规则,您可以生成句子形式 a * a a * bX ,分两个步骤:

Form above discussion, it is very clear that using S production rules you can generate sentential forms either a*a or a*bX, in two steps:


  1. 对于 a * 使用 S→ aS 反复出现,将得到 S⇝ a * S (符号⇝表示不止一个步骤)

  1. For a* use S → aS repeatedly that will give S ⇝ a*S (symbol ⇝ means more than one steps)

替换 S S⇝中的$ c> a * S 通过 a * a a * bX

Replace S in rhs of S ⇝ a*S to get either by a*a or a*bX

此外, a * a a * bX 可以写为 S⇝ a *(a + bX) S⇝ (a *(a + bX)),如果您想用括号括住完整的表达式

Also, "a*a or a*bX" can be written as S ⇝ a*(a + bX) or S ⇝ (a*(a + bX)) if you like to parenthesizes complete expression.

现在比较 S X 的生产规则是相同的!因此,如上所示,对于 S ,您还可以为 X 描述它可用于生成句子形式 X⇝ (a *(a + bY))

Now compare production rules of S and X both are the same! So as I shown above for S, you can also describe for X that it can use to generate sentential forms X ⇝ (a*(a + bY)).

要导出答案中给出的正则表达式,将 X 替换为(a *(a + bY)) S⇝中a *(a + bX),您将获得:

To derive the regular expressions given in answer replace X by (a*(a + bY)) in S ⇝ a*(a + bX), you will get:

S ⇝ a*(a + b X )  
S ⇝ a*(a + b (a*(a + bY)) )

现在,最后一个 Y 的生产规则相对简单-仅用于创建 plus clouser a + (或 a * a )。

And now, last Y production rules are comparatively very simple - just use to create "plus clouser" a+ (or a*a).

所以我们也将 Y 替换为 S 衍生的句子形式。

So let's replace Y also in S derived sentential form.

S ⇝ a*(a + b(a*(a + bY)))   
  ⇝ a*(a + b(a*(a + ba*a)))

简化它,将分配低两次删除内部括号并连接正则表达式– P(Q + R)可以写为 PQ + PR 。< sup>✞

Simplify it, apply distribution low twice to remove inner parenthesis and concatenate regular expressions – P(Q + R) can be written as PQ + PR.


  ⇝ a*(a + b(a*(a + ba*a)))     
  ⇝ a*(a + b(a*a + a*ba*a))     
  ⇝ a*(a + ba*a + ba*ba*a)






+在正式语言中以正则表达式形式使用的两种语法(i)+二元运算符的意思是联合运算(ii)+一元上标运算符的意思是– plus clouser

在正则表达式中,编程语言+仅用于 plus clouser

在正则表达式中,我们使用∣联合的符号,但不是完全是联合运算符。并集(A∪ B)与(B∪ A)相同,但在正则表达式(A∣ B)中可能不等于(B∣ A)


: + in regular expression in formal languages use in two syntax (i) + as binary operator means – "union operation" (ii) + as unary superscript operator means – "plus clouser"
: In regex in programming languages + is only uses for "plus clouser"
: In regex we use ∣ symbol for union, but that is not exactly a union operator. In union (A ∪ B) is same as (B ∪ A) but in regex (A ∣ B) may not equals to (B ∣ A)

这篇关于将上下文无关的语法转换为正则表达式的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆