如何基于给定的正则表达式构造CFG [英] How to construct a CFG based on a given regular expression

查看：372 发布时间：2020/6/3 20:49:40 regex algorithm context-free-grammar

本文介绍了如何基于给定的正则表达式构造CFG的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我试图弄清楚如何基于给定的正则表达式构造CFG（上下文无关文法）。
例如，a（ab）*（a | b）
我认为有一种算法可以通过，但这确实令人困惑。
这是到目前为止我得到的：

I am trying to figure out how to construct a CFG (context free grammar) based on a given regular expression. For example, a(ab)*(a|b) I think there is an algorithm to go through, but it is really confusing. here is what i got so far:

    S->aAB; 
    A->aAb|empty;
    B->a|b;

这看起来正确吗？
任何帮助将不胜感激。

Does this look right? Any help would be appreciated.

推荐答案

为给定正则表达式构造上下文无关语法的另一种方法是：

Another way to construct a context-free grammar for a given regular expression is:

构造一个接受与正则表达式相同语言的有限状态机。

创建一种语法，其末端为正则表达式的字母，其非末端为（或对应于1：1）状态机中的状态，并且其规则的格式为 X- >对于终端符号t上从状态X到状态Y的每个状态机转换，t Y 。如果您的CFG表示法允许，则每个最终状态F都将获得 F-> epsilon 。如果您的CFG表示法不允许使用此类规则，则对于在终端t上从状态X到最终状态F的每次转换，请生成规则 X-> t （除了已经描述的规则 X-> t F ）。结果是一个规则语法，这是一个无上下文的语法，服从于每个右侧最多具有一个非终结符的附加约束。

Construct a finite state machine which accepts the same language as the regular expression.
Create a grammar whose terminals are those in the alphabet of the regular expression, whose non-terminals are (or correspond 1:1 to) the states in the state machine, and which has a rule of the form X -> t Y for every state-machine transition from state X to state Y on terminal symbol t. If your CFG notation allows it, each final state F gets a rule of the form F -> epsilon. If your CFG notation doesn't allow such rules, then for each transition from state X to final state F on terminal t, produce the rule X -> t (in addition to the rule X -> t F already described). The result is a regular grammar, a context-free grammar that obeys the additional constraint that each right-hand side has at most one non-terminal.

对于给出的示例，假定我们构造以下FSA（在许多接受与正则表达式相同语言的语言中）：

For the example given, assume we construct the following FSA (of the many that accept the same language as the regular expression):

从中可以直接得出以下内容常规语法：

From this, it is straightforward to derive the following regular grammar:

S -> a A1
A1 -> a A2
A2 -> b B3
B3 -> a A2
B3 -> a A4
B3 -> b B5
A1 -> a A4
A1 -> b B5
A4 -> epsilon
B5 -> epsilon
epsilon ->

或者，如果我们不希望规则的右侧为空，则删除最后三个

Or, if we don't want rules with an empty right-hand side, drop the last three rules of that grammar and add:

A1 -> a
A1 -> b
B3 -> a
B3 -> b

与其他方法相比，该方法的缺点是生成的语法过于冗长确实如此，并且推导可以完全是机械的，这意味着无需费力思考就更容易正确。

Compared with other approaches, this method has the disadvantage that the resulting grammar is more verbose than it needs to be, and the advantage that the derivation can be entirely mechanical, which means it's easier to get right without having to think hard.

这篇关于如何基于给定的正则表达式构造CFG的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

如何基于给定的正则表达式构造CFG [英] How to construct a CFG based on a given regular expression

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录关闭

如何基于给定的正则表达式构造CFG [英] How to construct a CFG based on a given regular expression

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录 关闭

登录关闭