计算前导集和尾随集以实现无上下文语法 [英] Computing leading and trailing sets for context-free grammar

查看:69
本文介绍了计算前导集和尾随集以实现无上下文语法的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在寻找一种详细的算法,该算法描述如何在上下文无关的语法中为非终结符生成前导和尾随集.

我发现了这样的东西: https://pl.scribd.com/doc/51358638/16/Operator-优先关系但我不确定它是如何工作的.(请参阅第20页)

假设我们有作品:

A-> YaZ |B

B-> b

然后,据说Leading(A)= {a},Leading(A)= Leading(B)和Leading(B)= {b}.我对此表示怀疑:

  1. 这是否意味着Leading(A)= {a,b}?-我假设在每个步骤中我们都不会替换由于'='而已经生成的集合,而是将两个集合相加.
  2. 这是否意味着Leading(B)是{b}或{a,b}?-第三个规则是意味着我们将Leading(B)添加到Leading(A)还是Leading(A)和Leading(B)是等效的?

解决方案

Leading Trailing 是特定于生成运算符优先级解析器的函数,仅在以下情况下适用您有一个运算符优先级语法.运算符优先语法是运算符语法的特例,并且运算符语法具有没有生产具有两个连续的非终结符的重要属性.

(从广义上讲,运算符优先级语法是可以用运算符优先级解析器:-解析的运算符语法.).但这并不重要.)

给定一个运算符语法,非终端的函数 Leading (分别为 Trailing )会产生一组终端(递归)为第一个终端(resp).last)终端,以某种形式从该非终端衍生而来.

另一个可能更直观的定义是,如果终端是可见的",则该终端在非终端的前导集中.从生产开始.非终端是透明的",因此可以通过查看前导的非终端或查看可见的非终端来看到终端.

例如,标准表达式语法(这是一种运算符语法;没有产生式有两个连续的非终结符):

  expr->因子"*"表示expr->因素因素->术语"+"因子因素->学期术语->ID术语->'('expr')' 

term 中, ID (从一开始就可见,并且 ID )从最后可见. expr 在任何一侧都不可见,因为它被终端隐藏了,因此我们不需要考虑它.

factor 中,在两端都可以看到 + ,并且 factor 也继承了 term的前导和尾随集,因为 term 在两端都是可见的.( factor 本身在末尾也是可见的,但是不能向尾随集添加任何新内容.)

最后,从 expr 的两端都可以看到 * ,而 expr 则是从 factor 继承的./p>

所以我们最终得到了:

 非终端领先尾随expr *,+,ID((*,+,ID,)因子+,ID,(+,ID,)术语ID(ID) 

由此,我们将构建优先级关系.基本上,如果找到

 非终端TERMINAL 

在任何生产中,然后为 Trailing(nonterminal)中的每个 TRAIL 添加优先级关系 TRAIL⋗TERMINAL .同样,每次出现

  TERMINAL非终端 

Leading(nonterminal)中的每个 LEAD 生成关系 TERMINAL⋖LEAD .最后,如果找到

  TERMINAL1 TERMINAL2 

  TERMINAL1非终端TERMINAL2 

然后生成关系 TERMINAL1·=·TERMINAL2 .

生成所有优先级关系后,您将查看每对终端 T,U .如果最多保留一个优先级关系-即 T⋖U,T⋗U,T·=·U 或从 T 没有关系U -那么您就有了运算符优先级语法.( T,U U,T 之间没有任何关系.优先级关系不是反对称的,不幸的是,它们在传统上都是用看起来像数字比较的符号来拼写的.)

I am seeking for a detailed algorithm describing how to generate leading and trailing sets for non-terminal symbols in context-free grammars.

I've found something like this: https://pl.scribd.com/doc/51358638/16/Operator-Precedence-Relations but I'm unsure about how it works. (See page 20)

Assume, that we have productions:

A -> YaZ | B

B -> b

then, it is said, that Leading(A) = {a}, Leading(A) = Leading(B) and Leading(B)={b}. And I have my doubts about this:

  1. Does it mean, that Leading(A) = {a, b} ? - I assume that in every step we do not replace the set already generated because of '=', but make sum of two sets.
  2. Does it mean, that Leading(B) is {b} or {a, b}? - Does the third rule mean, that we add Leading(B) to Leading(A) or that Leading(A) and Leading(B) are equivalent?

解决方案

Leading and Trailing are functions specific to generating an operator-precedence parser, which is only applicable if you have an operator precedence grammar. An operator precedence grammar is a special case of an operator grammar, and an operator grammar has the important property that no production has two consecutive non-terminals.

(An operator precedence grammar is, loosely speaking, an operator grammar which can be parsed with an operator precedence parser :-). But that's not important for now.)

Given an operator grammar, the function Leading (resp. Trailing) of a non-terminal produces the set of terminals which could be (recursively) the first (resp. last) terminal in some sentential form derived from that non-terminal.

Another possibly more intuitive definition is that a terminal is in the Leading set for a non-terminal if the terminal is "visible" from the beginning of a production. Non-terminals are "transparent", so a terminal could be visible either by looking through a leading non-terminal or by looking into a visible non-terminal.

For example, a standard expression grammar (which is an operator grammar; no production has two consecutive non-terminals):

expr   -> factor '*' expr
expr   -> factor
factor -> term '+' factor
factor -> term
term   -> ID
term   -> '(' expr ')'

From term, ID and ( are visible from the beginning, and ID and ) are visible from the end. expr is not visible from either side, because it is hidden by terminals, so we don't need to consider it.

From factor, + is visible from both ends, and factor also inherits the Leading and Trailing sets of term because term is visible from both ends. (factor is also visible from the end in itself, but that cannot add anything new to the Trailing set.)

Finally, from expr, * is visible from both ends, and expr inherits from factor.

So we end up with:

Non-terminal            Leading             Trailing
expr                    *, +, ID, (         *, +, ID, )
factor                  +, ID, (            +, ID, )
term                    ID, (               ID, )

From that, we're going to construct precedence relations. Basically, if you find

nonterminal TERMINAL

in any production, then you add the precedence relations TRAIL ⋗ TERMINAL for every TRAIL in Trailing(nonterminal). Similarly, every occurrence of

TERMINAL nonterminal

generates the relationships TERMINAL ⋖ LEAD for every LEAD in Leading(nonterminal). And finally, if you find

TERMINAL1 TERMINAL2

or

TERMINAL1 nonterminal TERMINAL2

then you generate the relationship TERMINAL1 ·=· TERMINAL2.

Once you've generated all the precedence relationships, you look at every pair of terminals T, U. If at most one precedence relation holds -- that is, T ⋖ U, T ⋗ U, T ·=· U or there is no relationship from T to U -- then you have an operator precedence grammar. (There is no connection between T, U and U, T. Precedence relationships are not antisymmetric and it is unfortunate that they are traditionally spelled with symbols that look like numeric comparison.)

这篇关于计算前导集和尾随集以实现无上下文语法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆