前瞻集的精确定义是什么？ [英] What is the precise definition of a lookahead set?

查看：139 发布时间：2016/12/22 20:09:15 compiler-construction compiler-theory

本文介绍了前瞻集的精确定义是什么？的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我正在编写编译器和学习语法分析背后的理论。我发现，即使它是理解识别算法的一个关键概念，在网络上的信息相当差。看起来StackOverflow在一个独特的位置来解决这个问题。

解决方案

语法的前瞻性集合定义每个非终端的前瞻设置，这又依赖于每个制作的前瞻设置。确定前瞻集可以帮助我们确定语法是否为LL（1），如果是，那么我们需要什么信息为其构造递归下降解析器。

定义： ：LOOKAHEAD（X - >α）和 LOOKAHEAD（X）：

  LOOKAHEAD（X→α）= FIRST（α）U FOLLOW（X），如果NULLABLE（α）
 LOOKAHEAD ），如果不是NULLABLE（α）
 LOOKAHEAD（X）= LOOKAHEAD（X→α）U LOOKAHEAD（X→β）U LOOKAHEAD（X→γ）

其中 FIRST（α）是α可以开始的终端集合，是可以在语法中的任何位置之后的 X 的终端的集合，并且 NULLABLE（α）是α是否可以导出空序列端子（表示为ε）。以下定义取自Torben Mogensen的免费书，编译器设计基础。：： / p>

  NULLABLE（ε）= true 
 NULLABLE（x）= false，如果x是终端
 NULLABLE （αβ）= NULLABLE（α）和NULLABLE（β）
 NULLABLE（P）= NULLABLE（α_1）或NULLABLE（α_2）或...或NULLABLE所有其产生式的右侧和右侧
是α_1，α_2，...，α_n。

定义： FIRST

  FIRST（ε）=Ø
 FIRST（x）= {x}，假设x是终端
 FIRST（αβ）= FIRST（α）U FIRST（β），如果NULLABLE（α）
 = FIRST（α），如果不为NULLABLE（α）
 FIRST（P）= FIRST ）U FIRST（α_2）U ... U FIRST（α_n），
如果P是非终端并且其所有乘积的右侧
是α_1，α_2，... ，α_n。

定义： FOLLOW

如果且仅当从开始的派生时，终端符号a在 FOLLOW符号S的语法，使得S⇒αXaβ，其中α和β是（可能为空的）语法符号序列。

strong>直观：：

/ em>出现在语法中。可以（直接或通过任何级别的递归）跟踪的所有终端都在 FOLLOW（X）中。此外，如果在生产结束时出现 X （例如 A - > foo X ），到ε（例如 A - > foo XB 和 B - >ε），后面还可以跟随（ FOLLOW（A）⊆FOLLOW（X））。 p>

查看Torben的书中确定 FOLLOW（X）的方法以及下面的演示。 p>

示例：
E - > ; n A A - > E B A - > ε B - > + A B - > * A
首先， NULLABLE 和 FIRST 并确定：
NULLABLE（E）= NULLABLE（n A）= NULLABLE NULLABLE（A）= NULLABLE（EB）∨NULLABLE（ε）= true NULLABLE（B）= NULLABLE（+ A）∨NULLABLE（* A）= false FIRST（E）= FIRST（n）= {n} FIRST（A）= FIRST（EB）U FIRST（ε）= FIRST（E）UØ= {n}（因为E不是NULLABLE） FIRST（B）= FIRST（+ A）U FIRST（* A）= FIRST（+）U FIRST（*）= {+，*}
在确定 FOLLOW 之前，生产 E'添加E $ ，其中 $ 被视为非终端的文件结束。然后确定：
FOLLOW（E）：让β= $， FIRST（$）= {$}⊆FOLLOW（E）让β= B，因此添加约束FIRST（B）= {+，*}⊆FOLLOW（E） FOLLOW ）：令β=ε，因此添加FIRST（ε）=Ø⊆FOLLOW（A）的约束。因为NULLABLE（ε），添加约束FOLLOW（E）⊆FOLLOW（A）。令β=ε，因此添加FIRST（ε）=Ø⊆FOLLOW（A）的约束。因为NULLABLE（ε），添加约束FOLLOW（B）⊆FOLLOW（A）。令β=ε，因此添加FIRST（ε）=Ø⊆FOLLOW（A）的约束。因为NULLABLE（ε），添加约束FOLLOW（B）⊆FOLLOW（A）。 FOLLOW（B）：令β=ε，因此添加FIRST（ε）=Ø⊆FOLLOW（B）的约束。因为NULLABLE（ε），添加约束FOLLOW（A）⊆FOLLOW（B）。
解决这些约束（也可以通过定点迭代实现），
{+，*，$}⊆FOLLOW（E） FOLLOW（E）⊆FOLLOW（A） FOLLOW ）= FOLLOW（B） FOLLOW（E）= FOLLOW（A）= FOLLOW（B）= {+，*，$}。
现在可以确定每个产品的 LOOKAHEAD ：
LOOKAHEAD（E→n A）= FIRST（n A）= {n}因为¬NULLABLE（n A） LOOKAHEAD （E） LOOKAHEAD（A→ε）= FIRST（E）= {n}因为¬NULLABLE（EB） = FIRST FIRST（ε）U FOLLOW（A）因为NULLABLE（ε） =ØU {+，*，$} = {+，*，$} LOOKAHEAD（B-→A） FIRST（+ A），因为¬NULLABLE（+ A） = FIRST（+）= {+}因为¬NULLABLE（+） LOOKAHEAD（B-→A）= {*}相同的原因
最后，可以确定每个非终端的 LOOKAHEAD
LOOKAHEAD（E）= LOOKAHEAD（E - > n A）= {n} LOOKAHEAD = LOOKAHEAD（A→EB）U LOOKAHEAD（A→ε）= {n} U {+，*，$} LOOKAHEAD（B）= LOOKAHEAD（B→+ A）U LOOKAHEAD （B - > * A）= {+，*}
该语法不是LL（1），因为其非终端具有重叠的前瞻集。（也就是说，我们不能创建一个程序一次读取一个符号，并明确决定使用哪个生产）。

I'm toying around with writing compilers and learning about the theory behind syntax analysis. I've found that even though it's a key concept for understanding recognition algorithms, information about it on the net is fairly poor. It seems StackOverflow is in a unique position to fix this problem.
解决方案
The lookahead sets for a grammar is defined in terms of the lookahead sets for each of its non-terminals, which in turn rely on the lookahead set for each production. Determining lookahead sets can help us determine if a grammar is LL(1), and if it is, what information we need to construct a recursive-descent parser for it.

Definition: LOOKAHEAD(X -> α) and LOOKAHEAD(X):
LOOKAHEAD(X -> α) = FIRST(α) U FOLLOW(X), if NULLABLE(α) LOOKAHEAD(X -> α) = FIRST(α), if not NULLABLE(α) LOOKAHEAD(X) = LOOKAHEAD(X -> α) U LOOKAHEAD(X -> β) U LOOKAHEAD(X -> γ)
where FIRST(α) is the set of terminals that α can begin with, FOLLOW(X) is the set of terminals that can come after an X anywhere in the grammar, and NULLABLE(α) is whether α can derive an empty sequence of terminals (denoted ε). The following definitions are taken from Torben Mogensen's free book, Basics of Compiler Design. See below for an example.

Definition: NULLABLE(X):
NULLABLE(ε) = true NULLABLE(x) = false, if x is a terminal NULLABLE(αβ) = NULLABLE(α) and NULLABLE(β) NULLABLE(P) = NULLABLE(α_1) or NULLABLE(α_2) or ... or NULLABLE(α_n), if P is a non-terminal and the right-hand-sides of all its productions are α_1, α_2, ..., α_n.
Definition: FIRST(X):
FIRST(ε) = Ø FIRST(x) = {x}, assuming x is a terminal FIRST(αβ) = FIRST(α) U FIRST(β), if NULLABLE(α) = FIRST(α), if not NULLABLE(α) FIRST(P) = FIRST(α_1) U FIRST(α_2) U ... U FIRST(α_n), if P is a non-terminal and the right-hand-sides of all its productions are α_1, α_2, ..., α_n.
Definition: FOLLOW(X):

A terminal symbol a is in FOLLOW(X) if and only if there is a derivation from the start symbol S of the grammar such that S ⇒ αX aβ, where α and β are (possibly empty) sequences of grammar symbols.

Intuition: FOLLOW(X):

Look at where X occurs in the grammar. All terminals that could follow it (directly or by any level of recursion) is in FOLLOW(X). Additionally, if X occurs at the end of a production (e.g. A -> foo X), or is followed by something else that can reduce to ε (e.g. A -> foo X B and B -> ε), then whatever A can be followed by, X can also be followed by (i.e. FOLLOW(A) ⊆ FOLLOW(X)).

See the method for determining FOLLOW(X) in Torben's book and a demonstration of it just below.

An example:
E -> n A A -> E B A -> ε B -> + A B -> * A
First, NULLABLE and FIRST and are determined:
NULLABLE(E) = NULLABLE(n A) = NULLABLE(n) ∧ NULLABLE(A) = false NULLABLE(A) = NULLABLE(E B) ∨ NULLABLE(ε) = true NULLABLE(B) = NULLABLE(+ A) ∨ NULLABLE(* A) = false FIRST(E) = FIRST(n A) = {n} FIRST(A) = FIRST(E B) U FIRST(ε) = FIRST(E) U Ø = {n} (because E is not NULLABLE) FIRST(B) = FIRST(+ A) U FIRST(* A) = FIRST(+) U FIRST(*) = {+, *}
Before FOLLOW is determined, the production E' -> E $ is added, where $ is considered the "end-of-file" non-terminal. Then FOLLOW is determined:
FOLLOW(E): Let β = $, so add the constraint that FIRST($) = {$} ⊆ FOLLOW(E) Let β = B, so add the constraint that FIRST(B) = {+, *} ⊆ FOLLOW(E) FOLLOW(A): Let β = ε, so add the constraint that FIRST(ε) = Ø ⊆ FOLLOW(A). Because NULLABLE(ε), add the constraint that FOLLOW(E) ⊆ FOLLOW(A). Let β = ε, so add the constraint that FIRST(ε) = Ø ⊆ FOLLOW(A). Because NULLABLE(ε), add the constraint that FOLLOW(B) ⊆ FOLLOW(A). Let β = ε, so add the constraint that FIRST(ε) = Ø ⊆ FOLLOW(A). Because NULLABLE(ε), add the constraint that FOLLOW(B) ⊆ FOLLOW(A). FOLLOW(B): Let β = ε, so add the constraint that FIRST(ε) = Ø ⊆ FOLLOW(B). Because NULLABLE(ε), add the constraint that FOLLOW(A) ⊆ FOLLOW(B).
Resolving these constraints (could also be achieved by fixed-point iteration),
{+, *, $} ⊆ FOLLOW(E) FOLLOW(E) ⊆ FOLLOW(A) FOLLOW(A) = FOLLOW(B) FOLLOW(E) = FOLLOW(A) = FOLLOW(B) = {+, *, $}.
Now LOOKAHEAD for each production can be determined:
LOOKAHEAD(E -> n A) = FIRST(n A) = {n} because ¬NULLABLE(n A) LOOKAHEAD(A -> E B) = FIRST(E B) because ¬NULLABLE(E B) = FIRST(E) = {n} because ¬NULLABLE(E) LOOKAHEAD(A -> ε) = FIRST(ε) U FOLLOW(A) because NULLABLE(ε) = Ø U {+, *, $} = {+, *, $} LOOKAHEAD(B -> + A) = FIRST(+ A) because ¬NULLABLE(+ A) = FIRST(+) = {+} because ¬NULLABLE(+) LOOKAHEAD(B -> * A) = {*} for the same reason
Finally, LOOKAHEAD for each non-terminal can be determined:
LOOKAHEAD(E) = LOOKAHEAD(E -> n A) = {n} LOOKAHEAD(A) = LOOKAHEAD(A -> E B) U LOOKAHEAD(A -> ε) = {n} U {+, *, $} LOOKAHEAD(B) = LOOKAHEAD(B -> + A) U LOOKAHEAD(B -> * A) = {+, *}
By this knowledge we can determine that this grammar is not LL(1) because its non-terminals have overlapping lookahead sets. (I.e., we cannot create a program that reads one symbol at a time and decides unambiguously which production to use.)

这篇关于前瞻集的精确定义是什么？的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

前瞻集的精确定义是什么？ [英] What is the precise definition of a lookahead set?

问题描述

相关文章

其他开发最新文章

热门教程

热门工具

登录关闭

前瞻集的精确定义是什么？ [英] What is the precise definition of a lookahead set?

问题描述

相关文章

其他开发最新文章

热门教程

热门工具

登录 关闭

登录关闭