Earley无法处理图表中已包含的ε状态 [英] Earley cannot handle epsilon-states already contained in chart

查看:136
本文介绍了Earley无法处理图表中已包含的ε状态的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我已经使用队列处理状态来实现 Earley解析器.队列中植入了顶级规则.对于队列中的每个状态,通过将新状态添加到队列中来执行操作之一(预测,扫描,完成). 没有添加重复的状态.

我遇到的问题最好用以下语法来描述:

解析A时,会发生以下情况:

如您所知, A不会得到完全解决.这是因为带有epsilon状态的完成只会发生一次,因为它不会添加到队列中.

如何调整我的算法以支持这些ε状态?

请注意,使用终端机时这不是问题,因为将创建新的图表集以插入扫描状态.由于状态尚不存在,因此将对其进行处理.

解决方案

在第4部分中隐藏的John a Aycock和R. Nigel Horspool撰写的"Practical Earley Parsing"一文中,是关于如何处理可为空的规则的说明:

如果[A→ ...•B ...,j]在S i 中,添加[B→ •a,i]
到S i 的所有规则B→一种.
如果B为可为空,则还要添加[A→ ... B•...,j]到S i

因此,在您的示例中,根据 A→的预测•B B 将产生以下规则:

(1)B→ •
(2)→ B•B
(3)→ B B•

关键是这发生在预测阶段.在预测阶段,如果后点"符号可为空(直接或通过转移),则将其向右移动并添加该规则.

所以基本上:

A→ •B B 产生( B→• A→ B•B ),每个都排队并处理
A→ B•B 产生( A→ B B•),该队列已排队并已处理

I have implemented the Earley parser using a queue to process states. The queue is seeded with the top-level rule. For each state in the queue, one of the operations (prediction, scanning, completion) is performed by adding new states to the queue. Duplicate states are not added.

The problem I am having is best described with the following grammar:

When parsing A, the following happens:

As you can tell, A will not be fully resolved. This is because the completion with the epsilon state will only happen once as it is not added to the queue.

How can I adapt my algorithm to support these epsilon-states?

Edit: Note that this is not an issue when using terminals as a new chart set will be created to insert the scanned state. As the state does not exist there already, it will be processed.

解决方案

In the paper "Practical Earley Parsing" by John a Aycock and R. Nigel Horspool hidden in section 4 is a statement on how to handle nullable rules:

If [A→ ... •B ..., j] is in Si, add [B→ • a, i]
to Si for all rules B → a.
If B is nullable, also add [A → ... B • ..., j] to Si

So in your example, in the prediction of A→ • B B the following rules would be produced:

(1) B → •
(2) A → B • B
(3) A → B B •

The key is this happens in the prediction phase. During the prediction phase if the 'post dot' symbol is nullable (both directly and through transference) then move the dot right and add that rule as well.

So basically:

A → • B B produces (B → • and A → B • B) each being queued and processed
A → B • B produces (A → B B •) which is queued and processed

这篇关于Earley无法处理图表中已包含的ε状态的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆