Can parser error recovery be guided automatically by the grammar?

Problem description

I'm writing an LALR parser generator as a pet project.

I'm using the purple dragon book to help me with the design, and what I gather from it is that there are four methods of error recovery in a parser:

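  • Panic mode: start discarding input symbols until a symbol pre-selected by the compiler designer is found
  • Phrase-level recovery: modify the input string into something that allows the current production to reduce
  • Error productions: anticipate errors by incorporating them into the grammar
  • Global correction: a far more complicated version of phrase-level recovery (as I understand it)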

Two of these require modifying the input string (which I'd like to avoid), and the other two require the compiler designer to anticipate errors and design the error recovery based on their knowledge of the language. But the parser generator also has knowledge about the language, so I'm curious if there's a better way to recover from parsing errors without pre-selecting synchronizing tokens or filling up the grammar with error productions.

Instead of picking synchronizing tokens, can't the parser just treat symbols in the follow of all the nonterminals the current production can reduce to as synchronizing tokens? I haven't really worked out how well that would work - I visualize the parser being down a chain of in-progress productions but of course that's not how bottom-up parsers work. Would it produce too many irrelevant errors trying to find a workable state? Would it attempt to resume the parser in an invalid state? Is there a good way to pre-fill the parser table with valid error actions so the actual parsing program doesn't have to reason about where to go next when an error is encountered?
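
To make the idea concrete, here is a minimal sketch of how a generator could derive candidate synchronizing tokens from FOLLOW sets. The toy grammar and all names (`GRAMMAR`, `compute_first`, `compute_follow`) are assumptions made for illustration, not part of any real generator:

```python
from collections import defaultdict

EPSILON = ""   # marker for the empty string
END = "$"      # end-of-input marker

# Hypothetical toy grammar: nonterminal -> list of right-hand sides.
GRAMMAR = {
    "S": [["E"]],
    "E": [["E", "+", "T"], ["T"]],
    "T": [["(", "E", ")"], ["id"]],
}
START = "S"


def first_of_sequence(symbols, first):
    """FIRST set of a sequence of grammar symbols."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in GRAMMAR else {sym}
        result |= sym_first - {EPSILON}
        if EPSILON not in sym_first:
            return result
    result.add(EPSILON)  # every symbol in the sequence can derive epsilon
    return result


def compute_first():
    """Iterate to a fixed point over all productions."""
    first = defaultdict(set)
    changed = True
    while changed:
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                new = first_of_sequence(body, first)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first


def compute_follow(first):
    """FOLLOW(A): terminals that can appear immediately after A."""
    follow = {nt: set() for nt in GRAMMAR}
    follow[START].add(END)
    changed = True
    while changed:
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in GRAMMAR:
                        continue  # only nonterminals have FOLLOW sets
                    tail = first_of_sequence(body[i + 1:], first)
                    new = tail - {EPSILON}
                    if EPSILON in tail:
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


follow = compute_follow(compute_first())
```

In this toy grammar `follow["E"]` and `follow["T"]` come out identical. FOLLOW is computed for the grammar as a whole rather than for a particular parser state, so a synchronizing set built this way may include tokens that are not actually valid at the point where the error occurred.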

Recommended answer

It's way too easy to get lost in a dead-end when you try to blindly follow all available productions. There are things that you know about your language which it would be very difficult for the parser generator to figure out. (Like, for example, that skipping to the next statement delimiter is very likely to allow the parse to recover.)
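
As a rough illustration of the "skip to the next statement delimiter" heuristic, here is a minimal sketch. It uses a hand-written parser for a hypothetical one-statement language (`id = num ;`) rather than an LALR table, and the names `SYNC_TOKENS`, `ParseError`, `parse_statement` and `parse_program` are all made up for the example:

```python
SYNC_TOKENS = {";"}  # assumption: ';' ends a statement in the toy language


class ParseError(Exception):
    """Carries the token position at which parsing failed."""

    def __init__(self, pos, message):
        super().__init__(message)
        self.pos = pos


def parse_statement(tokens, pos):
    """Parse `id = num ;` starting at pos; return the position after it."""
    for kind in ["id", "=", "num", ";"]:
        if pos >= len(tokens) or tokens[pos] != kind:
            found = tokens[pos] if pos < len(tokens) else "end of input"
            raise ParseError(pos, f"expected {kind!r} at token {pos}, found {found!r}")
        pos += 1
    return pos


def parse_program(tokens):
    """Parse a sequence of statements, collecting one error per bad statement."""
    errors = []
    pos = 0
    while pos < len(tokens):
        try:
            pos = parse_statement(tokens, pos)
        except ParseError as err:
            errors.append(str(err))
            # Panic mode: discard tokens up to the next delimiter.
            pos = err.pos
            while pos < len(tokens) and tokens[pos] not in SYNC_TOKENS:
                pos += 1
            pos += 1  # step past the delimiter itself (or past the end)
    return errors
```

Calling `parse_program` on a token list where the middle statement is missing its `=` reports that single error and still parses the statement that follows. The point is that the choice of `;` comes from knowing the language, which is exactly the knowledge the generator lacks.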

That's not to say that automated procedures haven't been tried. There is a long section about it in Parsing Theory (Sippu & Soisalon-Soininen). (Unfortunately, this article is paywalled, but if you have an ACM membership or access to a good library, you can probably find it.)

On the whole, the yacc strategy has proven to be "not awful", and even "good enough". There is one well-known way of making it better, which is to collect really bad syntax error messages (or failed error recovery), trace them to the state which is active when they occur (which is easy to do), and attach an error recovery procedure to that precise state and lookahead token. See, for example, Russ Cox's approach.
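
A minimal sketch of that idea, as I read it from the description above (it is not Russ Cox's actual tooling): run known-bad example inputs through the parser, record the (state, lookahead) pair at which each one fails, and key a better message on that pair. The `ACTION` table below is hand-written for a hypothetical one-rule grammar, and every name in it is an assumption:

```python
# Hand-written, hypothetical ACTION table for the one-rule grammar
#   stmt -> id = num ;
# Every entry is a shift to the next state; state 4 means the statement is complete.
ACTION = {
    (0, "id"): 1,
    (1, "="): 2,
    (2, "num"): 3,
    (3, ";"): 4,
}
ACCEPT_STATE = 4
END = "$"


def find_error(tokens):
    """Return (state, lookahead) where the parse fails, or None on success."""
    state = 0
    for tok in list(tokens) + [END]:
        if state == ACCEPT_STATE and tok == END:
            return None
        if (state, tok) not in ACTION:
            return (state, tok)
        state = ACTION[(state, tok)]
    return None  # not reached: END is never shifted


# Example bad inputs paired with the message we wish the parser had produced.
EXAMPLES = [
    (["id", "num", ";"], "missing '=' between name and value"),
    (["id", "=", "num"], "missing ';' at end of statement"),
]


def build_message_table(examples):
    """Map each example's failing (state, lookahead) pair to its message."""
    table = {}
    for tokens, message in examples:
        where = find_error(tokens)
        if where is None:
            raise ValueError(f"example {tokens} parses without error")
        table[where] = message
    return table


def diagnose(tokens, table):
    """Return None for valid input, else a specific or generic message."""
    where = find_error(tokens)
    if where is None:
        return None
    return table.get(where, f"syntax error near {where[1]!r}")


MESSAGES = build_message_table(EXAMPLES)
```

Because the table is rebuilt from the examples each time, the messages stay attached to the right states even when the grammar changes and the generator renumbers them. The same lookup could return a recovery procedure instead of a string.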
