Scanner and parser interaction

Question

I am new to flex/bison. From the books I have read, it seems that in nearly all compiler implementations the parser interacts with the scanner in a "coroutine" manner: whenever the parser needs a token, it calls the scanner to get one, and it leaves the scanner aside while it is busy with shift/reduce. A natural question is why not let the scanner produce the whole token stream (from the input byte stream) up front and then pass the entire token stream to the parser, so that there is no explicit interaction between the two? I can imagine that this approach has some drawbacks, and I can also see some benefits of doing so.

My question is: is there any "comprehensive" discussion of this aspect, or is there any compiler implementation that uses a scanner/parser interaction scheme other than the "coroutine" manner?

Recommended answer

In the traditional arrangement, the parser calls the scanner whenever it needs a token.

That's the same logic the scanner itself (like many other programs) uses when it calls the I/O library every time it needs more input. That's not usually described as a coroutine, and I'm not convinced it's an accurate description of the parser/scanner interaction either.

In coroutine control flow, two functions call each other in tandem. That's not usually the way I/O is handled. The fread() interface does maintain state for the next call (the file position, at least, and maybe a buffer), but the calls are self-contained.

In a sense, there is no difference between calling yylex() to get the next token and calling scanf() to get the next data value.
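
To make the pull arrangement concrete, here is a minimal sketch in plain C that does not use flex or bison. Every name in it (`next_token`, `parse_sum`, the `TOK_*` kinds) is made up for illustration, and the "grammar" is just a sum of integers. The point is only that the parser owns the loop and each scanner call is an ordinary, self-contained function call that keeps a little state between calls.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds for a tiny "sum of integers" language. */
enum token_kind { TOK_EOF, TOK_NUM, TOK_PLUS, TOK_ERROR };

struct token {
    enum token_kind kind;
    int value;              /* meaningful only for TOK_NUM */
};

/* Scanner state kept between calls, much as a FILE keeps its position. */
struct scanner {
    const char *cursor;
};

/* Pull interface: each call is self-contained and returns one token. */
struct token next_token(struct scanner *s)
{
    struct token t = { TOK_EOF, 0 };

    while (isspace((unsigned char)*s->cursor))
        s->cursor++;

    if (*s->cursor == '\0')
        return t;

    if (isdigit((unsigned char)*s->cursor)) {
        t.kind = TOK_NUM;
        while (isdigit((unsigned char)*s->cursor))
            t.value = t.value * 10 + (*s->cursor++ - '0');
        return t;
    }

    t.kind = (*s->cursor == '+') ? TOK_PLUS : TOK_ERROR;
    s->cursor++;
    return t;
}

/* The "parser" owns the loop and asks for a token whenever it needs one.
   Grammar: NUM ('+' NUM)* EOF. Returns 0 on success, -1 on a syntax error. */
int parse_sum(const char *text, int *result)
{
    struct scanner s = { text };
    struct token t;

    assert(result != NULL);

    t = next_token(&s);
    if (t.kind != TOK_NUM)
        return -1;
    *result = t.value;

    for (;;) {
        t = next_token(&s);
        if (t.kind == TOK_EOF)
            return 0;
        if (t.kind != TOK_PLUS)
            return -1;
        t = next_token(&s);
        if (t.kind != TOK_NUM)
            return -1;
        *result += t.value;
    }
}
```

Calling `parse_sum("1 + 20 + 300", &r)` would leave 321 in `r`. Nothing here resumes a suspended computation; `next_token` returns normally each time, which is why the comparison with scanf() seems apt.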

This is not always the most convenient architecture for a scanner. Sometimes it would be convenient for the scanner to be able to feed tokens into the parser. A typical use case is when the scanner is generating tokens, for example through macro expansion, but sometimes it is just that the match of a single scanner pattern contains more than one token.
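
One common workaround in the pull model is for the scanner to keep a small queue of pending tokens. The sketch below is an assumption-laden illustration, not flex output: it invents a pattern in which a number immediately followed by a letter (such as `3x`) stands for three tokens, NUM, STAR and IDENT. All names (`scan_one_pattern`, `enqueue`, `PENDING_MAX`) are hypothetical.

```c
#include <assert.h>
#include <ctype.h>

enum token_kind { TOK_EOF, TOK_NUM, TOK_STAR, TOK_IDENT, TOK_ERROR };

struct token {
    enum token_kind kind;
    int value;              /* number for TOK_NUM, character for TOK_IDENT */
};

#define PENDING_MAX 4       /* no pattern below yields more than 3 tokens */

struct scanner {
    const char *cursor;
    struct token pending[PENDING_MAX];  /* ring buffer of unread tokens */
    int head;
    int count;
};

void scanner_init(struct scanner *s, const char *text)
{
    s->cursor = text;
    s->head = 0;
    s->count = 0;
}

void enqueue(struct scanner *s, enum token_kind kind, int value)
{
    int slot;

    assert(s->count < PENDING_MAX);
    slot = (s->head + s->count) % PENDING_MAX;
    s->pending[slot].kind = kind;
    s->pending[slot].value = value;
    s->count++;
}

/* Match one pattern; a single match may enqueue several tokens. */
void scan_one_pattern(struct scanner *s)
{
    while (isspace((unsigned char)*s->cursor))
        s->cursor++;

    if (*s->cursor == '\0') {
        enqueue(s, TOK_EOF, 0);
    } else if (isdigit((unsigned char)*s->cursor)) {
        int value = 0;
        while (isdigit((unsigned char)*s->cursor))
            value = value * 10 + (*s->cursor++ - '0');
        enqueue(s, TOK_NUM, value);
        if (isalpha((unsigned char)*s->cursor)) {
            /* "3x" is treated as NUM STAR IDENT (implicit multiplication). */
            enqueue(s, TOK_STAR, 0);
            enqueue(s, TOK_IDENT, *s->cursor++);
        }
    } else if (isalpha((unsigned char)*s->cursor)) {
        enqueue(s, TOK_IDENT, *s->cursor++);
    } else {
        enqueue(s, TOK_ERROR, 0);
        s->cursor++;
    }
}

/* Pull interface: drain the queue first, scan only when it is empty. */
struct token next_token(struct scanner *s)
{
    struct token t;

    if (s->count == 0)
        scan_one_pattern(s);
    t = s->pending[s->head];
    s->head = (s->head + 1) % PENDING_MAX;
    s->count--;
    return t;
}
```

The queue works, but it is bookkeeping the scanner needs only because it cannot hand tokens to the parser directly, which is the inconvenience described above.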

Many parser generators, including Bison, can generate callable parsers, usually called "push parsers". In this model, the scanner calls the parser with each successive token. This is still not a coroutine model, really; it is just control-flow inversion. In the analogy with ordinary I/O, it's the equivalent of taking a data processor which called fgets() to read each input line and rewriting it as a process_line() function which is given a line of data to process (and thus does not interact with the I/O library). An early implementation of push parsing can be found in the Lemon parser generator.
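
The inversion can be illustrated with a hand-written sketch for a toy grammar (a sum of integers). It does not use Bison's generated push interface; `push_token`, `scan_and_push` and the other names are invented for this example. The parser becomes a small state record plus a function that accepts one token per call, and the scanner owns the loop.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

enum token_kind { TOK_EOF, TOK_NUM, TOK_PLUS, TOK_ERROR };

enum push_status { PUSH_MORE, PUSH_DONE, PUSH_SYNTAX_ERROR };

/* All parser state lives here, because push_token() returns to its caller
   after every token and cannot keep anything on its own stack. */
struct push_parser {
    int expect_number;      /* 1: next token must be NUM; 0: '+' or EOF */
    int sum;
};

void push_parser_init(struct push_parser *p)
{
    p->expect_number = 1;
    p->sum = 0;
}

/* Push interface: the caller hands over exactly one token per call.
   Grammar: NUM ('+' NUM)* EOF. */
enum push_status push_token(struct push_parser *p,
                            enum token_kind kind, int value)
{
    if (p->expect_number) {
        if (kind != TOK_NUM)
            return PUSH_SYNTAX_ERROR;
        p->sum += value;
        p->expect_number = 0;
        return PUSH_MORE;
    }
    if (kind == TOK_EOF)
        return PUSH_DONE;
    if (kind != TOK_PLUS)
        return PUSH_SYNTAX_ERROR;
    p->expect_number = 1;
    return PUSH_MORE;
}

/* The scanner now owns the loop and calls the parser for each token. */
enum push_status scan_and_push(const char *text, struct push_parser *p)
{
    enum push_status status = PUSH_MORE;

    assert(p != NULL);
    while (status == PUSH_MORE) {
        while (isspace((unsigned char)*text))
            text++;

        if (*text == '\0') {
            status = push_token(p, TOK_EOF, 0);
        } else if (isdigit((unsigned char)*text)) {
            int value = 0;
            while (isdigit((unsigned char)*text))
                value = value * 10 + (*text++ - '0');
            status = push_token(p, TOK_NUM, value);
        } else {
            enum token_kind kind = (*text == '+') ? TOK_PLUS : TOK_ERROR;
            text++;
            status = push_token(p, kind, 0);
        }
    }
    return status;
}
```

With this shape, a scanner action that produces several tokens can simply call `push_token` several times in a row. In Bison itself, a push parser is requested with `%define api.push-pull push` and driven through `yypush_parse`; the sketch above only mimics that shape.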

Coroutine-like control flow could be useful for creating a parser whose eventual input stream must be handled asynchronously. But that doesn't really require coroutining between the parser and the scanner; rather, it requires coroutining between the scanner and the input stream. Again, coroutining is not really necessary and might be overkill: inverting control flow should suffice. Flex does not provide a "push scanner" interface, but other scanner generators do. I believe this feature is supported by re2c, for example.
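
As a rough illustration of what a "push scanner" means, the following hand-written sketch accepts input in arbitrary chunks and remembers a partially scanned number across chunk boundaries. It is not based on re2c or any other generator; `push_scanner_feed`, `push_scanner_finish`, the `token_sink` callback and the `collect_sink` helper are all hypothetical names.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

enum token_kind { TOK_EOF, TOK_NUM, TOK_PLUS, TOK_ERROR };

/* Callback that receives each completed token (for example a push parser). */
typedef void (*token_sink)(void *ctx, enum token_kind kind, int value);

struct push_scanner {
    int in_number;          /* 1 while a number is only partially seen */
    int value;              /* digits accumulated so far */
    token_sink sink;
    void *ctx;
};

void push_scanner_init(struct push_scanner *s, token_sink sink, void *ctx)
{
    assert(sink != NULL);
    s->in_number = 0;
    s->value = 0;
    s->sink = sink;
    s->ctx = ctx;
}

/* Emit the pending number, if any. */
void flush_number(struct push_scanner *s)
{
    if (s->in_number) {
        s->sink(s->ctx, TOK_NUM, s->value);
        s->in_number = 0;
        s->value = 0;
    }
}

/* Feed one chunk of input; a token may straddle two chunks. */
void push_scanner_feed(struct push_scanner *s, const char *chunk, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)chunk[i];

        if (isdigit(c)) {
            s->value = s->value * 10 + (c - '0');
            s->in_number = 1;
            continue;
        }
        flush_number(s);
        if (c == '+')
            s->sink(s->ctx, TOK_PLUS, 0);
        else if (!isspace(c))
            s->sink(s->ctx, TOK_ERROR, 0);
    }
}

/* Signal end of input: flush any pending number, then emit EOF. */
void push_scanner_finish(struct push_scanner *s)
{
    flush_number(s);
    s->sink(s->ctx, TOK_EOF, 0);
}

/* Example sink: adds up the numbers and counts the tokens it receives. */
struct collected {
    int sum;
    int tokens;
};

void collect_sink(void *ctx, enum token_kind kind, int value)
{
    struct collected *c = ctx;

    c->tokens++;
    if (kind == TOK_NUM)
        c->sum += value;
}
```

An event loop could call `push_scanner_feed` whenever a read completes and `push_scanner_finish` at end of input; no coroutine is involved, only a scanner whose state survives between calls.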
