Resetting the state of flex and/or bison

Question

As part of a toy project I've been trying to make a small modification of someone else's parser based on flex/bison. I'm really not experienced with either. You can find the original parser here.

I've been trying to put together a simple function that accepts a string and returns a parse tree, so I can expose this via FFI for use in another programming language. What I have is mostly based on the main() function in the original program, my butchered version is below:

TreeNode* parse_string(char *s)
{
    FILE *in = fmemopen(s, strlen(s), "r");
    lex2_initialise();
    parse_file(in);
    fclose(in);
    preprocess_tokens();
    yyparse();
    return top;
}

This actually works fine, at least the first time I call it. The second time it complains about misparsed tokens, and the error reporting function used appears to be called from somewhere inside a maze of goto statements within the generated parser during the call to yyparse(), at which point I don't understand what's going on anymore.

The original program itself only appears to be designed to take all its input upfront and then exit, so it doesn't leave me with much clue of what I'm missing. Putting aside the not-altogether-outlandish idea some old state is being retained elsewhere in the rest of the program, my main questions are:

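  • Do either Flex or Bison maintain global state between calls to yyparse()?
  • Is there some simple function call I could put at the end of the function above to wipe it all and reset everything back to the initial state?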

Recommended answer

Do either Flex or Bison maintain global state between calls to yyparse()?

Flex maintains information about the current input stream. If the parse does not consume the entire input stream (which is quite common for parsers which terminate abnormally on errors), then the next call to yyparse will continue reading from where the previous one left off. Providing a new input buffer will (mostly) reset the lexer's state, but there may be some aspects which have not been reset, notably the current start condition, and the condition stack if that option has been enabled.

The bison-generated parser does not rely on global state. It is designed to clear its internal state prior to returning from yyparse. However, if a parser action executes a return statement directly (this is not recommended), then the cleanup will be bypassed, which is likely to create a memory leak. Actions which prematurely terminate the parse should use the macros YYACCEPT or YYABORT rather than a return statement.
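The difference between a direct return and YYABORT can be illustrated with a toy model. The sketch below is not bison output: toy_parse, TOY_ABORT and live_resources are hypothetical names, and a simple counter stands in for whatever the real parser would have to release. It only assumes what is stated above, namely that the abort macro routes control through the cleanup code while a bare return skips it.

```c
#include <assert.h>
#include <stddef.h>

/* Counts resources the toy parser currently holds (a stand-in for
   memory a real parser would own). Hypothetical, not part of bison. */
int live_resources = 0;

/* Models the idea behind YYABORT: set the result, then jump to the
   cleanup code instead of leaving the function directly. */
#define TOY_ABORT do { result = 1; goto cleanup; } while (0)

/* Toy parse routine: any negative "token" is an error.
   If use_return is nonzero, the failing "action" returns directly. */
int toy_parse(const int *tokens, size_t n, int use_return)
{
    int result = 0;

    assert(tokens != NULL);
    live_resources++;                 /* acquire the parser's resource */

    for (size_t i = 0; i < n; i++) {
        if (tokens[i] < 0) {
            if (use_return)
                return 1;             /* skips cleanup: resource stays held */
            TOY_ABORT;                /* reaches cleanup before returning */
        }
    }

cleanup:
    live_resources--;                 /* release the parser's resource */
    return result;
}
```

After a failing call with use_return set to 0 the counter is back at zero; with use_return set to 1 it stays at one, which is the toy equivalent of the leak described above.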

Is there some simple function call I could put at the end of the function above to wipe it all and reset everything back to the initial state?

The default flex-generated scanner, which is designed to be called every time a token is required, relies heavily on global variables. Most, but not all, of the flex state is kept in the current YY_BUFFER_STATE (itself held in a global variable), and that object can be reset by the yyrestart function, or by any of the functions which provide a character buffer as lexer input. However, these functions do not reset the start condition, nor do they flush the start-condition stack (if enabled) or the buffer stack. If you want to reset the state completely, you need to flush the stacks manually and reset the start condition with BEGIN(INITIAL).
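To make the failure mode concrete, here is a small hand-written stand-in for a scanner that keeps its state in globals. It is only a sketch: every name in it (toy_set_input, toy_reset, toy_lex and so on) is hypothetical and none of it is flex API. toy_set_input plays the role of supplying a new input buffer, and toy_reset plays the role of additionally going back to the initial start condition.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Toy stand-in for a non-reentrant scanner: all state lives in globals.
   These names are hypothetical and do not come from flex. */
enum toy_condition { TOY_INITIAL, TOY_COMMENT };
enum toy_token { TOK_EOF, TOK_WORD };

static const char *toy_cursor = "";                 /* current input position */
static enum toy_condition toy_state = TOY_INITIAL;  /* current "start condition" */

/* Point the scanner at new input. Like handing a scanner a new buffer,
   this deliberately leaves the start condition untouched. */
void toy_set_input(const char *s)
{
    assert(s != NULL);
    toy_cursor = s;
}

/* Full reset: new input plus the initial start condition. */
void toy_reset(const char *s)
{
    toy_set_input(s);
    toy_state = TOY_INITIAL;
}

/* Return the next token. '{' opens a comment and '}' closes it;
   everything inside a comment is skipped. */
enum toy_token toy_lex(void)
{
    for (;;) {
        char c = *toy_cursor;
        if (c == '\0')
            return TOK_EOF;
        if (toy_state == TOY_COMMENT) {
            if (c == '}')
                toy_state = TOY_INITIAL;
            toy_cursor++;
        } else if (c == '{') {
            toy_state = TOY_COMMENT;
            toy_cursor++;
        } else if (isalpha((unsigned char)c)) {
            while (isalpha((unsigned char)*toy_cursor))
                toy_cursor++;
            return TOK_WORD;
        } else {
            toy_cursor++;             /* skip whitespace and anything else */
        }
    }
}

/* Count the words reported until end of input. */
int toy_count_words(void)
{
    int n = 0;
    while (toy_lex() == TOK_WORD)
        n++;
    return n;
}
```

If the first input ends inside an unterminated comment, a later toy_set_input leaves the scanner in the comment condition and the next input is silently skipped, whereas toy_reset restores the expected behaviour. In a real flex scanner the corresponding steps would be the buffer functions plus BEGIN(INITIAL), as described above.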

One approach to making a more easily restartable scanner is to build a reentrant scanner. A reentrant scanner keeps all of its state (including start conditions and buffer stack) in a scanner structure, which means that you can completely reset the scanner state simply by creating a new scanner structure (and, of course, destroying the old one to avoid leaking memory.)
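The idea can be sketched with a hand-written toy scanner whose entire state sits in one structure. This is an illustration under assumptions, not flex-generated code: struct toy_scanner, toy_scanner_create, toy_scanner_destroy and count_words are hypothetical names. With flex itself the equivalent pattern would use %option reentrant together with yylex_init and yylex_destroy on a yyscan_t.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Toy stand-in for a reentrant scanner's state structure.
   Hypothetical names; a flex-generated scanner has its own type and API. */
enum toy_condition { TOY_INITIAL, TOY_COMMENT };

struct toy_scanner {
    const char *cursor;               /* current input position */
    enum toy_condition state;         /* current "start condition" */
};

/* Allocate a scanner in its initial state; returns NULL on failure. */
struct toy_scanner *toy_scanner_create(const char *input)
{
    struct toy_scanner *sc;

    assert(input != NULL);
    sc = malloc(sizeof *sc);
    if (sc != NULL) {
        sc->cursor = input;
        sc->state = TOY_INITIAL;
    }
    return sc;
}

void toy_scanner_destroy(struct toy_scanner *sc)
{
    free(sc);
}

/* Scan one word; returns 1 if a word was found, 0 at end of input.
   '{' opens a comment and '}' closes it. */
int toy_scan_word(struct toy_scanner *sc)
{
    for (;;) {
        char c = *sc->cursor;
        if (c == '\0')
            return 0;
        if (sc->state == TOY_COMMENT) {
            if (c == '}')
                sc->state = TOY_INITIAL;
            sc->cursor++;
        } else if (c == '{') {
            sc->state = TOY_COMMENT;
            sc->cursor++;
        } else if (isalpha((unsigned char)c)) {
            while (isalpha((unsigned char)*sc->cursor))
                sc->cursor++;
            return 1;
        } else {
            sc->cursor++;             /* skip whitespace and anything else */
        }
    }
}

/* Count the words in s with a scanner that exists only for this call.
   Returns -1 if the scanner could not be allocated. */
int count_words(const char *s)
{
    struct toy_scanner *sc = toy_scanner_create(s);
    int n = 0;

    if (sc == NULL)
        return -1;
    while (toy_scan_word(sc))
        n++;
    toy_scanner_destroy(sc);
    return n;
}
```

Because each call to count_words builds and destroys its own scanner, an input that ends inside an unterminated comment cannot affect the next call.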

There are lots of good reasons to use reentrant scanners [Note 1]. For one thing, a reentrant scanner allows you to have more than one parser active at the same time, and it eliminates a reliance on global state. Unfortunately, it's not as simple as just setting a flex option.

Reentrant scanners have a different API (which includes a pointer to the scanner state structure). This state structure needs to be passed into yyparse and yyparse needs to pass it to yylex; all of this requires some modifications to the bison options. Also, reentrant scanners cannot use the global yylval to communicate the semantic value of a token to the parser [Note 2].
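The shape of that interface can be sketched in plain C. Everything below is a hypothetical toy (union toy_value, struct toy_scanner, toy_lex, toy_parse_sum), not what flex or bison generate: it only shows the parser receiving the scanner state, forwarding it to the lexer, and getting the semantic value back through a pointer rather than a global. In a real grammar, that plumbing is normally requested with bison directives such as %lex-param and %parse-param.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical semantic-value type; in a bison parser this role is
   played by YYSTYPE. */
union toy_value { int number; };

/* Hypothetical scanner state, owned by the caller. */
struct toy_scanner { const char *cursor; };

enum { TOY_EOF = 0, TOY_NUMBER = 1 };

/* Reentrant-style lexer: the semantic value and the scanner state are
   both parameters, so no globals are involved. */
int toy_lex(union toy_value *lval, struct toy_scanner *sc)
{
    assert(lval != NULL && sc != NULL);

    /* Skip everything that is not a digit. */
    while (*sc->cursor != '\0' && !isdigit((unsigned char)*sc->cursor))
        sc->cursor++;
    if (*sc->cursor == '\0')
        return TOY_EOF;

    /* Accumulate a decimal number into the caller's value slot. */
    lval->number = 0;
    while (isdigit((unsigned char)*sc->cursor)) {
        lval->number = lval->number * 10 + (*sc->cursor - '0');
        sc->cursor++;
    }
    return TOY_NUMBER;
}

/* Toy "parser": receives the scanner state, passes it on to the lexer,
   and sums the numbers it is handed. */
int toy_parse_sum(struct toy_scanner *sc)
{
    union toy_value lval;
    int sum = 0;

    while (toy_lex(&lval, sc) == TOY_NUMBER)
        sum += lval.number;
    return sum;
}
```

Two scanners can be alive at the same time here, since neither the position nor the semantic value is shared between them.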

If you use the flex bison-bridge option (%option bison-bridge) and tell bison to generate a reentrant parser, then yylex will expect to be called with another additional parameter (or two, if you use locations), and the reentrant bison parser will supply the additional parameters. That all works fine, but it has the effect of changing yylval (and yylloc, if used) to a pointer, which means that you need to go through all the scanner actions changing yylval.something to yylval->something.

Notes

  1. You can also create a reentrant parser, using some additional bison options. Normally, the only mutable globals used by a bison-generated parser are yylval and yylloc (if you use location reporting). (And yynerrs, but it is rare to refer to that variable outside of a parser action.) Specifying a reentrant parser turns those globals into lexer arguments, but it does not create an externally visible parser state structure. But it also gives you the option of using a "push parser", which does have a persistent parser state structure. In some cases, the flexibility of push parsers can significantly simplify scanners.

  2. Strictly speaking, nothing stops you from creating a reentrant scanner which still uses globals to communicate with the parser, except that it is not really reentrant any more. I wouldn't recommend this option for obvious reasons, but you might want to do it as a transitional strategy, since it requires less modification to the parser and to scanner actions.
