Flex newline scanning for bison


Problem description

I'd like to use the same flex/bison scanner/parser for an interpreter and for loading a file to be interpreted. I cannot get the newline parsing to work correctly in both cases.

  1. Interpreter: There is a prompt and I can enter commands terminated by pressing ENTER.
  2. File: Here is an example input file:

-----cut---------

begin(
    print("well done"), 1)

----cut-------

So, there is a newline in the first line and after the '(' that should be eaten.

In my scanner.l I have

%%
[ \t]                       {   errorLineCol += strlen(yytext); }

\n                          {   errorLineNumber++;
                                errorLineCol = 0; }

("-"?[0-9])[0-9]*           {   errorLineCol += strlen(yytext);
                                yylval = stringToInteger(yytext);
                                return TINTEGER; }

.....

This works for the file scenario but not for the interpreter: there I have to press an additional Ctrl+D after the ENTER. If I change the newline rule to

\n                          {   errorLineNumber++;
                                errorLineCol = 0;
                                return 0; }

then the interpreter works but file reading does not: it stops after the first newline it encounters. What is a good way to tackle this issue?

Edit:

Here is the top level of the parser:

input: uexpr                        {   parseValue = $1; }
    | /* empty */                   {   parseValue = myNull; }
    | error                         {   parseValue = myNull; }
    ;

uexpr: list                          
    | atom                         
    ;

Possible solution: this seems to work:

\n                          {   errorLineNumber++;
                                errorLineCol = 0;
                                if (yyin == stdin) return 0; }

Solution

The main problem is that your parser function yyparse does not return until it has reduced the entire input to the grammar's start symbol.

If the top level of your grammar is something like:

language : commands ;

commands : command commands | /* empty */ ;

of course the machine will expect a complete script (terminated by you hitting Ctrl-D). If your interpreter has this logic:

loop:
  print("prompt>")
  yyparse()
  if (empty statement)
    break

it won't work since yyparse is consuming the whole script before returning.

The return 0; solves the problem for this interactive mode because the token value 0 indicates EOF to the parser, making it think the script has ended.

I do not agree with the solution of making \n a token. It will only complicate the grammar (a hitherto insignificant piece of whitespace is now significant) and ultimately not work because the yyparse function will still want to process the complete grammar. That is to say, if you have newline as a token, but the grammar's start symbol represents the entire script, yyparse will still not return to your interactive prompt loop.

A quick and dirty hack is to let the lexer know whether interactive mode is in effect. Then it can conditionally return 0; for every instance of a newline if it is in interactive mode. If the input isn't a complete statement, there will be a syntax error, since the script as a whole ends at the newline. In normal file reading mode, your lexer can eat all whitespace without returning, as before, allowing the whole file to be processed with a single yyparse.
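
The effect of such a flag can be modelled in plain C. The following is only a sketch: next_token is a hand-written stand-in for the generated yylex, interactive_mode is a hypothetical flag set by the driver, and the numeric value of TINTEGER is arbitrary. In the real scanner.l the change would amount to adding "if (interactive_mode) return 0;" to the \n rule.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Token codes for this sketch; 0 is what the parser treats as end of input. */
enum { T_EOF = 0, TINTEGER = 258 };

/* Hypothetical flag: the driver sets it to 1 when reading from a prompt. */
int interactive_mode = 0;

/* Hand-written stand-in for yylex(): returns the next token code from *p
 * and advances *p past the consumed characters. */
int next_token(const char **p)
{
    assert(p != NULL && *p != NULL);
    for (;;) {
        unsigned char c = (unsigned char)**p;
        if (c == '\0')
            return T_EOF;            /* real end of input */
        if (c == ' ' || c == '\t') {
            (*p)++;                  /* blanks are always skipped */
            continue;
        }
        if (c == '\n') {
            (*p)++;
            if (interactive_mode)
                return T_EOF;        /* interactive: the newline ends the parse */
            continue;                /* file mode: newline is plain whitespace */
        }
        if (isdigit(c)) {
            while (isdigit((unsigned char)**p))
                (*p)++;
            return TINTEGER;
        }
        (*p)++;
        return c;                    /* any other character is its own token */
    }
}

/* Counts the tokens seen before the first end-of-input token. */
int count_tokens(const char *text, int interactive)
{
    int n = 0;
    interactive_mode = interactive;
    while (next_token(&text) != T_EOF)
        n++;
    return n;
}
```

For the input "(1\n2)\n", count_tokens sees four tokens in file mode but only two in interactive mode, because the first newline is reported as end of input. That is exactly why an incomplete statement typed at the prompt turns into a syntax error.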

If you want interactive input and file reading without implementing two modes of behavior in the lexer, what you can do is change the grammar so it only parses one statement of the language: the yyparse function returns for every top level statement of your language. (And the lexer eats newlines like before, without returning 0.) I.e. the start symbol of the grammar is just one statement (possibly empty). Then your file parser must be implemented as a loop (written by you) which calls yyparse to get all the statements from the file until yyparse encounters an empty input. The downside of this approach is that if the user types incomplete syntax (e.g. a dangling open parenthesis), the parser will keep scanning the input until it is satisfied. This is unfriendly, like programs that use scanf for interactive user input (it's the same problem: scanf is a parser that doesn't return until it is satisfied).
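
The file-reading loop described here could look roughly like the sketch below. It assumes a hypothetical wrapper, parse_one, that calls yyparse once and reports whether it got a statement, an empty input, or an error; how the wrapper tells these apart (for instance by inspecting parseValue) is left open. fake_parse_one is only a stand-in so that the loop can be exercised without a generated parser.

```c
#include <assert.h>
#include <stddef.h>

/* Outcome of one parser call, as this sketch assumes the driver can see it. */
enum parse_result { PARSE_STATEMENT, PARSE_EMPTY, PARSE_ERROR };

/* Hypothetical wrapper type: one call parses one top level statement. */
typedef enum parse_result (*parse_one_fn)(void *state);

/* Calls parse_one until it reports empty input.
 * Returns the number of statements parsed, or -1 on the first error. */
int parse_all(parse_one_fn parse_one, void *state)
{
    int statements = 0;
    assert(parse_one != NULL);
    for (;;) {
        enum parse_result r = parse_one(state);
        if (r == PARSE_EMPTY)
            return statements;
        if (r == PARSE_ERROR)
            return -1;
        statements++;
    }
}

/* Stand-in parser: pretends the input still holds *state statements. */
enum parse_result fake_parse_one(void *state)
{
    int *remaining = state;
    if (*remaining == 0)
        return PARSE_EMPTY;
    (*remaining)--;
    return PARSE_STATEMENT;
}
```

Stopping at the first error is a design choice made for this sketch; a real loader might prefer to report the error and continue with the next statement.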

Another possibility is to have an interactive mode which performs its own user input rather than calling yyparse to get the input and parse it. In this mode, you read the user's input into a line buffer. Then you have the parser process the line buffer. To process a line buffer instead of a FILE * stream is perfectly possible. You just have to write custom input handling (your own definition of the YY_INPUT macro). This is the approach you will end up needing anyway if you implement a decent interactive mode with line editing and history recall, e.g. using libedit or GNU readline.
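
As a rough sketch of the line buffer approach: the helper below hands the scanner the unread part of a line in chunks and reports 0 bytes once the line is exhausted, which flex treats as end of input. The names line_buf, set_input_line and read_line_input are made up for this example; only the YY_INPUT(buf, result, max_size) macro itself comes from flex.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical line buffer that the prompt loop fills before calling yyparse(). */
static const char *line_buf = "";
static size_t line_len = 0;
static size_t line_pos = 0;

/* Point the scanner's input at a freshly read line. */
void set_input_line(const char *line)
{
    assert(line != NULL);
    line_buf = line;
    line_len = strlen(line);
    line_pos = 0;
}

/* Copy up to max_size unread bytes into buf and return how many were copied.
 * A return value of 0 signals end of input to the scanner. */
size_t read_line_input(char *buf, size_t max_size)
{
    size_t n = line_len - line_pos;
    assert(buf != NULL);
    if (n > max_size)
        n = max_size;
    memcpy(buf, line_buf + line_pos, n);
    line_pos += n;
    return n;
}

/* In the definitions section of scanner.l this could be wired up as:
 *
 *   #define YY_INPUT(buf, result, max_size) \
 *       ((result) = read_line_input((buf), (max_size)))
 */
```

The prompt loop would then read a line (with fgets, libedit or readline), pass it to set_input_line and call yyparse. Since the scanner has seen end of input after each line, it will probably need a call such as yyrestart(yyin) before the next line is scanned.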
