Erratic parser. Same grammar, same input, cycles through different results. What am I missing?

Question

I'm writing a basic parser that reads from stdin and prints results to stdout. The problem is that I'm having trouble with this grammar:

%token WORD NUM TERM

%%
stmt: /* empty */
    | word word   term { printf("[stmt]\n"); }
    | word number term { printf("[stmt]\n"); }
    | word   term
    | number term
    ;
word: WORD  { printf("[word]\n"); }
    ;
number: NUM { printf("[number]\n"); }
      ;
term: TERM { printf("[term]\n"); /* \n */}
      ;

%%

When I run the program and type hello world\n, the output is (as I expected) [word] [word] [term] [stmt]. So far, so good, but if I then type hello world\n again, I get: syntax error [word] [term].
When I type hello world\n a third time it works, then it fails again, then it works, and so on and so forth.

Am I missing something obvious here?

(I have some experience with hand-rolled compilers, but I have not used lex/yacc et al.)

Here is the main function:

int main() {
    do {
        yyparse();
    } while(!feof(yyin));

    return 0;
}

Any help would be appreciated. Thanks!

Recommended answer

Your grammar recognises a single stmt. Yacc/bison expect the grammar to describe the entire input, so after the statement is recognised, the parser waits for an end-of-input indication. It doesn't get one, because you typed a second statement, and that causes the parser to report a syntax error. Note, however, that by then it has already read the first token of the second line.

You are calling yyparse() in a loop and not stopping when it returns a syntax-error status. So when you call yyparse() again, it continues where the last call left off, which is just before the second token of the second line. What remains of that line is a single word followed by the terminator, which it then parses correctly.
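
To make the point about the return value concrete, here is a minimal C sketch of a loop that stops at the first failed parse. yyparse() returns 0 on success and a non-zero value on failure. Everything named below is an assumption made for illustration: parse_until_failure, fake_parse and always_more are hypothetical names, and the parser is passed in as a function pointer only so that the snippet compiles without a generated parser.

```c
/* Sketch only: parse_until_failure, fake_parse and always_more are
 * hypothetical names, not part of Bison or yacc. */

/* Runs parse_once repeatedly, like the do/while loop around yyparse(),
 * but stops as soon as an attempt reports failure (non-zero). */
int parse_until_failure(int (*parse_once)(void), int (*more_input)(void))
{
    int status;
    do {
        status = parse_once();
    } while (status == 0 && more_input());
    return status;
}

/* Stand-in for yyparse(): succeeds once, then reports a syntax error (1). */
int fake_calls = 0;
int fake_parse(void) { return ++fake_calls < 2 ? 0 : 1; }

/* Stand-in for !feof(yyin): pretends there is always more input. */
int always_more(void) { return 1; }
```

In the real program you would presumably pass yyparse itself as the first argument and a small wrapper around !feof(yyin) as the second.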

What you probably should do is write your parser so that it accepts any number of statements, and perhaps so that it does not die when it hits an error. That would look something like this:

%%
prog: %empty
    | prog line
line: stmt '\n'    { puts("Got a statement"); }
    | error '\n'   { yyerrok; /* Simple error recovery */ }
...
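
For readers more used to hand-written parsers, the effect of the error '\n' rule can be approximated by skipping to the end of the offending line and carrying on with the next one. The C sketch below is an analogy only, not what Bison generates. The token codes, line_counts, is_stmt and check_lines are all hypothetical names, and it assumes a statement is one of the shapes in the question's grammar followed by a terminator.

```c
#include <stddef.h>

/* Hypothetical token codes for this sketch; T_TERM plays the role of '\n'. */
enum token { T_END, T_WORD, T_NUM, T_TERM };

struct line_counts { int ok; int errors; };

/* Returns 1 if the n tokens at s form one of the statement shapes from the
 * question's grammar (without the terminator): empty, WORD, NUM,
 * WORD WORD or WORD NUM. */
static int is_stmt(const enum token *s, size_t n)
{
    if (n == 0) return 1;
    if (n == 1) return s[0] == T_WORD || s[0] == T_NUM;
    if (n == 2) return s[0] == T_WORD && (s[1] == T_WORD || s[1] == T_NUM);
    return 0;
}

/* Checks a T_END-terminated token array line by line. A bad line is counted
 * and skipped up to its terminator, so later lines are still checked. */
struct line_counts check_lines(const enum token *t)
{
    struct line_counts c = {0, 0};
    while (*t != T_END) {
        const enum token *start = t;
        while (*t != T_TERM && *t != T_END)
            t++;                          /* scan to the end of this line */
        if (*t == T_TERM && is_stmt(start, (size_t)(t - start)))
            c.ok++;
        else
            c.errors++;
        if (*t == T_TERM)
            t++;                          /* resume after the terminator */
    }
    return c;
}
```

The design point is the same as in the grammar above: one malformed line costs one error and does not affect how the following lines are read.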

Note that I print a message for a statement only after I know that the line was parsed correctly. That usually turns out to be less confusing. The best solution, though, is not to use printf calls at all but to use Bison's trace facility, which is as simple as passing -t on the bison command line and setting the global variable yydebug = 1;. See "Tracing Your Parser" in the Bison manual.
