为什么会出现这种野牛code产生无法预料的输出? [英] Why does this bison code produce unexpected output?

查看:172
本文介绍了为什么会出现这种野牛code产生无法预料的输出?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

柔性code:

  1 %option noyywrap nodefault yylineno case-insensitive
  2 %{
  3 #include "stdio.h"
  4 #include "tp.tab.h"
  5 %}
  6 
  7 %%
  8 "{"             {return '{';}
  9 "}"             {return '}';}
 10 ";"             {return ';';}
 11 "create"        {return CREATE;}
 12 "cmd"           {return CMD;}
 13 "int"           {yylval.intval = 20;return INT;}
 14 [a-zA-Z]+       {yylval.strval = yytext;printf("id:%s\n" , yylval.strval);return ID;}
 15 [ \t\n]
 16 <<EOF>>         {return 0;}
 17 .               {printf("mistery char\n");}
 18 

野牛code:

  1 %{
  2 #include "stdlib.h"
  3 #include "stdio.h"
  4 #include "stdarg.h"
  5 void yyerror(char *s, ...);
  6 #define YYDEBUG 1
  7 int yydebug = 1;
  8 %}
  9 
 10 %union{
 11     char *strval;
 12     int intval;
 13 }
 14 
 15 %token <strval> ID
 16 %token <intval> INT
 17 %token CREATE
 18 %token CMD
 19 
 20 %type <strval> col_definition
 21 %type <intval> create_type
 22 %start stmt_list
 23 
 24 %%
 25 stmt_list:stmt ';'
 26 | stmt_list stmt ';'
 27 ;
 28 
 29 stmt:create_cmd_stmt         {/*printf("create cmd\n");*/}
 30 ;
 31 
 32 create_cmd_stmt:CREATE CMD ID'{'create_col_list'}'    {printf("%s\n" , $3);}
 33 ;
 34 create_col_list:col_definition
 35 | create_col_list col_definition
 36 ;
 37 
 38 col_definition:create_type ID ';' {printf("%d , %s\n" , $1, $2);}
 39 ;
 40 
 41 create_type:INT {$$ = $1;}
 42 ;
 43 
 44 %%
 45 extern FILE *yyin;
 46 
 47 void
 48 yyerror(char *s, ...)
 49 {
 50     extern yylineno;
 51     va_list ap;
 52     va_start(ap, s);
 53     fprintf(stderr, "%d: error: ", yylineno);
 54     vfprintf(stderr, s, ap);
 55     fprintf(stderr, "\n");
 56 }
 57 
 58 int main(int argc , char *argv[])
 59 {
 60     yyin = fopen(argv[1] , "r");
 61     if(!yyin){
 62         printf("open file %s failed\n" ,argv[1]);
 63         return -1;
 64     }
 65 
 66     if(!yyparse()){
 67         printf("parse work!\n");
 68     }else{
 69         printf("parse failed!\n");
 70     }
 71 
 72     fclose(yyin);
 73     return 0;
 74 }
 75

测试输入文件:

create cmd keeplive
{
    int a;
    int b;
};

测试输出:

root@VM-Ubuntu203001:~/test/tpp# ./a.out t1.tp 
id:keeplive
id:a
20 , a;
id:b
20 , b;
keeplive
{
    int a;
    int b;
}
parse work!

我有两个问题:

1)为什么在行的动作38打印标记';'?例如,20,;和20,B;

1) Why does the action at line 38 print the token ';'? For instance, "20 , a;" and "20 , b;"

2)为什么在32行打印不动作keeplive
    {
        int类型的;
        INT B:
    },而不是简单的keeplive?

2) Why does the action at line 32 print "keeplive { int a; int b; }" instead of simply "keeplive"?

推荐答案

简短的回答:

yylval.strval = yytext;

您不能使用 yytext中这样。它所指向的字符串是专用于词法分析器,并会尽快弯曲动作完成更改。你需要做的是这样的:

You can't use yytext like that. The string it points to is private to the lexer and will change as soon as the flex action finishes. You need to do something like:

yylval.strval = strdup(yytext);

,然后你需要确保你算账释放内存。

and then you need to make sure you free the memory afterwards.

再回应:

yytext中实际上是一个指针到包含输入缓冲区。为了使yytext中工作,就好像是一个NULL结尾的字符串时,弯曲框架覆盖下使用 NUL标记字符它做动作之前,然后替换原来的字符。因此,的strdup 做工精细的动作在里面,但动作(在你的野牛code)外,你现在有一个指向缓冲区的部分先从令牌。而且还有更糟糕的版本,因为弯曲将读取源的下一部分到相同的缓冲,现在你的指针是随机的垃圾。有几种可能的方案中,根据弯曲选项,但他们都不是pretty。

yytext is actually a pointer into the buffer containing the input. In order to make yytext work as though it were a NUL-terminated string, the flex framework overwrites the character following the token with a NUL before it does the action, and then replaces the original character when the action terminates. So strdup will work fine inside the action, but outside the action (in your bison code), you now have a pointer to the part of the buffer starting with the token. And it gets worse later, since flex will read the next part of the source into the same buffer, and now your pointer is to random garbage. There are several possible scenarios, depending on flex options, but none of them are pretty.

因此​​黄金法则: yytext中才有效,直至行动结束。如果你想保持它,复制它,然后确保你免费为副本存储,当你不再需要它。

So the golden rule: yytext is only valid until the end of the action. If you want to keep it, copy it, and then make sure you free the storage for the copy when you no longer need it.

在几乎所有我写的词法分析器,该ID令牌居然发现一个符号表的标识符(或把它放在这里),并返回一个指针到符号表中,从而简化了内存管理。但你仍然有基本相同的内存管理的问题与,例如,字符串文字。

In almost all the lexers I've written, the ID token actually finds the identifier in a symbol table (or puts it there) and returns a pointer into the symbol table, which simplifies memory management. But you still have essentially the same memory management issue with, for example, character string literals.

这篇关于为什么会出现这种野牛code产生无法预料的输出?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆