Unintentional concatenation in Bison/Yacc grammar

Question

I am experimenting with lex and yacc and have run into a strange issue, but I think it would be best to show you my code before detailing the issue. This is my lexer:

%{
#include <stdlib.h>
#include <string.h>
#include "y.tab.h"
void yyerror(char *);
%}

%%

[a-zA-Z]+ {
  yylval.strV = yytext;
  return ID;
}

[0-9]+      {
  yylval.intV = atoi(yytext);
  return INTEGER;
}

[\n] { return *yytext; }

[ \t]        ;

. yyerror("invalid character");

%%

int yywrap(void) {
  return 1;
}

This is my parser:

%{
#include <stdio.h>

int yydebug=1;
void prompt();
void yyerror(char *);
int yylex(void);
%}

%union {
  int intV;
  char *strV;
}

%token INTEGER ID

%%

program: program statement EOF { prompt(); }
       | program EOF { prompt(); }
       | { prompt(); }
       ;

args: /* empty */
    | args ID { printf(":%s ", $<strV>2); }
    ;

statement: ID args { printf("%s", $<strV>1); }
         | INTEGER { printf("%d", $<intV>1); }
;

EOF: '\n'

%%

void yyerror(char *s) {
  fprintf(stderr, "%s\n", s);
}

void prompt() {
  printf("> ");
}

int main(void) {
  yyparse();
  return 0;
}

A very simple language, consisting of no more than strings and integers, and a basic REPL. Now, you'll note in the parser that args are output with a leading colon, the intention being that, when combined with the first pattern of the statement rule, the interaction with the REPL would look something like this:

> aaa aa a
:aa :a aaa>

However, the interaction is this:

> aaa aa a
:aa :a aaa aa aa
>

Why does the token ID in the following rule

statement: ID args { printf("%s", $<strV>1); }
         | INTEGER { printf("%d", $<intV>1); }
;

have the semantic value of the total input string, newline included? How can my grammar be reworked so that I get the interaction I intended?

Recommended answer

You have to preserve token strings as they are read if you want them to remain valid. I modified the statement rule to read:

statement: ID { printf("<%s> ", $<strV>1); } args { printf("%s", $<strV>1); }
         | INTEGER { printf("%d", $<intV>1); }
;

Then, with your input, I get the output:

> aaa aa a
<aaa> :aa :a aaa aa a
>

Note that at the time the initial ID is read, the token is exactly what you expected. But the lexer stores the yytext pointer itself (yylval.strV = yytext;) rather than a copy of the string, and yytext points into the scanner's input buffer. Because you did not preserve the token, the string has been modified by the time you get back to printing it after the args have been parsed.
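
To make the effect concrete, here is a minimal sketch in plain C of what is probably going on. It is a simplified model, not the real generated scanner: the Scanner struct and the functions scanner_init, next_word, and copy_token are hypothetical names invented for this illustration. The model assumes a flex-style scanner that keeps the input in one buffer and turns the current token into a C string by temporarily writing a '\0' after it, restoring the overwritten character on the next call.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Simplified, hypothetical model of a flex-style scanner. The whole input
   line lives in one buffer; the current token is made into a C string by
   temporarily overwriting the character after it with '\0'. */
typedef struct {
    char buf[64];    /* the input line */
    size_t pos;      /* index of the next unread character */
    char held;       /* character that was overwritten by '\0' */
    size_t held_at;  /* where it was overwritten */
    int holding;     /* nonzero while a '\0' is patched into buf */
} Scanner;

void scanner_init(Scanner *s, const char *input) {
    strncpy(s->buf, input, sizeof s->buf - 1);
    s->buf[sizeof s->buf - 1] = '\0';
    s->pos = 0;
    s->holding = 0;
}

/* Returns a pointer INTO the buffer (like yytext) for the next alphabetic
   token, or NULL when the next character is not a letter. */
char *next_word(Scanner *s) {
    if (s->holding) {                 /* undo the previous token's '\0' */
        s->buf[s->held_at] = s->held;
        s->holding = 0;
    }
    while (s->buf[s->pos] == ' ' || s->buf[s->pos] == '\t')
        s->pos++;
    if (!isalpha((unsigned char)s->buf[s->pos]))
        return NULL;
    size_t start = s->pos;
    while (isalpha((unsigned char)s->buf[s->pos]))
        s->pos++;
    s->held = s->buf[s->pos];         /* remember what we overwrite */
    s->held_at = s->pos;
    s->holding = 1;
    s->buf[s->pos] = '\0';            /* terminate the token in place */
    return s->buf + start;
}

/* Private copy of a token; plays the role strdup() would play in the lexer. */
char *copy_token(const char *text) {
    size_t n = strlen(text) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, text, n);
    return copy;
}
```

If you initialise the model with "aaa aa a\n", keep the pointer returned by the first next_word call, and then keep calling next_word until it returns NULL, the kept pointer ends up reading as the whole line, newline included, while a copy taken with copy_token right after the first call still reads "aaa". In the actual lexer, the corresponding change would presumably be to store a copy in the ID rule, for example yylval.strV = strdup(yytext);, and to free that copy in the parser action once it has been printed.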
