Semantic type checking analysis in Bison

Question

I've been trying to find examples everywhere, but in vain.

I am trying to write a basic Ruby interpreter. For this, I wrote a flex lexer file containing the token recognition rules, and a grammar file.

I want my grammar to include semantic type checking.

My grammar file contains, for example:

arg : arg '+' arg 

This should be a valid rule for both integers and floats.

According to what I've read, I can specify a type for a non-terminal such as arg, like so:

%type <intval> arg

where "intval" is a member of the %union and corresponds to the C type int.

But this only works for integers; I am not sure how to make the rule valid for, say, floats. I thought about having two different rules, one for ints and one for floats, like:

argint : argint '+' argint
argfloat : argfloat '+' argfloat

but I am sure there is a much, much better way of doing so, since this atrocity would also require extra rules to allow additions between floats and ints.

All the examples I've found have only one type (usually integers, in calculator-like examples).

How can I specify that a rule such as addition can take both ints and floats as arguments?

Thank you very much.

Recommended answer

This isn't the answer you're hoping for. I think the reason you haven't seen examples of what you want is that it's impractical to enforce typing rules in the grammar file (the .y); rather, developers accomplish this in procedural .c or .cpp code. Generally, you will have to do some analysis of the parsed input anyway, so enforcing the semantic rules is a byproduct of that analysis.
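To make the "procedural code" idea concrete, here is a minimal sketch in C. It assumes the semantic value of arg is a small tagged struct rather than a bare int. The names Value, make_int, make_float, to_double and value_add are hypothetical and do not come from the question. With a value like this, a single addition rule can serve both ints and floats, because the helper that the action calls decides the result type.

```c
/* Hypothetical tagged value: the tag records the runtime type that the
   grammar itself does not distinguish. */
typedef enum { VAL_INT, VAL_FLOAT } ValueType;

typedef struct {
    ValueType type;
    union {
        long   i;   /* valid when type == VAL_INT   */
        double f;   /* valid when type == VAL_FLOAT */
    } as;
} Value;

static Value make_int(long i)     { Value v; v.type = VAL_INT;   v.as.i = i; return v; }
static Value make_float(double f) { Value v; v.type = VAL_FLOAT; v.as.f = f; return v; }

/* Convert either kind of number to double for mixed arithmetic. */
static double to_double(Value v) {
    return v.type == VAL_INT ? (double)v.as.i : v.as.f;
}

/* Addition for a single `arg : arg '+' arg` rule: int + int stays int,
   anything involving a float is promoted to float. */
static Value value_add(Value a, Value b) {
    if (a.type == VAL_INT && b.type == VAL_INT)
        return make_int(a.as.i + b.as.i);
    return make_float(to_double(a) + to_double(b));
}
```

In a Bison file, one way to wire this up would be to make Value a member of the %union (for example `%union { Value val; }` with `%type <val> arg`) and write the action as `$$ = value_add($1, $3);`. The same helper is where a type error, such as adding a string to a number, could be reported.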

As an aside, I don't quite understand how you are parsing expressions, given the fragment of your grammar that you reproduce in your question.

Here's why I claim that it's impractical. (1) Your type information has to percolate all through the non-terminals of the grammar. (2) Worse, it has to be reflected in variable names.

Consider this toy example of parsing simple assignment statements that can use identifiers, numeric constants, and the four desk calculator operators. The NUMBER token can be an integer like 42 or a float like 3.14. And let's say that an IDENTIFIER is one letter, A-Z.

%token IDENTIFIER NUMBER

%%

stmt : IDENTIFIER '=' expr
     ;

expr : expr '+' term
     | expr '-' term
     | term
     ;

term : term '*' factor
     | term '/' factor
     | factor
     ;

factor : '(' expr ')'
       | '-' factor
       | NUMBER
       | IDENTIFIER
       ;

Now let's try to introduce typing rules. We'll separate the NUMBER token into FLT_NUMBER and INT_NUMBER. Our expr, term, and factor non-terminals split into two as well:

%token IDENTIFIER FLT_NUMBER INT_NUMBER

%%

stmt : IDENTIFIER '=' int_expr
     | IDENTIFIER '=' flt_expr
     ;

int_expr : int_expr '+' int_term
         | int_expr '-' int_term
         | int_term
         ;

flt_expr : flt_expr '+' flt_term
         | flt_expr '-' flt_term
         | flt_term
         ;

int_term : int_term '*' int_factor
         | int_term '/' int_factor
         | int_factor
         ;

flt_term : flt_term '*' flt_factor
         | flt_term '/' flt_factor
         | flt_factor
         ;

int_factor : '(' int_expr ')'
           | '-' int_factor
           | INT_NUMBER
           | int_identifier
           ;

flt_factor : '(' flt_expr ')'
           | '-' flt_factor
           | FLT_NUMBER
           | flt_identifier
           ;

int_identifier : IDENTIFIER ;

flt_identifier : IDENTIFIER ;

As our grammar stands at this point, there's a reduce/reduce conflict: the parser can't tell whether to recognize an IDENTIFIER as an int_identifier or a flt_identifier. So it doesn't know whether to reduce A = B as IDENTIFIER = int_expr or IDENTIFIER = flt_expr.
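The conflict exists because an IDENTIFIER token carries no type; that information is only known once an assignment has been seen. A procedural alternative, sketched below under the assumption that the original untyped grammar is kept, is to record each identifier's type in a small table and consult it from the actions. The names NumType, sym_set_type and sym_get_type are hypothetical, and the table has 26 slots only because the toy example limits identifiers to A-Z.

```c
#include <assert.h>

/* TYPE_UNSET is 0 so that a zero-initialized table means "never assigned". */
typedef enum { TYPE_UNSET, TYPE_INT, TYPE_FLOAT } NumType;

/* One slot per one-letter identifier A-Z, as in the toy grammar. */
static NumType symbol_types[26];

/* Record the type of the expression assigned to `name`. */
static void sym_set_type(char name, NumType type) {
    assert(name >= 'A' && name <= 'Z');
    symbol_types[name - 'A'] = type;
}

/* Look up the type of `name`; TYPE_UNSET means it was never assigned. */
static NumType sym_get_type(char name) {
    assert(name >= 'A' && name <= 'Z');
    return symbol_types[name - 'A'];
}
```

The action for `stmt : IDENTIFIER '=' expr` would call sym_set_type with the type computed for expr, and the action for `factor : IDENTIFIER` would call sym_get_type, so the grammar needs only one IDENTIFIER rule and has no conflict to resolve.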

(Here's where my understanding of Ruby is a little soft:) Ruby (like most languages) doesn't provide a way at the lexical level to determine the numeric type of an identifier. Contrast this with old-school BASIC, where A denotes a number and A$ denotes a string. In other words, if you invented a language where, say, A# denotes an integer and A@ denotes a float, then you could make this work.
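For the invented language with sigils, the scanner could pick the token kind from the last character of the lexeme, which is what would remove the ambiguity. The sketch below is only an illustration of that idea; IdentKind, its enumerators and classify_identifier are hypothetical names, and the `#` and `@` sigils are the ones made up in the paragraph above.

```c
#include <string.h>

/* Hypothetical token kinds for a language with type sigils. */
typedef enum { TOK_INT_IDENTIFIER, TOK_FLT_IDENTIFIER, TOK_UNKNOWN } IdentKind;

/* Classify an identifier lexeme by its trailing sigil:
   "A#" is an integer identifier, "A@" is a float identifier. */
static IdentKind classify_identifier(const char *lexeme) {
    size_t len = strlen(lexeme);
    if (len < 2)
        return TOK_UNKNOWN;   /* a name with no sigil carries no type */
    switch (lexeme[len - 1]) {
    case '#': return TOK_INT_IDENTIFIER;
    case '@': return TOK_FLT_IDENTIFIER;
    default:  return TOK_UNKNOWN;
    }
}
```

With two distinct tokens coming from the scanner, int_identifier and flt_identifier would no longer both derive from the same IDENTIFIER token.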

If you wanted to permit limited mixed-type expressions, like int_term '*' flt_factor, then your grammar would get even more complicated.

There might be ways to work around these issues. A parser built with technology other than yacc/bison might make it easier. At the least, perhaps my sketch will give you some ideas to pursue further.
