Parsing Fortran-style .op. operators


Problem description

I'm trying to write an ANTLR4 grammar for a Fortran-inspired DSL. I'm having difficulty with the good ol' classic ".op." operators:

if (1.and.1) then

where both "1"s should be interpreted as integers. I looked at the OpenFortranParser for insight, but I can't make sense of it.

Initially, I had suitable definitions for INTEGER and REAL in my lexer. Consequently, the first "1" above was always lexed as a REAL, no matter what I tried. I then moved the numeric rules into the parser, and got to the point where I could reliably recognize ".and." along with the numbers around it as INTEGER or REAL, as appropriate:

if (1.and.1)   # INT/INT
if (1..and..1) # REAL/REAL

...and so on...

Of course, I also want to recognize variable names in such statements:

if (a.and.b)

and have an appropriate rule for ID. In the small grammar below, however, any string that appears as a quoted literal in the grammar (e.g. 'and', 'if', or any of the single-character numeric suffixes) is not accepted as an ID, and I get an error; any other ID-conforming string is accepted:

if (a.and.b)  # errs, as 'b' is valid INTEGER suffix
if (a.and.c)  # OK

Any insights into this behavior, or better suggestions on how to parse the .op. operators in Fortran, would be greatly appreciated. Thanks!

grammar Foo;

start  : ('if' expr | ID)+ ;

DOT : '.' ;

DIGITS: [0-9]+;

ID : [a-zA-Z0-9][a-zA-Z0-9_]* ;

andOp : DOT 'and' DOT ;

SIGN : [+-];

expr     
    : ID
    | expr andOp expr
    | numeric
    | '(' expr ')'
    ;

integer : DIGITS ('q'|'Q'|'l'|'L'|'h'|'H'|'b'|'B'|'i'|'I')? ;

real    
    : DIGITS DOT DIGITS? (('e'|'E') SIGN? DIGITS)? ('d' | 'D')?
    |        DOT DIGITS  (('e'|'E') SIGN? DIGITS)? ('d' | 'D')?
    ;

numeric : integer | real;

EOLN  : '\r'? '\n' -> skip;

WS    :  [ \t]+ -> skip;   

Recommended answer

To disambiguate DOT, add a lexer rule with a predicate just before the DOT rule:

DIT : DOT { isDIT() }? ;
DOT : '.' ;

Change 'andOp' to use it:

andOp : DIT 'and' DIT ;

Then add the predicate method:

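@lexer::members {

// True when the '.' at the current token start opens or closes ".and.".
public boolean isDIT() {
    int offset = _tokenStartCharIndex;
    // Interval bounds are inclusive, so each window below spans 5 characters.
    String s = _input.getText(Interval.of(offset, offset + 4));
    if (".and.".equals(s)) {
        return true;
    }
    // Guard against a negative start index near the beginning of the input.
    if (offset < 4) {
        return false;
    }
    String r = _input.getText(Interval.of(offset - 4, offset));
    return ".and.".equals(r);
}

}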
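To see which dots the predicate would accept, the same window check can be tried on a plain string outside ANTLR. The following is only a sketch: `DotClassifier`, `isDit`, `window`, and `classify` are hypothetical names introduced here for illustration, and the code mimics the lookaround logic rather than calling the ANTLR runtime.

```java
// Standalone sketch (no ANTLR dependency): mirrors the isDIT() window check on a String.
final class DotClassifier {

    // True when the '.' at index `offset` begins or ends the 5-character text ".and.".
    static boolean isDit(String input, int offset) {
        return window(input, offset, offset + 5).equals(".and.")
            || window(input, offset - 4, offset + 1).equals(".and.");
    }

    // Clamped substring [from, to); returns "" when the start is out of range.
    private static String window(String s, int from, int to) {
        if (from < 0 || from >= s.length()) {
            return "";
        }
        return s.substring(from, Math.min(to, s.length()));
    }

    // Labels every '.' in the input as DIT or DOT, in order of appearance.
    static String classify(String input) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            if (input.charAt(i) == '.') {
                if (out.length() > 0) {
                    out.append(' ');
                }
                out.append(isDit(input, i) ? "DIT" : "DOT");
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(classify("1.and.1"));   // DIT DIT
        System.out.println(classify("1..and..1")); // DOT DIT DIT DOT
    }
}
```

In the second input the outer dots stay plain DOT tokens, so they remain available to the real rule, while the inner pair is claimed by the operator.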

However, that is not really the source of your current problem. The quoted literals in the integer parser rule effectively define lexer tokens outside of the lexer, which is why 'b' is not matched as an ID.

Change it to:

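integer : INT ;

// When matches tie in length, the lexer picks the rule listed first,
// so the position of INT relative to DIGITS and ID matters.
INT:  DIGITS ('q'|'Q'|'l'|'L'|'h'|'H'|'b'|'B'|'i'|'I')? ;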

and the lexer will figure out the rest.
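The effect can be illustrated with a toy model of the lexer's choice: the longest match wins, and on a tie the rule listed first wins. ANTLR treats quoted literals in parser rules as implicit tokens that rank ahead of the named lexer rules. The sketch below is an approximation using `java.util.regex`, not the ANTLR runtime; `TieBreakDemo` and its methods are hypothetical names, and placing INT before ID in the fixed rule set is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TieBreakDemo {

    // Name of the rule that wins at the start of `input`:
    // longest match first; for equal lengths, the rule listed earlier.
    static String winner(Map<String, String> rules, String input) {
        String best = null;
        int bestLen = 0;
        for (Map.Entry<String, String> rule : rules.entrySet()) {
            Matcher m = Pattern.compile(rule.getValue()).matcher(input);
            // Strict '>' keeps the earlier rule when lengths tie.
            if (m.lookingAt() && m.end() > bestLen) {
                best = rule.getKey();
                bestLen = m.end();
            }
        }
        return best;
    }

    // Original grammar: the quoted suffix 'b' acts as a token ranked ahead of ID.
    static Map<String, String> original() {
        Map<String, String> r = new LinkedHashMap<>();
        r.put("'b'", "b");
        r.put("DIGITS", "[0-9]+");
        r.put("ID", "[a-zA-Z0-9][a-zA-Z0-9_]*");
        return r;
    }

    // Suggested fix: the suffix lives inside INT, so no standalone 'b' token exists.
    // Assumption: INT is listed before ID.
    static Map<String, String> fixed() {
        Map<String, String> r = new LinkedHashMap<>();
        r.put("INT", "[0-9]+[qQlLhHbBiI]?");
        r.put("ID", "[a-zA-Z0-9][a-zA-Z0-9_]*");
        return r;
    }

    public static void main(String[] args) {
        System.out.println(winner(original(), "b")); // 'b'
        System.out.println(winner(original(), "c")); // ID
        System.out.println(winner(fixed(), "b"));    // ID
        System.out.println(winner(fixed(), "1b"));   // INT
    }
}
```

With the original rules, "b" ties between the implicit 'b' token and ID, and the implicit token wins, which matches the error reported for `if (a.and.b)`. Once the suffix is folded into INT, a lone "b" can only be an ID.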
