Parsing Fortran-style .op. operators


Problem description

I'm trying to write an ANTLR4 grammar for a Fortran-inspired DSL. I'm having difficulty with the good ol' classic ".op." operators:

if (1.and.1) then

where both "1"s should be interpreted as integers. I looked at the OpenFortranParser for insight, but I can't make sense of it.

Initially, I had suitable definitions for INTEGER and REAL in my lexer. Consequently, the first "1" above always lexed as a REAL ("1."), no matter what I tried. I then moved the numeric rules into the parser, and got to the point where I could reliably recognize the ".and." along with the numbers around it as INTEGER or REAL, as appropriate.

if (1.and.1)   # INT/INT
if (1..and..1) # REAL/REAL

... and so on ...

I of course want to recognize variable names in such statements:

if (a.and.b)

and have an appropriate rule for ID. In the small grammar below, however, any literal that appears in quotes (e.g. 'and', 'if', and all the single-character numeric suffixes) is not accepted as an ID, and I get an error; any other ID-conforming string is accepted:

if (a.and.b)  # errs, as 'b' is valid INTEGER suffix
if (a.and.c)  # OK

Any insights into this behavior, or better suggestions on how to parse the .op. operators in Fortran, would be greatly appreciated. Thanks!

grammar Foo;

start  : ('if' expr | ID)+ ;

DOT : '.' ;

DIGITS: [0-9]+;

ID : [a-zA-Z0-9][a-zA-Z0-9_]* ;

andOp : DOT 'and' DOT ;

SIGN : [+-];

expr     
    : ID
    | expr andOp expr
    | numeric
    | '(' expr ')'
    ;

integer : DIGITS ('q'|'Q'|'l'|'L'|'h'|'H'|'b'|'B'|'i'|'I')? ;

real    
    : DIGITS DOT DIGITS? (('e'|'E') SIGN? DIGITS)? ('d' | 'D')?
    |        DOT DIGITS  (('e'|'E') SIGN? DIGITS)? ('d' | 'D')?
    ;

numeric : integer | real;

EOLN  : '\r'? '\n' -> skip;

WS    :  [ \t]+ -> skip;   

Recommended answer

To disambiguate DOT, add a lexer rule with a predicate just before the DOT rule.

DIT : DOT { isDIT() }? ;
DOT : '.' ;

Change andOp to use it:

andOp : DIT 'and' DIT ;

Then add the predicate method:

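@lexer::members {

// True when the '.' at the current token start opens or closes ".and.".
public boolean isDIT() {
    int offset = _tokenStartCharIndex;
    // Guard the look-behind: fewer than four characters may precede the dot.
    String r = offset >= 4 ? _input.getText(Interval.of(offset - 4, offset)) : "";
    String s = _input.getText(Interval.of(offset, offset + 4));
    return ".and.".equals(s) || ".and.".equals(r);
}

}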
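To see what the predicate decides for each dot without generating a lexer, the same look-ahead/look-behind check can be sketched on a plain string. This is only an illustration: the class `DotLookaround` and the method `isOperatorDot` are hypothetical names, not part of ANTLR, and the operator is hard-coded to ".and." as in the predicate above.

```java
public class DotLookaround {
    // Hypothetical helper (not an ANTLR API): true when the '.' at index pos
    // is the opening or the closing dot of ".and." in text.
    static boolean isOperatorDot(String text, int pos) {
        final String op = ".and.";
        if (pos < 0 || pos >= text.length() || text.charAt(pos) != '.') {
            return false;
        }
        // Look ahead: the dot opens the operator.
        boolean opens = text.regionMatches(pos, op, 0, op.length());
        // Look behind: the dot closes the operator.
        int start = pos - (op.length() - 1);
        boolean closes = start >= 0 && text.regionMatches(start, op, 0, op.length());
        return opens || closes;
    }

    public static void main(String[] args) {
        String line = "if (1.and.1.5)";
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) == '.') {
                // DIT = operator dot, DOT = ordinary (e.g. decimal) dot
                System.out.println(i + " -> " + (isOperatorDot(line, i) ? "DIT" : "DOT"));
            }
        }
    }
}
```

For the sample line, the two dots around "and" are classified as DIT and the dot in "1.5" as DOT. A real grammar would need the same check for every supported operator (.or., .not., and so on), which is why a table of operator names would scale better than one hard-coded string.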

But that is not really the source of your current problem. The integer parser rule effectively defines lexer constants outside of the lexer: each quoted literal in a parser rule, such as 'b', becomes an implicit token of its own, which is why 'b' is not matched as an ID.

Change it to

integer : INT ;

INT:  DIGITS ('q'|'Q'|'l'|'L'|'h'|'H'|'b'|'B'|'i'|'I')? ;

and the lexer will figure out the rest.
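The reason 'b' never reaches ID can be illustrated with a toy model of the lexer's tie-breaking: the longest match wins, and on equal length the rule defined first wins. The sketch below is a simulation, not ANTLR itself; `TokenPriorityDemo` and `tokenTypeAt` are hypothetical names, and the rule table assumes that literals quoted in parser rules are placed ahead of the named lexer rules.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenPriorityDemo {
    // Rules in priority order. Assumption: parser-rule literals such as 'b'
    // and 'and' act as implicit tokens that precede the named lexer rules.
    static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("'b'", Pattern.compile("b"));
        RULES.put("'and'", Pattern.compile("and"));
        RULES.put("DIGITS", Pattern.compile("[0-9]+"));
        RULES.put("ID", Pattern.compile("[a-zA-Z0-9][a-zA-Z0-9_]*"));
    }

    // Returns the name of the rule that wins at pos: longest match first,
    // earliest rule on ties. Returns null when no rule matches.
    static String tokenTypeAt(String input, int pos) {
        String best = null;
        int bestLen = 0;
        for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
            Matcher m = rule.getValue().matcher(input);
            m.region(pos, input.length());
            // Strictly greater: a later rule never displaces an equal-length match.
            if (m.lookingAt() && m.end() - pos > bestLen) {
                bestLen = m.end() - pos;
                best = rule.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        for (String word : new String[] {"b", "c", "and", "band"}) {
            System.out.println(word + " -> " + tokenTypeAt(word, 0));
        }
    }
}
```

In this model "b" and "and" come out as literal tokens, while "c" and "band" come out as ID, which mirrors the reported behavior: `a.and.c` is accepted but `a.and.b` is not.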
