ANTLR4: how to build a grammar that allows keywords as identifiers


Question

This is the demo code:

label:
var id
let id = 10
goto label

If keywords are allowed as identifiers, it becomes:

let:
var var
let var = 10
goto let

This is totally legal code, but it seems very hard to do in ANTLR.

AFAIK, once ANTLR's lexer matches the token let, it never falls back to the ID token. So ANTLR sees:

LET_TOKEN :
VAR_TOKEN <missing ID_TOKEN>VAR_TOKEN
LET_TOKEN <missing ID_TOKEN>VAR_TOKEN = 10

ANTLR does allow predicates, but then I have to control every token match myself, which is problematic. The grammar becomes this:

grammar Demo;
options {
  language = Go;
}
@parser::members{
    var _need = map[string]bool{}
    func skip(name string,v bool){
        _need[name] = !v
        fmt.Println("SKIP",name,v)
    }
    func need(name string)bool{
        fmt.Println("NEED",name,_need[name])
        return _need[name]
    }
}

proj@init{skip("inst",false)}: (line? NL)* EOF;
line
    : VAR ID
    | LET ID EQ? Integer
    ;

NL: '\n';
VAR: {need("inst")}? 'var' {skip("inst",true)};
LET: {need("inst")}? 'let' {skip("inst",true)};
EQ: '=';

ID: ([a-zA-Z] [a-zA-Z0-9]*);
Integer: [0-9]+;

WS: [ \t] -> skip;

It looks terrible.

But this is easy with a PEG. Try this in pegjs:

Expression = (Line? _ '\n')* ;

Line
  = 'var' _ ID
  / 'let' _ ID _ "=" _ Integer

Integer "integer"
  = [0-9]+ { return parseInt(text(), 10); }

ID = [a-zA-Z] [a-zA-Z0-9]*

_ "whitespace"
  = [ \t]*

I have actually done this in peggo and javacc.

My question is how to handle such grammars in ANTLR 4.6. I was so excited about the ANTLR 4.6 Go target, but it seems I chose the wrong tool for my grammar?

Recommended answer

The simplest way is to define a parser rule for identifiers:

id: ID | VAR | LET;

VAR: 'var';
LET: 'let';
ID: [a-zA-Z] [a-zA-Z0-9]*;

Then use id instead of ID in your parser rules.
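To see why this works, here is a minimal hand-written sketch in Go. It is not ANTLR-generated code. It only mimics the idea: the lexer still emits VAR and LET for those words everywhere, and a parser-level check standing in for the id rule accepts them wherever an identifier is expected. The names lex, isID and parseLine are hypothetical, labels and goto are left out (as in the question's grammar), and tokens are assumed to be separated by whitespace to keep the lexer short.

```go
package main

import (
	"fmt"
	"strings"
)

// Token kinds, mirroring the lexer rules VAR, LET, ID, Integer and EQ.
type kind int

const (
	VAR kind = iota
	LET
	ID
	INT
	EQ
)

type token struct {
	kind kind
	text string
}

func isLetter(r rune) bool { return (r >= 'a' && r <= 'z') || (r >= 'A' && r <= 'Z') }
func isDigit(r rune) bool  { return r >= '0' && r <= '9' }

// isInt matches [0-9]+.
func isInt(w string) bool {
	for _, r := range w {
		if !isDigit(r) {
			return false
		}
	}
	return w != ""
}

// isIdent matches [a-zA-Z][a-zA-Z0-9]*.
func isIdent(w string) bool {
	for i, r := range w {
		if !isLetter(r) && (i == 0 || !isDigit(r)) {
			return false
		}
	}
	return w != ""
}

// lex always emits VAR/LET for "var"/"let", whatever their position,
// which is how the keyword lexer rules behave in the answer's grammar.
// Simplification (assumption): tokens are separated by whitespace.
func lex(line string) ([]token, error) {
	var toks []token
	for _, w := range strings.Fields(line) {
		switch {
		case w == "var":
			toks = append(toks, token{VAR, w})
		case w == "let":
			toks = append(toks, token{LET, w})
		case w == "=":
			toks = append(toks, token{EQ, w})
		case isInt(w):
			toks = append(toks, token{INT, w})
		case isIdent(w):
			toks = append(toks, token{ID, w})
		default:
			return nil, fmt.Errorf("unexpected input %q", w)
		}
	}
	return toks, nil
}

// isID mirrors the parser rule  id: ID | VAR | LET;
func isID(t token) bool { return t.kind == ID || t.kind == VAR || t.kind == LET }

// parseLine mirrors  line: VAR id | LET id EQ? Integer;
func parseLine(line string) bool {
	toks, err := lex(line)
	if err != nil || len(toks) < 2 || !isID(toks[1]) {
		return false
	}
	rest := toks[2:]
	switch toks[0].kind {
	case VAR:
		return len(rest) == 0
	case LET:
		if len(rest) > 0 && rest[0].kind == EQ {
			rest = rest[1:] // EQ is optional
		}
		return len(rest) == 1 && rest[0].kind == INT
	}
	return false
}

func main() {
	for _, l := range []string{"var id", "var var", "let var = 10", "var 10"} {
		fmt.Printf("%q accepted=%v\n", l, parseLine(l))
	}
}
```

The point of the design is that the lexer stays context-free: only the parser knows that the second token of a line is in identifier position, so that is where keyword tokens get accepted as names.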

A different way is to use ID for both identifiers and keywords, and use predicates for disambiguation. But it's less readable, so I'd use the first way instead.
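For comparison, the second approach can be sketched by hand in Go as well. Again, this is an illustration under assumptions, not ANTLR output: every word is treated as a plain ID, and text comparisons in the parser play the role that semantic predicates would play in the grammar. The helper names are hypothetical, and tokens are assumed to be whitespace-separated.

```go
package main

import (
	"fmt"
	"strings"
)

// isWord matches [a-zA-Z][a-zA-Z0-9]*, i.e. the single ID rule.
func isWord(w string) bool {
	for i, r := range w {
		letter := (r >= 'a' && r <= 'z') || (r >= 'A' && r <= 'Z')
		digit := r >= '0' && r <= '9'
		if !letter && (i == 0 || !digit) {
			return false
		}
	}
	return w != ""
}

// isInt matches [0-9]+.
func isInt(w string) bool {
	for _, r := range w {
		if r < '0' || r > '9' {
			return false
		}
	}
	return w != ""
}

// parseLine has no keyword tokens at all. Whether a word acts as a keyword
// is decided by comparing its text at the position where a keyword is
// expected, which is what a predicate on ID would do.
func parseLine(line string) bool {
	w := strings.Fields(line)
	if len(w) < 2 || !isWord(w[1]) {
		return false
	}
	rest := w[2:]
	switch w[0] { // "predicate": is the first ID spelled var or let?
	case "var":
		return len(rest) == 0
	case "let":
		if len(rest) > 0 && rest[0] == "=" {
			rest = rest[1:] // "=" is optional
		}
		return len(rest) == 1 && isInt(rest[0])
	}
	return false
}

func main() {
	for _, l := range []string{"var var", "let let = 10", "foo bar"} {
		fmt.Printf("%q accepted=%v\n", l, parseLine(l))
	}
}
```

Here the keyword spellings are scattered through the parsing logic as string comparisons, which hints at why the answer calls this variant less readable.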
