Writing a grammar rule name in Unicode [ANTLR 4]
Problem description
I am still a beginner with ANTLR 4, and I was wondering whether a grammar rule name can be written in Unicode. For example, the following rule works fine:
atomExp returns [double value]
: n=Number {$value = Double.parseDouble($n.text);}
| '(' exp=additionExp ')' {$value = $exp.value;}
;
However, suppose I want to write the same rule, but instead of naming it "atomExp" I want to name it with the Arabic word "تعبير":
تعبير returns [double value]
: n=Number {$value = Double.parseDouble($n.text);}
| '(' exp=additionExp ')' {$value = $exp.value;}
;
But when I write it that way, I get a "no viable alternative" error. Can someone help me solve this problem? Thanks in advance.
Recommended answer
If you look at the lexer grammar that ANTLR 4 uses for grammar files, you can see that lexer and parser rule names may contain certain Unicode characters:
/** Allow unicode rule/token names */
ID : NameStartChar NameChar*;
fragment
NameChar
: NameStartChar
| '0'..'9'
| '_'
| '\u00B7'
| '\u0300'..'\u036F'
| '\u203F'..'\u2040'
;
fragment
NameStartChar
: 'A'..'Z'
| 'a'..'z'
| '\u00C0'..'\u00D6'
| '\u00D8'..'\u00F6'
| '\u00F8'..'\u02FF'
| '\u0370'..'\u037D'
| '\u037F'..'\u1FFF'
| '\u200C'..'\u200D'
| '\u2070'..'\u218F'
| '\u2C00'..'\u2FEF'
| '\u3001'..'\uD7FF'
| '\uF900'..'\uFDCF'
| '\uFDF0'..'\uFFFD'
; // ignores | ['\u10000-'\uEFFFF] ;
INT : [0-9]+
;
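At first glance it might seem that your ID تعبير does not comply with the NameChar* part of the ID rule. However, its letters (U+062A, U+0639, U+0628, U+064A, U+0631) all lie inside the '\u037F'..'\u1FFF' range of NameStartChar, so by the ranges listed here the name should be accepted. If the "no viable alternative" error persists, one thing worth ruling out is the encoding of the grammar file: if the file is not read as UTF-8 (the ANTLR tool has an -encoding option for this), the Arabic letters may be decoded into characters that fall outside these ranges.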
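A quick way to test a candidate rule name against the ranges in the ID rule is to transcribe them into a small check. The following is only a sketch: the class AntlrNameCheck and its methods are hypothetical helpers written for this illustration, not part of ANTLR, and the tables simply copy the NameStartChar and NameChar fragments quoted above.

```java
final class AntlrNameCheck {
    // Inclusive code point ranges copied from the NameStartChar fragment.
    private static final int[][] NAME_START_RANGES = {
        {'A', 'Z'}, {'a', 'z'},
        {0x00C0, 0x00D6}, {0x00D8, 0x00F6}, {0x00F8, 0x02FF},
        {0x0370, 0x037D}, {0x037F, 0x1FFF}, {0x200C, 0x200D},
        {0x2070, 0x218F}, {0x2C00, 0x2FEF}, {0x3001, 0xD7FF},
        {0xF900, 0xFDCF}, {0xFDF0, 0xFFFD}
    };

    // Ranges that NameChar accepts in addition to NameStartChar.
    private static final int[][] NAME_EXTRA_RANGES = {
        {'0', '9'}, {'_', '_'}, {0x00B7, 0x00B7},
        {0x0300, 0x036F}, {0x203F, 0x2040}
    };

    private static boolean inRanges(int codePoint, int[][] ranges) {
        for (int[] range : ranges) {
            if (codePoint >= range[0] && codePoint <= range[1]) {
                return true;
            }
        }
        return false;
    }

    static boolean isNameStartChar(int codePoint) {
        return inRanges(codePoint, NAME_START_RANGES);
    }

    static boolean isNameChar(int codePoint) {
        return isNameStartChar(codePoint) || inRanges(codePoint, NAME_EXTRA_RANGES);
    }

    // Mirrors "ID : NameStartChar NameChar*" for a whole string.
    static boolean matchesId(String name) {
        int[] codePoints = name.codePoints().toArray();
        if (codePoints.length == 0 || !isNameStartChar(codePoints[0])) {
            return false;
        }
        for (int i = 1; i < codePoints.length; i++) {
            if (!isNameChar(codePoints[i])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The Arabic name from the question, written with escapes so the
        // result does not depend on the encoding of this source file.
        String arabicName = "\u062A\u0639\u0628\u064A\u0631";
        System.out.println("atomExp matches ID: " + matchesId("atomExp"));
        System.out.println("Arabic name matches ID: " + matchesId(arabicName));
        System.out.println("9abc matches ID: " + matchesId("9abc"));
    }
}
```

With the ranges as transcribed, matchesId returns true for both atomExp and the Arabic name, and false for a name that starts with a digit. Writing the Arabic word with \u escapes keeps the check independent of how the source file itself is encoded.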