Range quantifier syntax in ANTLR Regex
Question
This should be fairly simple. I'm working on a lexer grammar using ANTLR and want to limit the maximum length of variable identifiers to 30 characters. I attempted to accomplish this with the following line (following normal regex syntax, except for the '' thing):
ID : ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'0'..'9'|'_'){0,29} {System.out.println("IDENTIFIER FOUND.");}
;
Code generation reported no errors, but compilation failed because of a line in the generated code that was simply:
0,29
Evidently ANTLR takes the text between the braces and places it in the accept-state area along with the print statement. I searched the ANTLR site and found no example of, or reference to, an equivalent expression. What should the syntax of this expression be?
Recommended answer
ANTLR does not support the {m,n} quantifier syntax. ANTLR sees the {} of your quantifier and can't tell them apart from the {} that surround your actions, which is why the text 0,29 was copied into the generated code as if it were an action.
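Workarounds:
1. Enforce the limit semantically. Let the lexer gather an ID of unlimited size, then complain about it or truncate it, either in the action code or later in the compiler.
2. Create the quantification rules manually.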
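The semantic check in the first workaround could be sketched in Java, the language of the grammar's actions, roughly as follows. The class IdentifierLimit and its methods are hypothetical names introduced for illustration and are not part of ANTLR. The sketch assumes the 30-character limit from the question and that the matched token text is passed in from the ID rule's action or from a later compiler phase.

```java
// Hypothetical helper, not part of ANTLR: enforces a maximum identifier
// length after the lexer has matched an identifier of any length.
class IdentifierLimit {
    static final int MAX_ID_LENGTH = 30; // limit taken from the question

    // True when the matched text is within the limit.
    static boolean isWithinLimit(String text) {
        return text.length() <= MAX_ID_LENGTH;
    }

    // Returns the text cut down to the limit (the "truncate" option).
    static String truncate(String text) {
        return isWithinLimit(text) ? text : text.substring(0, MAX_ID_LENGTH);
    }

    public static void main(String[] args) {
        String longId = "a_very_long_identifier_name_that_keeps_going";
        if (!isWithinLimit(longId)) {
            // The "complain" option: report the problem but keep compiling.
            System.err.println("identifier longer than " + MAX_ID_LENGTH
                    + " characters, truncating: " + longId);
        }
        System.out.println(truncate(longId));
    }
}
```

Keeping the check outside the grammar means an over-long identifier produces a targeted diagnostic instead of a lexer error.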
This is an example of a manual rule that limits IDs to 8 characters.
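// SUBID is a helper rule; marking it as a fragment keeps it from being emitted as a token of its own.
fragment SUBID : ('a'..'z'|'A'..'Z'|'0'..'9'|'_')
;
// One leading letter followed by at most seven SUBID characters.
ID : ('a'..'z'|'A'..'Z')
(SUBID (SUBID (SUBID (SUBID (SUBID (SUBID SUBID?)?)?)?)?)?)?
;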
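Writing the same nesting by hand for a 30-character limit is tedious and easy to get wrong, so the rule text could be generated instead. The following is a minimal sketch, assuming the SUBID helper rule from the example above. NestedRuleGenerator and nestedOptional are hypothetical names and are not part of ANTLR.

```java
// Hypothetical generator, not part of ANTLR: builds the nested-optional
// part of an ID rule for a given number of optional trailing characters.
class NestedRuleGenerator {
    // Returns an expression matching between 0 and count occurrences of
    // ruleName, e.g. "(SUBID SUBID?)?" for count == 2.
    static String nestedOptional(String ruleName, int count) {
        if (count <= 0) {
            return "";
        }
        String body = ruleName + "?"; // innermost optional occurrence
        for (int i = 1; i < count; i++) {
            body = "(" + ruleName + " " + body + ")?";
        }
        return body;
    }

    public static void main(String[] args) {
        // 30 characters in total: one leading letter plus up to 29 SUBIDs.
        System.out.println("ID : ('a'..'z'|'A'..'Z') "
                + nestedOptional("SUBID", 29) + " ;");
    }
}
```

With a count of 7 the method reproduces the nested expression of the 8-character example, and the printed rule can be pasted into the grammar.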
Personally, I'd go with the semantic solution (#1). There is very little reason these days to limit the length of identifiers in a language, and even less reason to raise a syntax error (an early abort of the compile) when such a rule is violated.