Handling scope for single and double quote strings in ANTLR4
Question
I am working with ANTLR4 and writing a grammar to handle single- and double-quoted strings. I am trying to use lexer modes to scope the strings, but that is not working out for me; my grammar is listed below. Is this the right approach, or how can I properly parse these as tokens instead of as parser rules with context? Any insight?
An example:
'single quote that contain "a double quote 'that has another single quote'"'
Lexer grammar
lexer grammar StringLexer;
fragment SQUOTE: '\'';
fragment QUOTE: '"';
SQSTR_START: SQUOTE -> pushMode(SQSTR_MODE);
DQSTR_START: QUOTE -> pushMode(DQSTR_MODE);
CONTENTS: ~["\']+;
mode SQSTR_MODE;
SQSTR_END: (CONTENTS | DQSTR_START)+ SQUOTE -> popMode;
mode DQSTR_MODE;
DQSTR_END:(CONTENTS | SQSTR_START)+ QUOTE -> popMode;
Parser
parser grammar StringParser;
options { tokenVocab=StringLexer; }
start:
dqstr | sqstr
;
dqstr:
DQSTR_START DQSTR_END
;
sqstr:
SQSTR_START SQSTR_END
;
ADDENDUM: Thanks @Lucas Trzesniewski for the answer.
This is part of a grammar I am writing to parse a shell-like language. A script can have multiple lines, each containing SQSTR and DQSTR strings. With the lexer rules provided in the answer, multiple lines of script get lumped together.
Happy case example (parsed correctly using the answer):
cmd 'single quote string'
cmd2 "double quote"
cmd3 'another single quote'
This gets recognized as three commands and three strings (single and double).
Unparsed example. Note the double quote inside the single-quoted strings:
cmd 'single "quote string'
cmd2 "double quote"
cmd3 'another "single quote'
In this case the lexer incorrectly detects all of them as a single string token of type SQSTR.
Any ideas how to address this problem?
Recommended answer
If you want to parse your example string as a single token, you don't necessarily have to use lexer modes; you can use mutually recursive lexer rules instead:
SQSTR : '\'' (~['"] | DQSTR)* '\'';
DQSTR : '"' (~['"] | SQSTR)* '"';
Then, in the parser, use something like:
str : SQSTR | DQSTR;
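To see how the two mutually recursive rules behave on the inputs from the question, here is a rough Python simulation. It is not ANTLR and not generated code; `match_string` is a hypothetical helper that assumes the rules work as written above: inside a string, the same quote character closes it, and the other quote character must open a complete nested string.

```python
def match_string(text, pos=0):
    """Mimic SQSTR/DQSTR: return the index just past the string that
    starts at text[pos], or -1 if no complete string starts there."""
    if pos >= len(text) or text[pos] not in "'\"":
        return -1
    quote = text[pos]          # the quote that opened this string
    i = pos + 1
    while i < len(text):
        ch = text[i]
        if ch == quote:        # same quote: this string ends here
            return i + 1
        if ch in "'\"":        # other quote: a nested string must follow
            i = match_string(text, i)
            if i == -1:        # nested string never closed
                return -1
        else:                  # ordinary character (newlines included)
            i += 1
    return -1                  # ran out of input before the closing quote


# The nested example from the question.
nested = '\'single quote that contain "a double quote \'that has another single quote\'"\''

# The addendum's problematic script: a stray double quote in lines 1 and 3.
script = (
    "cmd 'single \"quote string'\n"
    "cmd2 \"double quote\"\n"
    "cmd3 'another \"single quote'"
)

print(match_string(nested) == len(nested))                      # whole example is one string
print(match_string(script, script.index("'")) == len(script))   # first string swallows all three lines
```

Both checks print True. The nested example is consumed as one string, as intended. In the addendum's script, the stray double quote on the first line opens a nested string, and nothing in the rules excludes newlines, so the first SQSTR only ends at the final single quote on the third line.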