Antlr greedy-option

Question

(I edited my question based on the first comment of @Bart Kiers - thank you!)

I have the following grammar:

SPACE : (' '|'\t'|'\n'|'\r')+ {$channel = HIDDEN;};
START : 'START:';
STRING_LITERAL  : ('"' .* '"')+;
rule    :  START STRING_LITERAL;

I want to parse input such as 'START: "abcd" START: "img src="test.jpg""' (string literals can appear inside string literals).
The grammar above does not work when a string literal contains another string literal: for the input 'START: "img src="test.jpg""', the lexer produces the tokens START('START:') and STRING_LITERAL("img src="), followed by the leftover text test.jpg.
Is there a way to define a grammar that handles this?

Recommended answer

There are a few things wrong here:

  • you cannot use fragment rules inside parser rules: a fragment rule never produces a token of its own, so a grammar that declares START as a fragment will never create a START token;
  • a . (DOT) inside a parser rule matches any token, while inside a lexer rule it matches any character;
  • if you let .* match greedily (and you had defined a proper lexer rule that matches a string literal), the input START: "abcd" START: "img src="test.jpg"" would be tokenized with one large string in it: "abcd" START: "img src="test.jpg"" (the first and the last quote would be matched).
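The difference between greedy and non-greedy matching can be seen outside ANTLR as well. The following is only an analogy using Python's `re` module, not ANTLR itself: it runs a greedy and a non-greedy quote pattern over the input from the question. The variable names are made up for illustration.

```python
import re

text = 'START: "abcd" START: "img src="test.jpg""'

# Greedy: ".*" consumes as much as possible, so the match runs from the
# first quote in the input to the very last one.
greedy = re.search(r'".*"', text).group(0)

# Non-greedy: ".*?" stops at the first closing quote it can find, so the
# embedded quote ends the string early and test.jpg is left outside.
lazy = re.findall(r'".*?"', text)

print(greedy)
print(lazy)
```

Neither variant recovers the intended literal "img src="test.jpg"": the greedy pattern swallows the second START:, and the non-greedy one cuts the string at the embedded quote.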

So, you cannot embed string literals inside string literals using the same quotes. The lexer is not able to determine if a quote is meant to close the string, or if it's the start of a (new) embedded string. You will need to change that in your grammar.
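One common way to remove the ambiguity, which the answer itself does not prescribe, is to require embedded quotes to be escaped, for example as \". The sketch below assumes that convention and uses a small regex-based tokenizer in Python to stand in for the lexer. The names `TOKEN_SPEC`, `TOKEN_RE` and `tokenize` are hypothetical, and the input format differs from the one in the question.

```python
import re

# Assumption: embedded quotes are written as \" in the input.
TOKEN_SPEC = [
    ("SPACE", r"[ \t\r\n]+"),
    ("START", r"START:"),
    # A string is a quote, then any escaped character or any character
    # that is neither a quote nor a backslash, then a closing quote.
    ("STRING_LITERAL", r'"(?:\\.|[^"\\])*"'),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(text):
    """Return (token_name, token_text) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "SPACE":
            tokens.append((match.lastgroup, match.group(0)))
        pos = match.end()
    return tokens


sample = r'START: "abcd" START: "img src=\"test.jpg\""'
for token in tokenize(sample):
    print(token)
```

With escaped quotes, a bare quote always closes the string, so the tokenizer no longer has to guess. The unescaped input from the question is still rejected, because test.jpg falls outside any token.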
