我们可以使用ANTLR定义非上下文语法吗? [英] Can we define a non context-free grammar with ANTLR?

查看:228
本文介绍了我们可以使用ANTLR定义非上下文语法吗?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我对ANTLR4很新,现在我正在尝试用它来定义哪种语法。

I'm pretty new to ANTLR4 and now I'm trying to undertand which kind of grammars we might define with it.

据我所知,ANTLR中有两种规则: 解析器规则 (小写)单词)和 词法规则 (大写单词)。示例:

As far as I got, there're two kind of rules in ANTLR: parser rules (lower case words) and lexer rules (upper-case words). Example:

grammar Test;

init: prog(','prog)*;

prog: A
     | prog
     ;

A: [a-z]+;

形成语法生成规则的立场我会说解析器规则是NON-TERMINAL符号,可以替换使用词法分析器规则定义的一系列标记。

Form the grammar production rule standpoint I would say that parser rules are NON-TERMINAL symbols which can be replaced with a sequence of tokens defined by a lexer rules.

因此,语法非常清楚定义。语法生成的语言的alpahbet包含由小写拉丁字母组成的所有单词。

So, it's perfectly clear that the grammar is context-free by the definition. The alpahbet of the language generated by the grammar consists of all words making up from lower-case latin letter.

问题: 我们可以使用 ANTLR4 定义非上下文语法吗?

Question: Can we define a non context-free grammar using ANTLR4?

推荐答案

是。(咳嗽)。

据我了解,您可以在规则中添加代码。任意代码可以测试任意事物,所以答案是是。一般来说,我认为你不能用ANTLR很好地做到这一点,但这对于许多有趣的特殊情况非常实用(例如,接受除素数之外的所有数字字符串)。

It is my understanding that you can add code to the rules. Arbitrary code can test arbitrary things, so the answer is "yes". In general, I don't think you can do this well with ANTLR, but this is pretty practical for lots of interesting special cases (e.g., accept all digit strings except those that are prime numbers).

NO。

我认为如果您坚持使用ANTLR允许的语法规范,答案是否 。实际上,有一些无上下文的语法可以用ANTLR指定它无法正确处理,大多数解析器生成器也是如此。 (对于ANTLR,这包括具有间接左递归,歧义,任意前瞻等的语法。)我们甚至通过其限制的名称调用大多数这些解析器生成器,例如LL(1),LALR(k)等。 。

I think if you stick to the the syntax specification allowed by ANTLR, the answer is "no". In fact, there are context-free grammars that you can "specify" with ANTLR that it cannot process correctly, which is true of most parser generators. (For ANTLR, this includes grammars with indirect left recursion, ambiguity, arbitrary lookahead, etc.) We even call most of these parser generators by the names of their "limitations", e.g., LL(1), LALR(k), etc.

哪些可以完全上下文?

一些解析器生成器可以处理完整的,无上下文的语法。我想到了Earley和CYK解析器,但它们并不是很快,所以人们往往会避免使用它们。 GLR解析器可以做到这一点(我们在我们的工具中使用它,因为它真的有助于为真实语言编写语法[参见我的生物]但是有一些语法使它们很慢;你可以大多避免这些。显然GLL解析方案存在并且是也完全没有上下文;我希望它们在某些钝语法中也有性能问题,但在实践中也很有用。

A few parser generators can handle full-, context free grammars. Earley and CYK parsers come to mind, but they aren't very fast so people tend to avoid using them. GLR parsers can do this (we use this in our tools because it really aids writing grammars for real languages [see my bio] but there are some grammars that make them pretty slow; you can mostly avoid these. Apparently GLL parsing schemes exist and are also full context free; I'd expect them to have performance troubles with some obtuse grammars too, but also to be pretty usable in practice.

我唯一的解析器生成器听说过可以做各种上下文敏感的语法的是 MetaS 。我从来没用过它,但它背后的理论非常令人印象深刻。声称它可以做任意的上下文敏感的语法;对于任意讨厌的语法,它会遇到极高的成本,但实际上这并不是实际的反对意见。

The only parser generator I've heard of which can do a variety of context-sensitive grammars is MetaS. I've never used it, but the theory behind it is pretty impressive. The claim is that it can do arbitrary context-sensitive grammars; it will run into extremely high costs for arbitrarily nasty grammars, but that's actually not an objection in practice.

这篇关于我们可以使用ANTLR定义非上下文语法吗?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆