我们可以用 ANTLR 定义一个非上下文无关文法吗? [英] Can we define a non context-free grammar with ANTLR?

查看:24
本文介绍了我们可以用 ANTLR 定义一个非上下文无关文法吗?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我对 ANTLR4 还是很陌生,现在我正在尝试了解我们可以用它定义哪种语法.

I'm pretty new to ANTLR4 and now I'm trying to undertand which kind of grammars we might define with it.

据我所知,ANTLR 中有两种规则:解析器规则(小写单词)和 词法分析器规则(大写单词).示例:

As far as I got, there're two kind of rules in ANTLR: parser rules (lower case words) and lexer rules (upper-case words). Example:

grammar Test;

init: prog(','prog)*;

prog: A
     | prog
     ;

A: [a-z]+;

从语法产生式规则的角度来看,我会说解析器规则是非终止符号,可以用词法分析器规则定义的一系列标记替换.

Form the grammar production rule standpoint I would say that parser rules are NON-TERMINAL symbols which can be replaced with a sequence of tokens defined by a lexer rules.

因此,根据定义,很明显语法是上下文无关的.语法生成的语言字母表由所有由小写拉丁字母组成的单词组成.

So, it's perfectly clear that the grammar is context-free by the definition. The alpahbet of the language generated by the grammar consists of all words making up from lower-case latin letter.

问题:我们可以使用 ANTLR4 定义一个非上下文无关的语法吗?

Question: Can we define a non context-free grammar using ANTLR4?

推荐答案

YES.(咳嗽).

据我了解,您可以向规则中添加代码.任意代码可以测试任意事物,所以答案是是".一般来说,我不认为你可以用 ANTLR 很好地做到这一点,但是这对于许多有趣的特殊情况非常实用(例如,接受除质数之外的所有数字字符串).

It is my understanding that you can add code to the rules. Arbitrary code can test arbitrary things, so the answer is "yes". In general, I don't think you can do this well with ANTLR, but this is pretty practical for lots of interesting special cases (e.g., accept all digit strings except those that are prime numbers).

否.

我认为如果您坚持 ANTLR 允许的语法规范,答案是否定的.事实上,您可以使用 ANTLR 指定"一些上下文无关文法,它无法正确处理,大多数解析器生成器都是如此.(对于 ANTLR,这包括具有间接左递归、歧义、任意前瞻等的语法.)我们甚至通过它们的限制"的名称来调用大多数这些解析器生成器,例如,LL(1)、LALR(k) 等.

I think if you stick to the the syntax specification allowed by ANTLR, the answer is "no". In fact, there are context-free grammars that you can "specify" with ANTLR that it cannot process correctly, which is true of most parser generators. (For ANTLR, this includes grammars with indirect left recursion, ambiguity, arbitrary lookahead, etc.) We even call most of these parser generators by the names of their "limitations", e.g., LL(1), LALR(k), etc.

哪些可以完全无上下文?

一些解析器生成器可以处理完整的上下文无关文法.想到了 Earley 和 CYK 解析器,但它们不是很快,所以人们倾向于避免使用它们.GLR 解析器可以做到这一点(我们在我们的工具中使用它,因为它确实有助于为真实语言编写语法 [参见我的简介],但有一些语法使它们变得非常缓慢;你基本上可以避免这些.显然 GLL 解析方案存在并且正在也完全无上下文;我希望它们也有一些钝语法的性能问题,但在实践中也非常有用.

A few parser generators can handle full-, context free grammars. Earley and CYK parsers come to mind, but they aren't very fast so people tend to avoid using them. GLR parsers can do this (we use this in our tools because it really aids writing grammars for real languages [see my bio] but there are some grammars that make them pretty slow; you can mostly avoid these. Apparently GLL parsing schemes exist and are also full context free; I'd expect them to have performance troubles with some obtuse grammars too, but also to be pretty usable in practice.

我听说过的唯一可以处理各种上下文相关语法的解析器生成器是 MetaS.我从未使用过它,但它背后的理论令人印象深刻.声称它可以执行任意上下文相关的语法;任意讨厌的语法会带来极高的成本,但这实际上并不反对.

The only parser generator I've heard of which can do a variety of context-sensitive grammars is MetaS. I've never used it, but the theory behind it is pretty impressive. The claim is that it can do arbitrary context-sensitive grammars; it will run into extremely high costs for arbitrarily nasty grammars, but that's actually not an objection in practice.

这篇关于我们可以用 ANTLR 定义一个非上下文无关文法吗?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆