Ebnf –这是LL(1)语法吗? [英] Ebnf – Is this an LL(1) grammar?

查看：284 发布时间：2020/5/25 1:27:58 parsing ebnf ll

本文介绍了Ebnf –这是LL(1)语法吗?的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我在维基百科上找到以下 EBNF ，描述了EBNF:

I found the following EBNF on wikipedia, describing EBNF:

letter = "A" | "B" | "C" | "D" | "E" | "F" | "G"
   | "H" | "I" | "J" | "K" | "L" | "M" | "N"
   | "O" | "P" | "Q" | "R" | "S" | "T" | "U"
   | "V" | "W" | "X" | "Y" | "Z" ;
digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" ;
symbol = "[" | "]" | "{" | "}" | "(" | ")" | "<" | ">"
   | "'" | '"' | "=" | "|" | "." | "," | ";" ;
character = letter | digit | symbol | "_" ;

identifier = letter , { letter | digit | "_" } ;
terminal = "'" , character , { character } , "'" 
     | '"' , character , { character } , '"' ;

lhs = identifier ;
rhs = identifier
 | terminal
 | "[" , rhs , "]"
 | "{" , rhs , "}"
 | "(" , rhs , ")"
 | rhs , "|" , rhs
 | rhs , "," , rhs ;

rule = lhs , "=" , rhs , ";" ;
grammar = { rule } ;

现在，由于我对解析器和语法的了解有限，因此我不知道这是否是LL(1)语法.我试图为此编写一个解析器，但是在尝试读取rhs时会失败，后者会重新读取自身，然后再次读取自身，哦，您明白了...

Now, because of my limited knowledge on parsers and grammars, I don't know if this is an LL(1) grammar. I tried to write a parser for it, but it fails when trying to read rhs, which reads itself again, which reads itself again, which oh, you got it...

这是LL(1)语法吗?
如果没有，如何将其变成一体(可能?)?

推荐答案

引用的维基百科摘录不是EBNF的正确EBNF语法.它也不是左解析的:的确，它是模棱两可的，因此根本不是绝对可解析的.

The quoted Wikipedia extract is not a correct EBNF grammar for EBNF. It's also not left-parseable: indeed, it is ambiguous, so it's not unambiguously parseable at all.

通常，术语LL(k)和LR(k)(以及许多其他此类术语)适用于上下文无关文法(CFG)(以及扩展地，这些语法生成的语言) ). EBNF并不是描述CFG的形式主义.它被设计为描述上下文无关语言的形式主义，因此应该可以从给定的EBNF语法创建CFG(但请参见注释1)，但是EBNF语法规则与单个语法之间没有直接对应关系.在CFG中进行生产.

In general, the terms LL(k) and LR(k) (and many other such terms) apply to Context-Free Grammars (CFGs) (and, by extension, the languages generated by those grammars). EBNF is not a formalism for describing CFGs. It is designed to be a formalism to describe context-free languages and therefore it should be possible to create a CFG from a given EBNF grammar (but see Note 1), but there is not a direct correspondence between an EBNF syntax rule and a single production in a CFG.

也就是说，通常可以使用一些标准转换直接创建CFG.例如:

That said, you can usually directly create a CFG by using some standard transformations. For example:

{ ... }

可以用生成的非末端M''替换，并加上以下内容:(ε是空字符串)

can be substituted with the generated non-terminal M'', with the addition of the following productions: (ε is the empty string)

M'  → ...
M'' → ε
M'' → M' M''

上述变换不会引入左递归，因此不会人为地使原始语法变为非LL(1).

The above transformation does not introduce left-recursion, so it does not artificially make the original grammar non-LL(1).

引文语法中最重要的错误[注2]是不明确的EBNF规则:

The most important error in the cited grammar [Note 2] is the ambigous EBNF rule:

rhs = identifier
    | terminal
    | "[" , rhs , "]"
    | "{" , rhs , "}"
    | "(" , rhs , ")"
    | rhs , "|" , rhs
    | rhs , "," , rhs
    ;

它也是左递归的，因此它将不对应于LL(1)CFG.但更重要的是，它不表示|和,运算符的关联性或优先级. (从语义上讲，这些运算符没有定义的关联性，但是语法仍应指定一个；否则，就不可能明确创建一个解析树.这两个运算符之间的优先级在语义上很重要.)

It's also left-recursive, so it will not correspond to an LL(1) CFG. But more importantly, it does not indicate either the associativity or the precedence of the | and , operators. (Semantically, these operators do not have a defined associativity, but the syntax should still specify one; otherwise, it is impossible to unambiguously create a parse tree. The precedence between the two operators is important semantically.)

一组更好的规则是:

primary = identifier
        | terminal
        | "[" , rhs , "]"
        | "{" , rhs , "}"
        | "(" , rhs , ")"
        ;
factor  = primary , { "|" , primary } ;
rhs     = factor ,  { "," , factor } ;

这仍然是一个过分的简化，但是它涵盖了许多用例.它既不是模棱两可的，也不是左递归的. [3]

This is still an oversimplification, but it covers a large number of use cases. It's neither ambiguous nor left-recursive. [3]

注释

syntactic exception =

   ? a syntactic-factor that could be replaced

     by a syntactic-factor containing no

     meta-identifiers

   ?

定义

Syntactic constraints specified in comments may not be easy to translate into CFGs, though. For example, the ISO standard EBNF for EBNF defines the non-terminal "syntactic exception" as follows:

syntactic exception = ? a syntactic-factor that could be replaced by a syntactic-factor containing no meta-identifiers ?

The intention of the above text is to restrict an exception to be a regular language. That's important, since the set difference between two context-free languages is not necessarily context-free, while the difference between a context-free language and a regular language is provably context-free. Ironically, the "special sequence" describing this restriction cannot be expressed as a context-free grammar because it depends on the definition of meta-identifiers. (If it had said "a syntactic-factor containing no meta-identifiers" it would be easy to write without resorting to a special sequence, but clearly that was not the intent.)

Wikipedia摘录中还有另一个重要错误.它将两种类型的带引号的字符串定义为具有相同的主体，但这是不正确的.双引号字符串不能包含双引号字符，单引号字符串不能包含单引号字符.因此，在这两个定义中都使用标识符character是不正确的.

There is another important error in the Wikipedia excerpt. It defines both types of quoted strings as having the same body, but that's not correct; a double-quoted string cannot include double-quote characters, and a single-quoted string cannot contain single-quote characters. So the use of the identifier character in both of those definitions is incorrect.

正式的EBNF语法允许primary为空.我省去了，因为通常不需要.

The formal EBNF grammar allows primary to be empty. I left that out, because it's not usually needed.

这篇关于Ebnf –这是LL(1)语法吗?的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

Ebnf –这是LL(1)语法吗? [英] Ebnf – Is this an LL(1) grammar?

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录关闭

Ebnf –这是LL(1)语法吗? [英] Ebnf – Is this an LL(1) grammar?

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录 关闭

登录关闭