Scala Parser Combinators tricks for recursive BNF?

Question

I'm trying to match this syntax:

pgm ::= exprs
exprs ::= expr [; exprs]
expr ::= ID | expr . [0-9]+

My Scala packrat parser combinator looks like this:

import scala.util.parsing.combinator.PackratParsers
import scala.util.parsing.combinator.syntactical._

object Dotter extends StandardTokenParsers with PackratParsers {
    lexical.delimiters ++= List(".",";")
    def pgm = repsep(expr,";")
    def expr :Parser[Any]= ident | expr~"."~num
    def num = numericLit

    def parse(input: String) =
      phrase(pgm)(new PackratReader(new lexical.Scanner(input))) match {
        case Success(result, _) => println("Success!"); Some(result)
        case n @ _ => println(n); println("bla"); None
      }

    def main(args: Array[String]) {
      val prg = "x.1.2.3;" +
            "y.4.1.1;" +
            "z;" +
            "n.1.10.30"


            parse(prg);
    }
}

But this doesn't work. Either it "matches greedily" and tells me:

[1.2] failure: end of input expected 
x.1.2.3;y.4.1.1;z;n.1.10.30

or, if I change the | to |||, I get a stack overflow:

Exception in thread "main" java.lang.StackOverflowError
at java.lang.Character.isLetter(Unknown Source)
at java.lang.Character.isLetter(Unknown Source)
at scala.util.parsing.combinator.lexical.Lexical$$anonfun$letter$1.apply(Lexical.scala:32)
at scala.util.parsing.combinator.lexical.Lexical$$anonfun$letter$1.apply(Lexical.scala:32)
...

I kind of understand why I get the errors; what can I do to parse a syntax like the one above? It doesn't seem that esoteric to me.

Based on the paper referenced in http://scala-programming-language.1934581.n4.nabble.com/Packrat-parser-guidance-td1956908.html, I found out that my program didn't actually use the new packrat parser.

That is, change Parser[Any] to PackratParser[Any] and use lazy val instead of def.

I rewrote the above to this:

import scala.util.parsing.combinator.PackratParsers
import scala.util.parsing.combinator.syntactical._

object Dotter extends StandardTokenParsers with PackratParsers {
    lexical.delimiters ++= List(".",";")
    lazy val pgm : PackratParser[Any] = repsep(expr,";")
    lazy val expr :PackratParser[Any]= expr~"."~num | ident
    lazy val num = numericLit

    def parse(input: String) =
    phrase(pgm)(new PackratReader(new lexical.Scanner(input))) match {
      case Success(result, _) => println("Success!"); Some(result)
      case n @ _ => println(n);println("bla"); None
    }  

    def main(args: Array[String]) {
      val prg = "x.1.2.3 ;" +
            "y.4.1.1;" +
            "z;" +
            "n.1.10.30"


            parse(prg);
    }
}

Recommended answer

The problem is (at least partially) that you're not actually using Packrat parsers. See the documentation for Scala's PackratParsers trait, which says:

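Using PackratParsers is very similar to using Parsers:

  • Any class/trait that extends Parsers (directly or through a subclass) can mix in PackratParsers. Example: object MyGrammar extends StandardTokenParsers with PackratParsers
  • Each grammar production previously declared as a def without formal parameters becomes a lazy val, and its type is changed from Parser[Elem] to PackratParser[Elem]. So, for example, def production: Parser[Int] = {...} becomes lazy val production: PackratParser[Int] = {...}
  • Important: using PackratParsers is not an all-or-nothing decision. They can be freely mixed with regular Parsers in a single grammar.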
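
As a minimal illustration of the rule quoted above, here is a sketch on a hypothetical subtraction grammar (not the asker's grammar). The names SubtractionSketch, diff, num, and eval are invented for this example, and it assumes the scala-parser-combinators library is on the classpath. The left-recursive production is a lazy val of type PackratParser, while the number parser stays an ordinary def, showing that the two kinds can be mixed.

```scala
import scala.util.parsing.combinator.{PackratParsers, RegexParsers}
import scala.util.parsing.input.CharSequenceReader

// Hypothetical example grammar: diff ::= diff "-" num | num
object SubtractionSketch extends RegexParsers with PackratParsers {
  // Ordinary parser, left as a def: regular and packrat parsers can be mixed.
  def num: Parser[Int] = """\d+""".r ^^ (_.toInt)

  // Left-recursive production: declared as a lazy val of type PackratParser.
  lazy val diff: PackratParser[Int] =
    diff ~ ("-" ~> num) ^^ { case left ~ right => left - right } | num

  // Wrap the input in a PackratReader so that memoization is available.
  def eval(input: String): Option[Int] =
    parseAll(diff, new PackratReader(new CharSequenceReader(input))) match {
      case Success(value, _) => Some(value)
      case _                 => None
    }
}
```

Because the grammar is left-recursive, subtraction associates to the left, so "10-3-2" is read as (10-3)-2.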

I don't know enough about Scala 2.8's parser combinators to fix this entirely, but with the following modifications I was able to get it to parse as far as the semicolon, which is an improvement over what you've accomplished.

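import scala.util.parsing.combinator.PackratParsers
import scala.util.parsing.combinator.syntactical._

object Dotter extends StandardTokenParsers with PackratParsers {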
    lexical.delimiters ++= List(".",";")
    lazy val pgm:PackratParser[Any] = repsep(expr,";")
    lazy val expr:PackratParser[Any]= ident ||| (expr~"."~numericLit)

    def parse(input: String) = phrase(expr)(lex(input)) match {
      case Success(result, _) => println("Success!"); Some(result)
      case n @ _ => println(n);println("bla"); None
    }  

    def lex(input:String) = new PackratReader(new lexical.Scanner(input))
}
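
Note that the parse method above runs phrase(expr) rather than phrase(pgm), which would explain why it stops at the first semicolon. The following is a hedged sketch of two ways to parse the whole program, not a definitive fix. The names DotterSketch, exprIter, pgmIter, run, and parseIter are invented for illustration, and the code assumes the scala-parser-combinators library is available. The first variant keeps the left-recursive production as a packrat lazy val; the second removes the left recursion by reading an identifier followed by zero or more ".number" suffixes, which needs no packrat support at all.

```scala
import scala.util.parsing.combinator.PackratParsers
import scala.util.parsing.combinator.syntactical.StandardTokenParsers

object DotterSketch extends StandardTokenParsers with PackratParsers {
  lexical.delimiters ++= List(".", ";")

  // Variant 1: left-recursive production as a packrat lazy val.
  lazy val expr: PackratParser[Any] = expr ~ "." ~ numericLit | ident
  lazy val pgm: PackratParser[List[Any]] = repsep(expr, ";")

  // Variant 2: no left recursion; an identifier followed by ".<number>" suffixes.
  def exprIter: Parser[(String, List[String])] =
    ident ~ rep("." ~> numericLit) ^^ { case id ~ nums => (id, nums) }
  def pgmIter: Parser[List[(String, List[String])]] = repsep(exprIter, ";")

  // Shared driver: phrase requires the whole input to be consumed.
  private def run[T](p: Parser[T], input: String): Option[T] =
    phrase(p)(new PackratReader(new lexical.Scanner(input))) match {
      case Success(result, _) => Some(result)
      case _                  => None
    }

  def parse(input: String): Option[List[Any]] = run(pgm, input)
  def parseIter(input: String): Option[List[(String, List[String])]] = run(pgmIter, input)
}
```

Calling DotterSketch.parse or DotterSketch.parseIter on "x.1.2.3;y.4.1.1;z;n.1.10.30" should yield one entry per semicolon-separated expression. The second variant returns a flat (identifier, numbers) pair per expression instead of a left-nested tree, which is often easier to consume.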
