Scala Parser Issues


Problem description

I am having trouble trying out the Scala parser combinator library on a simple Book DSL.

First, there is a Book class:

case class Book (name:String,isbn:String) {
  def getNiceName():String = name+" : "+isbn
}

Next, there is the simple parser:

object BookParser extends StandardTokenParsers {
  lexical.reserved += ("book","has","isbn")

  def bookSpec  = "book" ~> stringLit ~> "has" ~> "isbn" ~> stringLit ^^ {
            case "book" ~ name ~ "has" ~ "isbn" ~ isbn => new Book(name,isbn) }

  def parse (s: String) = {
    val tokens = new lexical.Scanner(s)
    phrase(bookSpec)(tokens)
  }

  def test (exprString : String) = {
     parse (exprString) match {
         case Success(book) => println("Book"+book.getNiceName())
     }
  }

  def main (args: Array[String]) = {
     test ("book ABC has isbn DEF")
  }   
}

I'm getting a range of errors when I try to compile this, some of which seem strange to me when I compare my code with other examples on the internet. For example, the bookSpec function appears nearly identical to those examples.

Is this the best way to build a simple parser like this?

Thanks

Recommended answer

You're on the right track. There are a few issues in your parser. I'll post the corrected code, then explain the changes.

import scala.util.parsing.combinator._
import scala.util.parsing.combinator.syntactical._

case class Book (name: String, isbn: String) {
  def niceName = name + " : " + isbn
}


object BookParser extends StandardTokenParsers {
  lexical.reserved += ("book","has","isbn")

  def bookSpec: Parser[Book]  = "book" ~ ident ~ "has" ~ "isbn" ~ ident ^^ {
            case "book" ~ name ~ "has" ~ "isbn" ~ isbn => new Book(name, isbn) }

  def parse (s: String) = {
    val tokens = new lexical.Scanner(s)
    phrase(bookSpec)(tokens)
  }

  def test (exprString : String) = {
     parse (exprString) match {
       case Success(book, _) => println("Book: " + book.niceName)
       case Failure(msg, _) => println("Failure: " + msg)
       case Error(msg, _) => println("Error: " + msg)
     }
  }

  def main (args: Array[String]) = {
     test ("book ABC has isbn DEF")
  }   
}

1. Return value of the parser

In order to return a Book from the parser, you need to give the type inferencer some help. I changed the definition of the bookSpec function to be explicit: it returns a Parser[Book]. That is, it returns an object which is a parser for books.

2. stringLit

The stringLit function you used comes from the StdTokenParsers trait. stringLit returns a Parser[String], but the pattern it matches includes the double quotes that most languages use to delimit a string literal. If you are happy with double-quoting words in your DSL, then stringLit is what you want. In the interest of simplicity, I replaced stringLit with ident. ident looks for a Java-language identifier. This isn't really the right format for ISBNs, but it did pass your test case. :-)
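If you do want quoted values, the stringLit variant could look roughly like the sketch below. QuotedBookParser and parseBook are names made up for this illustration, and the input would then have to be written as book "ABC" has isbn "0-306-40615-2".

```scala
import scala.util.parsing.combinator.syntactical.StandardTokenParsers

case class Book(name: String, isbn: String)

// Hypothetical variant of the parser that expects double-quoted values.
object QuotedBookParser extends StandardTokenParsers {
  lexical.reserved ++= List("book", "has", "isbn")

  // stringLit yields the text between the quotes, without the quotes.
  def bookSpec: Parser[Book] =
    "book" ~ stringLit ~ "has" ~ "isbn" ~ stringLit ^^ {
      case _ ~ name ~ _ ~ _ ~ isbn => Book(name, isbn)
    }

  def parseBook(s: String): ParseResult[Book] =
    phrase(bookSpec)(new lexical.Scanner(s))
}
```

A side effect of quoting is that hyphenated ISBNs fit into a single token, which ident would not accept.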

To match ISBNs correctly, I think you'll need to use a regular expression instead of ident.
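One way to do that is to switch from StandardTokenParsers to RegexParsers, which works on characters rather than tokens. The following is only a sketch under assumptions that are not part of the original code: RegexBookParser and parseBook are invented names, a book name is a single alphanumeric word, and the ISBN pattern is deliberately loose (digits and hyphens, optional trailing X) with no checksum validation.

```scala
import scala.util.parsing.combinator.RegexParsers

case class Book(name: String, isbn: String)

// Hypothetical regex-based variant; whitespace is skipped by default.
object RegexBookParser extends RegexParsers {
  // Assumption: a name is one word of letters and digits.
  def name: Parser[String] = """[A-Za-z][A-Za-z0-9]*""".r

  // Assumption: loose ISBN-10/13 shape, hyphens allowed, not validated.
  def isbn: Parser[String] = """[0-9][0-9-]{8,15}[0-9Xx]""".r

  def bookSpec: Parser[Book] =
    ("book" ~> name) ~ ("has" ~> "isbn" ~> isbn) ^^ {
      case n ~ i => Book(n, i)
    }

  def parseBook(s: String): ParseResult[Book] = parseAll(bookSpec, s)
}
```

With this sketch, an input such as book ABC has isbn 978-0-13-468599-1 should parse, while an ISBN made of letters such as DEF is rejected.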

3. Ignoring the left side of a sequence

Your matcher used a chain of ~> combinators. ~> takes two Parser[_] objects and returns a Parser that recognizes both in sequence but keeps only the result of the right-hand side. By chaining them all the way to your final stringLit, your parser would discard everything except the final word in the sentence. That means it would throw away the book name, too.

Also, when you use ~> or <~, the ignored tokens should not appear in your pattern match.

For simplicity, I changed all of these to the plain sequence combinator ~ and left the extra tokens in the pattern match.
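If you would rather drop the keywords, ~> can still be used as long as each kept value is grouped with the keywords in front of it. Here is a sketch of that alternative; TerseBookParser and parseBook are names invented for the example.

```scala
import scala.util.parsing.combinator.syntactical.StandardTokenParsers

case class Book(name: String, isbn: String)

// Hypothetical variant that discards the keywords with ~>.
object TerseBookParser extends StandardTokenParsers {
  lexical.reserved ++= List("book", "has", "isbn")

  // Each parenthesized group keeps only its right-most result,
  // so the pattern match sees just the name and the ISBN.
  def bookSpec: Parser[Book] =
    ("book" ~> ident) ~ ("has" ~> "isbn" ~> ident) ^^ {
      case name ~ isbn => Book(name, isbn)
    }

  def parseBook(s: String): ParseResult[Book] =
    phrase(bookSpec)(new lexical.Scanner(s))
}
```

The parentheses matter: without them the operators associate to the left and the name would be discarded again.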

4. Matching the results

The test method needs to match all the possible results of the parse() function, so I added the Failure() and Error() cases. Also, Success carries both your return value and the Reader object holding the remaining input. We don't care about the reader, so I used "_" to ignore it in the pattern match.
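If you want the caller to decide what to do with a failure instead of printing inside the parser object, the same three-way match can return a value. This is a sketch only; CheckedBookParser, parseBook and the Either return type are choices made for the illustration, not part of the code above.

```scala
import scala.util.parsing.combinator.syntactical.StandardTokenParsers

case class Book(name: String, isbn: String)

// Hypothetical variant that reports the outcome as an Either.
object CheckedBookParser extends StandardTokenParsers {
  lexical.reserved ++= List("book", "has", "isbn")

  def bookSpec: Parser[Book] =
    "book" ~ ident ~ "has" ~ "isbn" ~ ident ^^ {
      case _ ~ name ~ _ ~ _ ~ isbn => Book(name, isbn)
    }

  // Right holds the parsed book; Left holds a message with the position
  // of the input at which parsing stopped.
  def parseBook(s: String): Either[String, Book] =
    phrase(bookSpec)(new lexical.Scanner(s)) match {
      case Success(book, _)   => Right(book)
      case Failure(msg, next) => Left(s"Failure at ${next.pos}: $msg")
      case Error(msg, next)   => Left(s"Error at ${next.pos}: $msg")
    }
}
```

Here the second field of Failure and Error is put to use: it is the remaining input, and its pos tells you where the problem was found.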

Hope this helps!
