How to further improve error messages in Scala parser-combinator based parsers?


Problem description

I've coded a parser based on Scala parser combinators:

class SxmlParser extends RegexParsers with ImplicitConversions with PackratParsers {
    [...]
    lazy val document: PackratParser[AstNodeDocument] =
        ((procinst | element | comment | cdata | whitespace | text)*) ^^ {
            AstNodeDocument(_)
        }
    [...]
}
object SxmlParser {
    def parse(text: String): AstNodeDocument = {
        var ast = AstNodeDocument()
        val parser = new SxmlParser()
        val result = parser.parseAll(parser.document, new CharArrayReader(text.toArray))
        result match {
            case parser.Success(x, _) => ast = x
            case parser.NoSuccess(err, next) => {
                tool.die("failed to parse SXML input " +
                    "(line " + next.pos.line + ", column " + next.pos.column + "):\n" +
                    err + "\n" +
                    next.pos.longString)
            }
        }
        ast
    }
}

Usually the resulting parsing error messages are rather nice. But sometimes it becomes just

sxml: ERROR: failed to parse SXML input (line 32, column 1):
`"' expected but `' found
^

This happens if a quote character is not closed and the parser reaches the EOT. What I would like to see here is (1) what production the parser was in when it expected the '"' (I have multiple ones) and (2) where in the input this production started parsing (which indicates where the opening quote is in the input). Does anybody know how I can improve the error messages and include more information about the actual internal parsing state when the error happens (perhaps something like a production rule stack trace, or whatever can reasonably be given here to better identify the error location)? BTW, the above "line 32, column 1" is actually the EOT position and hence of no use here, of course.

Solution

I don't know yet how to deal with (1), but I was also looking for (2) when I found this webpage:

https://wiki.scala-lang.org/plugins/viewsource/viewpagesrc.action?pageId=917624

I'm just copying the information:

A useful enhancement is to record the input position (line number and column number) of the significant tokens. To do this, you must do three things:

  • Make each output type extend scala.util.parsing.input.Positional
  • Invoke the Parsers.positioned() combinator
  • Use a text source that records line and column positions

and

Finally, ensure that the source tracks positions. For streams, you can simply use scala.util.parsing.input.StreamReader; for Strings, use scala.util.parsing.input.CharArrayReader.

I'm currently playing with it, so I'll try to add a simple example later.
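
In the meantime, here is a minimal sketch of the three steps quoted above. It is not the SxmlParser from the question: the Attr node, the AttrParser object and its toy name="value" grammar are hypothetical names made up for illustration, and the code assumes the scala-parser-combinators module is on the classpath.

```scala
import scala.util.parsing.combinator.RegexParsers
import scala.util.parsing.input.{CharArrayReader, Positional}

// Hypothetical AST node (not from the question). Extending Positional
// gives it a `pos` field that positioned() fills in.
case class Attr(name: String, value: String) extends Positional

// Hypothetical toy grammar: name="value" pairs separated by whitespace.
object AttrParser extends RegexParsers {
  lazy val ident: Parser[String] = """[A-Za-z]+""".r

  // positioned() records the input position at which `attr` started
  // (after leading whitespace has been skipped) on the resulting node.
  lazy val attr: Parser[Attr] = positioned {
    ident ~ ("=" ~> "\"" ~> """[^"]*""".r <~ "\"") ^^ { case n ~ v => Attr(n, v) }
  }

  lazy val attrs: Parser[List[Attr]] = rep(attr)

  // CharArrayReader tracks offsets, from which line and column are derived.
  def parse(text: String): ParseResult[List[Attr]] =
    parseAll(attrs, new CharArrayReader(text.toCharArray))
}

object Demo extends App {
  AttrParser.parse("a=\"1\"\n  b=\"2\"") match {
    case AttrParser.Success(nodes, _) =>
      // Each node now knows where its production started parsing.
      nodes.foreach(n => println(s"${n.name} at line ${n.pos.line}, column ${n.pos.column}"))
    case AttrParser.NoSuccess(msg, next) =>
      println(s"failed at ${next.pos}: $msg")
  }
}
```

Note that positioned() only sets `pos` on results that parse successfully, so on its own it does not change the text of a failure message. It gives you start positions for the nodes that were built before the error occurred.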
