Scala parser that sometimes skips whitespace and sometimes does not
Problem description
I've got a working Scala parser, but the solution is not as clean as I would like. The problem is that some of the productions must treat whitespace as part of the token, while the "higher-level" productions should be able to ignore/skip whitespace.
If I use the typical Scala parser pattern of extending the lower-level parsers, the skipWhitespace setting is inherited and things get messy very quickly.
I think I would be better off not using the extends approach and instead having an instance of the low-level parser available in the higher-level parser's class, but I'm not sure how to make that work, given that every instance would have to read from the same single stream of input characters.
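Here is part of the lowest-level parser:
class VulgarFractionParser extends RegexParsers {
  override type Elem = Char
  // An empty whiteSpace pattern switches whitespace skipping off for this parser and every subclass.
  override val whiteSpace = "".r
  // ...
}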
Then I extend that like this:
class NumberParser extends VulgarFractionParser with Positional {
But at this point the NumberParser must explicitly handle whitespace, just like the VulgarFractionParser. For the NumberParser that is still pretty manageable, but at the next level up I really want to be able to define productions that do use whitespace as a separator, just like a normal RegexParsers parser would.
An example would be something like:
IBM 33.33/ 1200.00
or
IBM 33.33/33.50 1200.00
The second value sometimes has two parts separated by a "/", and sometimes has only a single part with nothing after the slash (or no slash at all).
def bidOrAskPrice = ("$"?) ~> (bidOrAskPrice1 | bidOrAskPrice2 | bidOrAskPrice3)
def bidOrAskPrice1 = number ~ ("/".r) ~ number ~ (SPACES) ^^ {
  case a ~ slash ~ b ~ sp1 => BidOrAsk(a, Some(b))
}
def bidOrAskPrice2 = (number ~ "/" ~ (SPACES)) ^^ { case a ~ slash ~ sp => BidOrAsk(a, None) }
def bidOrAskPrice3 = (number ~ (SPACES?)) ^^ { case a ~ sp => BidOrAsk(a, None) }
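One thing that may help with the inheritance problem: RegexParsers skips whitespace only immediately before it matches a literal or a regex, never inside a single regex match. So a whitespace-sensitive token can be written as one regex while whiteSpace keeps its default value. The following is a minimal sketch of that idea, not the code from the question. The object name MixedNumberParser and the token shape "whole, one space, numerator/denominator" are assumptions made for illustration.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical parser: whiteSpace is NOT overridden, so separators
// between tokens are skipped in the usual RegexParsers way.
object MixedNumberParser extends RegexParsers {
  // Assumed token shape: "2 1/2". The single space sits inside the regex,
  // so it is part of the token and is not skipped.
  private val mixedRe = """(\d+) (\d+)/(\d+)""".r

  def mixed: Parser[Double] = mixedRe ^^ {
    case mixedRe(whole, num, den) => whole.toDouble + num.toDouble / den.toDouble
  }

  def decimal: Parser[Double] = """\d+(\.\d+)?""".r ^^ (_.toDouble)

  // Try the longer, whitespace-sensitive form first.
  def number: Parser[Double] = mixed | decimal

  // Higher-level production: whitespace between numbers is just a separator.
  def numbers: Parser[List[Double]] = rep(number)
}
```

With this layout, `MixedNumberParser.parseAll(MixedNumberParser.numbers, "2 1/2   3.25  7")` should treat "2 1/2" as one number and the remaining whitespace as separators. Whether this fits depends on how the real fraction tokens look.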
Doesn't it make more sense to turn the first parser into a token parser (a lexer, really), and make the second parser read that instead of plain Char?
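A rough sketch of what such a two-stage setup could look like, assuming the scala-parser-combinators library and Scala 2.12 or later (for right-biased Either). All names here (Token, PricePair, QuoteLexer, TokenReader, QuoteParser, Quote) are hypothetical and not taken from the question. The lexer decides where whitespace matters: a price such as "33.33/" or "33.33/33.50" becomes one token, so the slash has to follow the number directly. The second stage sees only tokens and never deals with whitespace.

```scala
import scala.util.parsing.combinator.{Parsers, RegexParsers}
import scala.util.parsing.input.{NoPosition, Position, Reader}

// Hypothetical token and result types for this sketch.
sealed trait Token
case class Word(text: String) extends Token
case class Num(value: BigDecimal) extends Token
case class PricePair(bid: BigDecimal, ask: Option[BigDecimal]) extends Token
case object Dollar extends Token

case class BidOrAsk(bid: BigDecimal, ask: Option[BigDecimal])
case class Quote(symbol: String, price: BidOrAsk, volume: BigDecimal)

// Stage 1: characters to tokens. Whitespace is skipped between tokens only.
object QuoteLexer extends RegexParsers {
  // number, "/", optional number, with no whitespace allowed in between
  private val pairRe = """(\d+(?:\.\d+)?)/(\d+(?:\.\d+)?)?""".r

  def pair: Parser[Token] = pairRe ^^ {
    // an unmatched optional group comes back as null, hence Option(ask)
    case pairRe(bid, ask) => PricePair(BigDecimal(bid), Option(ask).map(BigDecimal(_)))
  }
  def num: Parser[Token] = """\d+(?:\.\d+)?""".r ^^ (s => Num(BigDecimal(s)))
  def word: Parser[Token] = """[A-Za-z]+""".r ^^ (s => Word(s))
  def dollar: Parser[Token] = "$" ^^^ Dollar

  def tokens: Parser[List[Token]] = rep(pair | num | word | dollar)

  def lex(in: String): Either[String, List[Token]] = parseAll(tokens, in) match {
    case Success(ts, _)  => Right(ts)
    case Failure(msg, _) => Left(msg)
    case Error(msg, _)   => Left(msg)
  }
}

// Minimal Reader over a token list; positions are not tracked in this sketch.
class TokenReader(tokens: List[Token]) extends Reader[Token] {
  def first: Token = tokens.head
  def rest: Reader[Token] = if (atEnd) this else new TokenReader(tokens.tail)
  def pos: Position = NoPosition
  def atEnd: Boolean = tokens.isEmpty
}

// Stage 2: tokens to a result. Elem is Token, not Char.
object QuoteParser extends Parsers {
  type Elem = Token

  def number: Parser[BigDecimal] = accept("number", { case Num(v) => v })
  def symbol: Parser[String] = accept("symbol", { case Word(s) => s })

  def bidOrAsk: Parser[BidOrAsk] = opt(elem(Dollar)) ~> (
    accept("price pair", { case PricePair(b, a) => BidOrAsk(b, a) }) |
      number ^^ (n => BidOrAsk(n, None))
  )

  def quote: Parser[Quote] =
    symbol ~ bidOrAsk ~ number ^^ { case s ~ p ~ v => Quote(s, p, v) }

  def parse(in: String): Either[String, Quote] =
    QuoteLexer.lex(in).flatMap { ts =>
      phrase(quote)(new TokenReader(ts)) match {
        case Success(q, _)   => Right(q)
        case Failure(msg, _) => Left(msg)
        case Error(msg, _)   => Left(msg)
      }
    }
}
```

Calling `QuoteParser.parse("IBM 33.33/ 1200.00")` and `QuoteParser.parse("IBM 33.33/33.50 1200.00")` exercises both sample lines from the question. Because the "number/number" decision is made in the lexer, the token-level grammar does not need any SPACES productions.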