Why does Parsec's sepBy stop and not parse all elements?


Question

I am trying to parse a comma-separated string whose fields may or may not be image dimensions. For example: "hello world, 300x300, good bye world".

I wrote the following small program:

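{-# LANGUAGE OverloadedStrings #-}  -- lets the string literals in main be Text

import Data.Text (Text)
import Text.Parsec
import qualified Text.Parsec.Text as PS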

parseTestString :: Text -> [Maybe (Int, Int)]
parseTestString s = case parse dimensStringParser "" s of
                      Left _ -> [Nothing]
                      Right dimens -> dimens

dimensStringParser :: PS.Parser [Maybe (Int, Int)]
dimensStringParser = (optionMaybe dimensParser) `sepBy` (char ',')

dimensParser :: PS.Parser (Int, Int)
dimensParser = do
  w <- many1 digit
  char 'x'
  h <- many1 digit
  return (read w, read h)

main :: IO ()
main = do
  print $ parseTestString "300x300,40x40,5x5"
  print $ parseTestString "300x300,hello,5x5,6x6"

According to the optionMaybe documentation, it returns Nothing if the parser fails, so I would expect this output:

[Just (300,300),Just (40,40),Just (5,5)]
[Just (300,300),Nothing,Just (5,5),Just (6,6)]

But I got:

[Just (300,300),Just (40,40),Just (5,5)]
[Just (300,300),Nothing]

That is, parsing stops after the first failure. So I have two questions:

  1. Why does it behave this way?
  2. How do I write a correct parser for this case?

Recommended answer

I'd guess that optionMaybe dimensParser, when fed the input "hello,...", tries dimensParser. That fails without consuming anything, so optionMaybe succeeds with Nothing and consumes no input.

The last part is the crucial one: after Nothing is returned, the input still to be parsed is "hello,...".
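One way to check this claim is to ask Parsec what input is left right after optionMaybe has run. The sketch below reuses dimensParser from the question; the helper name probe is made up for illustration, and getInput is the Parsec primitive that returns the unconsumed input.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import Text.Parsec
import qualified Text.Parsec.Text as PS

-- Same parser as in the question.
dimensParser :: PS.Parser (Int, Int)
dimensParser = do
  w <- many1 digit
  _ <- char 'x'
  h <- many1 digit
  return (read w, read h)

-- Hypothetical helper: run optionMaybe dimensParser, then return its
-- result together with the input that is still unconsumed.
probe :: Text -> Either ParseError (Maybe (Int, Int), Text)
probe = parse ((,) <$> optionMaybe dimensParser <*> getInput) ""

main :: IO ()
main = do
  print (probe "hello,5x5")  -- Right (Nothing,"hello,5x5")
  print (probe "5x5,hello")  -- Right (Just (5,5),",hello")
```

In the first case the result is Nothing and the remaining input is unchanged, so the next thing any separator parser sees is 'h', not ','.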

At that point sepBy tries to parse char ',', which fails because the next character is 'h'. So it concludes that the list is over and returns the elements collected so far, without consuming any more input.
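To make the stopping rule concrete, here is a minimal sketch that re-implements the sepBy logic locally. The names mySepBy and mySepBy1 are hypothetical; the bodies follow the way Parsec defines sepBy and sepBy1, as far as I know, but treat them as an illustration rather than the library source.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import Text.Parsec
import qualified Text.Parsec.Text as PS

dimensParser :: PS.Parser (Int, Int)
dimensParser = do
  w <- many1 digit
  _ <- char 'x'
  h <- many1 digit
  return (read w, read h)

-- Zero or more p separated by sep.
mySepBy :: PS.Parser a -> PS.Parser sep -> PS.Parser [a]
mySepBy p sep = mySepBy1 p sep <|> return []

-- One p, then as many "sep followed by p" as possible.
-- The loop ends as soon as sep fails without consuming input.
mySepBy1 :: PS.Parser a -> PS.Parser sep -> PS.Parser [a]
mySepBy1 p sep = do
  x  <- p
  xs <- many (sep >> p)
  return (x : xs)

demo :: Text -> Either ParseError [Maybe (Int, Int)]
demo = parse (optionMaybe dimensParser `mySepBy` char ',') ""

main :: IO ()
main = print (demo "300x300,hello,5x5,6x6")
-- Right [Just (300,300),Nothing]
```

After the second element yields Nothing, the loop tries sep on "hello,..." and fails on 'h', so many returns what it has. Since nothing demands end of input, the overall parse still succeeds with a short list.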

If you want to skip other entities, you need a "consuming" parser that returns Nothing, instead of optionMaybe. That parser, however, needs to know how much to consume: in your case, everything up to the next comma.

Perhaps you need something like this (untested):

-- many (noneOf ",") skips the whole field; a bare noneOf "," would skip one character only
(   try (Just <$> dimensParser)
<|> (many (noneOf ",") >> return Nothing))
    `sepBy` char ','
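For reference, here is a fuller sketch of the same idea as a complete program. It goes a little beyond the snippet above, and these parts are my own assumptions rather than something from the answer: the fallback uses many (noneOf ",") so that the whole field is skipped, a lookAhead check makes a field like "300x300abc" count as Nothing, spaces around a dimension are tolerated, and eof makes leftover input an error. The names fieldParser, fieldEnd and parseFields are hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import Text.Parsec
import qualified Text.Parsec.Text as PS

dimensParser :: PS.Parser (Int, Int)
dimensParser = do
  w <- many1 digit
  _ <- char 'x'
  h <- many1 digit
  return (read w, read h)

-- A field ends at a comma or at end of input.
fieldEnd :: PS.Parser ()
fieldEnd = (() <$ char ',') <|> eof

-- Either a whole field that is a dimension (optionally padded with
-- whitespace), or any text up to the next comma, which is skipped.
fieldParser :: PS.Parser (Maybe (Int, Int))
fieldParser =
      try (Just <$> (spaces *> dimensParser <* spaces) <* lookAhead fieldEnd)
  <|> (Nothing <$ many (noneOf ","))

-- Returns [] on a parse error; adjust to taste.
parseFields :: Text -> [Maybe (Int, Int)]
parseFields s =
  either (const []) id (parse (fieldParser `sepBy` char ',' <* eof) "" s)

main :: IO ()
main = do
  print (parseFields "300x300,hello,5x5,6x6")
  -- [Just (300,300),Nothing,Just (5,5),Just (6,6)]
  print (parseFields "hello world, 300x300, good bye world")
  -- [Nothing,Just (300,300),Nothing]
```

The try matters: dimensParser may consume digits (or the leading space) before failing, and without try the fallback branch would never be attempted on that field.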
