Why does Parsec's sepBy stop and not parse all elements?
Question
I am trying to parse a comma-separated string which may or may not contain a string with image dimensions. For example "hello world, 300x300, good bye world".
I have written the following little program:
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
import Text.Parsec
import qualified Text.Parsec.Text as PS

parseTestString :: Text -> [Maybe (Int, Int)]
parseTestString s = case parse dimensStringParser "" s of
  Left _ -> [Nothing]
  Right dimens -> dimens

dimensStringParser :: PS.Parser [Maybe (Int, Int)]
dimensStringParser = (optionMaybe dimensParser) `sepBy` (char ',')

dimensParser :: PS.Parser (Int, Int)
dimensParser = do
  w <- many1 digit
  char 'x'
  h <- many1 digit
  return (read w, read h)

main :: IO ()
main = do
  print $ parseTestString "300x300,40x40,5x5"
  print $ parseTestString "300x300,hello,5x5,6x6"
According to the optionMaybe documentation, it returns Nothing if it can't parse, so I would expect to get this output:
[Just (300,300),Just (40,40),Just (5,5)]
[Just (300,300),Nothing,Just (5,5),Just (6,6)]
But I get:
[Just (300,300),Just (40,40),Just (5,5)]
[Just (300,300),Nothing]
I.e. parsing stops after the first failure. So I have two questions:
- Why does it behave this way?
- How do I write a correct parser for this case?
Recommended answer
I'd guess that optionMaybe dimensParser, when fed with input "hello,...", tries dimensParser. That fails without consuming anything, so optionMaybe returns success with Nothing, and consumes no portion of the input.
The last part is the crucial one: after Nothing is returned, the input string still to be parsed is "hello,...".
At that point sepBy tries to parse char ',', which fails because the next character is 'h'. So it deduces that the list is over, and terminates the output list without consuming any more input.
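The behaviour described above can be observed directly. The following is a minimal sketch, assuming the same dimensParser as in the question; probe is a hypothetical helper (not part of Parsec) that runs optionMaybe once and then uses getInput to return whatever input is still unconsumed.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
import Text.Parsec
import qualified Text.Parsec.Text as PS

-- Same parser as in the question.
dimensParser :: PS.Parser (Int, Int)
dimensParser = do
  w <- many1 digit
  _ <- char 'x'
  h <- many1 digit
  return (read w, read h)

-- Hypothetical helper: run optionMaybe once, then report the
-- input that is still left to be parsed.
probe :: Text -> Either ParseError (Maybe (Int, Int), Text)
probe = parse ((,) <$> optionMaybe dimensParser <*> getInput) ""
```

In GHCi, probe "hello,5x5" gives Right (Nothing,"hello,5x5"), while probe "5x5,6x6" gives Right (Just (5,5),",6x6"). In the first case the remaining input still starts with 'h', which is why the separator parser char ',' cannot match next.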
If you want to skip other entities, you need a "consuming" parser that returns Nothing, instead of optionMaybe. That parser, however, needs to know how much to consume: in your case, everything up to the next comma.
Perhaps you need something like this (untested):
-- the fallback has to consume the whole field, not just one character
(   try (Just <$> dimensParser)
<|> (skipMany (noneOf ",") >> return Nothing))
`sepBy` char ','
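For reference, here is a fuller sketch of the same idea as a self-contained module. It assumes that a field which is not exactly a dimension should be skipped up to the next comma and reported as Nothing. The names fieldEnd, fieldParser, fieldsParser and parseFields are made up for this illustration. The lookAhead check is an extra assumption on my part: it makes a field such as "300x300abc" count as a non-dimension instead of stopping the list halfway through the field.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
import Text.Parsec
import qualified Text.Parsec.Text as PS

-- Same parser as in the question.
dimensParser :: PS.Parser (Int, Int)
dimensParser = do
  w <- many1 digit
  _ <- char 'x'
  h <- many1 digit
  return (read w, read h)

-- A field ends at a comma (peeked at, not consumed) or at end of input.
fieldEnd :: PS.Parser ()
fieldEnd = lookAhead (() <$ char ',') <|> eof

-- Either a complete dimension, or anything else up to the next comma.
-- try lets the second branch start over from the beginning of the field.
fieldParser :: PS.Parser (Maybe (Int, Int))
fieldParser =
      try (Just <$> dimensParser <* fieldEnd)
  <|> (Nothing <$ skipMany (noneOf ","))

fieldsParser :: PS.Parser [Maybe (Int, Int)]
fieldsParser = fieldParser `sepBy` char ',' <* eof

parseFields :: Text -> Either ParseError [Maybe (Int, Int)]
parseFields = parse fieldsParser ""
```

With this sketch, parseFields "300x300,hello,5x5,6x6" evaluates to Right [Just (300,300),Nothing,Just (5,5),Just (6,6)]. Whitespace around a field is not stripped, so " 300x300" with a leading space comes back as Nothing; wrapping dimensParser in spaces would be one way to allow for that.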