Haskell Parsec combinator 'many' is applied to a parser that accepts an empty string

Problem description

import Text.ParserCombinators.Parsec

delimiter :: Parser ()
delimiter = do char '|'
               return ()
          <?> "delimiter"


eol :: Parser ()
eol = do oneOf "\n\r"
         return ()
    <?> "end of line"

item :: Parser String
item = do entry <- manyTill anyChar (try eol <|> try delimiter <|> eof)
          return entry

items :: Parser [String]
items = do result <- many item
           return result

When I run parseTest items "a|b|c" with the code above I get the following error:

*** Exception: Text.ParserCombinators.Parsec.Prim.many: 
combinator 'many' is applied to a parser that accepts an empty string.

I believe it has something to do with eof and many item. If I remove eof, I can get it to work as long as the line does not end in eof, which makes it kind of useless.

I realize I could just use sepBy but what I am interested in is why this code does not work and how to make it work.

Recommended answer

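A combinator like many indeed cannot be applied to a parser that accepts the empty string, because this makes the grammar ambiguous: how many times do you parse the empty string? Choosing different numbers can lead to different parse results. In practice, such a parser could keep succeeding without ever consuming input, so Parsec raises this runtime error instead of looping.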

You are right to assume that many item is the problematic combination. An item is defined in terms of manyTill. (Excursion: by the way, you can simplify the definition of item to

item :: Parser String
item = manyTill anyChar (eol <|> delimiter <|> eof)

There is no need for the do or the return, and no need for try, because each of the three parsers expects a different first token.) This manyTill parser thus parses an arbitrary number of characters, followed by either an eol, a delimiter, or an eof. Now, eol and delimiter consume at least one character when they succeed, but eof doesn't. The parser eof succeeds at the end of the input, and it can be applied multiple times. For example,

ghci> parseTest (do { eof; eof }) ""
()

eof doesn't consume any input, which makes it possible for item to succeed on the empty string (at the end of your input), and that is what causes the ambiguity.
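To see this directly, here is a minimal sketch that runs the simplified item on empty input. The name emptyResult is only illustrative, and the terminators are written with `<$` as a shorthand for the question's do blocks.

```haskell
import Text.ParserCombinators.Parsec

delimiter :: Parser ()
delimiter = () <$ char '|' <?> "delimiter"

eol :: Parser ()
eol = () <$ oneOf "\n\r" <?> "end of line"

-- The simplified item from the answer: eof is one of the terminators.
item :: Parser String
item = manyTill anyChar (eol <|> delimiter <|> eof)

-- On empty input, eof matches immediately, so item succeeds with ""
-- without consuming anything. (emptyResult is a hypothetical helper name.)
emptyResult :: Either ParseError String
emptyResult = parse item "" ""
```

In GHCi, parseTest item "" prints "". Since item can succeed like this without consuming input, many item could repeat it indefinitely at the end of the input, which is the situation the error message reports.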

To fix this, you can indeed rewrite your grammar and move to something like sepBy, or you can try to distinguish normal items (where eof isn't allowed as end-marker) from the final item (where eof is allowed).
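Both options could look roughly like the sketch below. The names field, itemsSepBy, normalItem, finalItem and itemsFinal are illustrative and do not appear in the question; this is one possible way to arrange the grammar, not the only one.

```haskell
import Text.ParserCombinators.Parsec

delimiter :: Parser ()
delimiter = () <$ char '|' <?> "delimiter"

eol :: Parser ()
eol = () <$ oneOf "\n\r" <?> "end of line"

-- Option 1: sepBy. Every repetition after the first must consume a
-- separator, so it does not matter that field itself can match nothing.
field :: Parser String
field = many (noneOf "|\n\r")

itemsSepBy :: Parser [String]
itemsSepBy = field `sepBy` (eol <|> delimiter)

-- Option 2: keep manyTill, but only the final item may end at eof.
-- A normal item always consumes its terminator, so many accepts it.
normalItem :: Parser String
normalItem = manyTill anyChar (eol <|> delimiter)

finalItem :: Parser String
finalItem = manyTill anyChar eof

itemsFinal :: Parser [String]
itemsFinal = do
  xs <- many (try normalItem)  -- try undoes a partial item that runs into eof
  x  <- finalItem
  return (xs ++ [x])
```

With these definitions, parseTest itemsSepBy "a|b|c" and parseTest itemsFinal "a|b|c" both print ["a","b","c"]. The try in itemsFinal is needed here because normalItem consumes characters before it discovers that the last item has no delimiter.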
