How do I make an Attoparsec parser succeed without consuming (like parsec lookAhead)

Views: 153 Published: 2018/6/4 17:07:16 Tags: parsing, haskell, attoparsec

Problem description

I wrote a quick attoparsec parser to walk an aspx file and drop all the style attributes, and it's working fine except for one piece of it where I can't figure out how to make it succeed on matching > without consuming it.

Here's what I have:
    anyTill = manyTill anyChar
    anyBetween start end = start *> anyTill end

    styleWithQuotes = anyBetween (stringCI "style=\"") (stringCI "\"")
    styleWithoutQuotes = anyBetween (stringCI "style=") (stringCI " " <|> ">")
    everythingButStyles = manyTill anyChar (styleWithQuotes <|> styleWithoutQuotes) <|> many1 anyChar
I understand that this is partly because of how I'm using manyTill in everythingButStyles, which is how I actively drop all the style text. In styleWithoutQuotes, though, I need it to match ">" as the end without consuming it. In parsec I would have just used lookAhead ">", but I can't do that in attoparsec.

Solution

Since this question was asked, the lookAhead combinator has been added to attoparsec, so you can now just use lookAhead (char '>') or lookAhead (string ">") to achieve the goal.
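
As a minimal sketch of how lookAhead could slot into the unquoted-style parser from the question: the version below inlines anyBetween and uses char rather than stringCI for the terminators, and valueAndRest is a hypothetical helper added only to show what input is left over. It assumes an attoparsec version that exports lookAhead from Data.Attoparsec.Combinator.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Control.Applicative ((<|>))
import Data.Attoparsec.ByteString.Char8
  (Parser, anyChar, char, parseOnly, stringCI, takeByteString)
import Data.Attoparsec.Combinator (lookAhead, manyTill)
import Data.ByteString (ByteString)

-- Parse an unquoted style attribute. A terminating space is consumed,
-- while a terminating '>' is only peeked at and stays in the input.
styleWithoutQuotes :: Parser String
styleWithoutQuotes =
  stringCI "style=" *> manyTill anyChar (char ' ' <|> lookAhead (char '>'))

-- Hypothetical helper (not in the original post): return the attribute
-- value together with whatever input remains after it.
valueAndRest :: Parser (String, ByteString)
valueAndRest = (,) <$> styleWithoutQuotes <*> takeByteString
```

In GHCi, parseOnly valueAndRest "style=color:red>text" should give Right ("color:red", ">text"), so the '>' is still available to whatever parser runs next, whereas a terminating space is dropped along with the attribute.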

The workaround below dates from before lookAhead was introduced.

You can build your non-consuming parser using peekWord8, which just looks at the next byte (if any). Since ByteString has a Monoid instance, Parser ByteString is a MonadPlus, and you can use
    lookGreater = do
      mbw <- peekWord8
      case mbw of
        Just 62 -> return ">"
        _ -> mzero
(62 is the code point of '>') to either find a '>' without consuming it or fail.
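
The same peeking idea can be written against Char values instead of raw byte codes. The sketch below is an illustration rather than part of the original answer: peekFor and peekThenRest are hypothetical names, and it relies on peekChar from Data.Attoparsec.ByteString.Char8, which returns the next character (if any) without consuming it.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Control.Monad (mzero)
import Data.Attoparsec.ByteString.Char8
  (Parser, parseOnly, peekChar, takeByteString)
import Data.ByteString (ByteString)

-- Hypothetical generalisation of lookGreater: succeed if the next
-- character is c, leaving it in the input; fail on any other
-- character or at end of input.
peekFor :: Char -> Parser Char
peekFor c = do
  mc <- peekChar
  case mc of
    Just x | x == c -> return c
    _               -> mzero

-- Hypothetical helper: peek for '>' and then return all remaining input,
-- to make it visible that nothing was consumed.
peekThenRest :: Parser (Char, ByteString)
peekThenRest = (,) <$> peekFor '>' <*> takeByteString
```

Running parseOnly peekThenRest ">abc" should give Right ('>', ">abc"), and the same parser fails on "abc". Matching on the character avoids having to remember that 62 is '>'.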
