Parsec: how to find "matches" within a string


Problem description

How can I use parsec to parse all matched input in a string and discard the rest?

Example: I have a simple number parser, and I can find all the numbers if I know what separates them:

num :: Parser Int
num = read <$> many1 digit

parse (num `sepBy` space) "" "111 4 22"

But what if I don't know what is between the numbers?

"I will live to be 111 years <b>old</b> if I work out 4 days a week starting at 22."

many anyChar doesn't work as a separator, because it consumes everything.
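To make the failure concrete, here is a minimal sketch of that attempt. It assumes num is written with many1 digit so that it cannot succeed on empty input; separatorAttempt is a hypothetical name used only for this illustration.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- One or more digits, read as an Int.
num :: Parser Int
num = read <$> many1 digit

-- Hypothetical name: the attempt described above, with `many anyChar`
-- used as the separator between numbers.
separatorAttempt :: Either ParseError [Int]
separatorAttempt = parse (num `sepBy` many anyChar) "" "111 abc 22"
```

After 111 is parsed, many anyChar swallows the rest of the input, including 22, so the next num is attempted at end of input and the whole parse returns a Left error.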

So how can I get things that match an arbitrary parser surrounded by things I want to ignore?


EDIT: Note that in the real problem, my parser is more complicated:

optionTag :: Parser Fragment
optionTag = do
    string "<option"
    manyTill anyChar (string "value=")
    n <- many1 digit
    manyTill anyChar (char '>')
    chapterPrefix
    text <- many1 (noneOf "<>")
    return $ Option (read n) text
  where
    chapterPrefix = many digit >> char '.' >> many space

Solution

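For an arbitrary parser myParser, it's quite easy:

solution = many (try (let one = try myParser <|> (anyChar >> one) in one))

It might be clearer to write it this way:

solution = many (try loop)
    where
        loop = try myParser <|> (anyChar >> loop)

Essentially, this defines a recursive parser (called loop) that skips one character at a time until myParser succeeds, so each run of loop returns the next match. many then repeats loop until it fails, which happens when the end of input is reached without another match. The two uses of try matter: the inner one lets loop move on when myParser fails after consuming part of the input, and the outer one lets many stop cleanly when only non-matching text remains after the last match.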
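As a self-contained sketch of this approach applied to the sentence from the question: findAll and example are hypothetical names introduced here, num uses many1 digit so it never succeeds on empty input, and both the loop and the call to many are wrapped in try so that backtracking works when a match fails partway or when non-matching text trails the last match.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- One or more digits, read as an Int.
num :: Parser Int
num = read <$> many1 digit

-- Hypothetical helper: collect every match of p, skipping everything else.
findAll :: Parser a -> Parser [a]
findAll p = many (try loop)
  where
    -- Try p here; if it fails, drop one character and try again.
    loop = try p <|> (anyChar >> loop)

-- The sentence from the question, searched for numbers.
example :: Either ParseError [Int]
example =
  parse (findAll num) ""
    "I will live to be 111 years <b>old</b> if I work out 4 days a week starting at 22."
-- example == Right [111,4,22]
```

The same findAll should work with a larger parser such as optionTag, as long as that parser consumes at least one character whenever it succeeds.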
