Megaparsec:无法解析算术字符串 [英] Megaparsec: Not able to parse arithmetic string

查看:107
本文介绍了Megaparsec:无法解析算术字符串的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

   - 算术表达式
数据Aexp = N Num
| V Var
| Mult Aexp Aexp
|添加Aexp Aexp
| Sub Aexp Aexp
deriving(Show,Eq,Read)


arithParser :: Parser Aexp
arithParser = V< $> strParser
< |> N< $> numParser
< |> Mult< $> arithParser< * tok*< *> arithParser
--boolParser :: Parser Bexp


strParser :: Parser Var
strParser = tok\*> some(noneOf(\\ \\ n \r\= [] {},:))< * tok\

numParser :: Parser Num
numParser =(some(oneOf读取)< *空格



['0'..'9'])>> = return。

如果我运行命令 Parse arithParser5 * 55 * 5,它就会返回 Right(N 5),它应该返回 Mult(N 5)(N 5),因为在arithParser中有优先权,但是如果我改变顺序,那么它似乎进入一个无限循环并崩溃。



不知道我在这里做错了什么,任何帮助将不胜感激。

解决方案

Parsec在尝试正确的替代方法之前尝试< |> 那么它就不会打扰正确的,所以在这个例子中,当输入 5 * 5 时, Parsec的过程如下所示:


  1. 尝试 V <$> strParser strParser tok\开头,但输入字符串不以''' so strParser 失败。

  2. 尝试 N <$ > numParser numParser 成功分析数字5,所以Parsec只返回 N 5

  3. 完成!不需要尝试第三种选择。






因此,我们可以尝试修补此解析器通过将 Mult 选项移动到顶部,并将其包装在 try 中,以便它可以回溯并尝试 numParser strParser 如果输入结果不是乘法。

  arithParser :: Parser Aexp 
arithParser = try(Mult< $> arithParser< * tok*<> arithParser)
< |> N< $> numParser
< |> V< $> strParser

这个解析器有另一个更微妙的问题。


  1. 尝试 try(Mult< $> arithParser< * tok** arithParser)。这个解析器以 arithParser 开头,所以递归调用 arithParser

  2. code> try(Mult< $> arithParser< * tok*<> arithParser)< $ code> ;.这个解析器以 arithParser 开头,所以递归调用 arithParser

  3. code> try(Mult< $> arithParser< * tok*<> arithParser)< $ code> ;.这个解析器以 arithParser 开始,所以递归调用 arithParser

  4. .. 。

这是一个无限循环。 Parsec无法处理左递归语法。您必须设计解析器,以便在递归调用之前至少使用一个令牌。这样做的一种常见方式是平掉语法:

  expr,term :: Parser AExp 
expr = do
n< - term
rest< - 可选$ tok**> expr
return $ maybe n(Mult n)rest
term = N< $> numParser
< |> V< $> strParser
< |> ('')(char')')


$ b $括号内的表达式

括号内= b

在这里,我将解析器分解为一个解析器,它可以选择性地解析任意 expr - a term 乘以一个乘法符号和一个被乘数 expr - 以及解析单个 term s - 数字,字符串和括号表达式。对 expr 的递归调用现在可以了 - expr 中的一个只在您解析了 term (总是消耗输入),而 term 中的一个只在你解析了一个左括号后才会发生。



请注意, expr 具有类似列表的结构:它解析可能跟随许多事情的单个事物。一般而言,您应该考虑使用线性输入流标记的解析器,因此列表形式的解析器往往比树形解析器更有效并不奇怪。



文字。 Megaparsec.Expr 模块包含封装这个模式和解析具有任意优先级和固定规则表达式的函数。

  expr = makeExprParser term [[InfixR $ tok*$> Mult]] 


I'm working on a small parser using Megaparsec and trying to parse arithmetic.

-- Arithmetic expressions
data Aexp = N Num 
            | V Var 
            | Mult Aexp Aexp
            | Add Aexp Aexp 
            | Sub Aexp Aexp 
             deriving (Show, Eq, Read)


arithParser :: Parser Aexp
arithParser = V <$> strParser
            <|> N <$> numParser
            <|> Mult <$> arithParser <* tok "*" <*> arithParser
--boolParser :: Parser Bexp


strParser :: Parser Var
strParser = tok "\"" *> some (noneOf ("\n\r\"=[]{},:")) <* tok "\""

numParser :: Parser Num
numParser = (some (oneOf ['0' .. '9']) >>= return . read) <* whitespace

If I run the command Parse arithParser "5*5" "5*5" it just returns Right (N 5), where it should return Mult(N 5) (N 5). Because of the precedence in the arithParser. But if I change the order then it seems to go into an infinite loop and crash.

Not sure what I'm doing wrong here, any help would be appreciated.

解决方案

Parsec tries the left alternative of <|> before it tries the right one. If the left alternative succeeds then it won't bother with the right one. So in this instance, when fed the input 5*5, Parsec's process looks like this:

  1. Try V <$> strParser. strParser begins with tok "\"", but the input string doesn't begin with '"' so strParser fails.
  2. Try N <$> numParser. numParser successfully parses the number 5, so Parsec just returns N 5.
  3. Done! No need to try the third alternative.


So we can attempt to patch this parser up by moving the Mult option up to the top, wrapped in a try so that it can backtrack and try numParser or strParser if the input turns out not to be a multiplication.

arithParser :: Parser Aexp
arithParser = try (Mult <$> arithParser <* tok "*" <*> arithParser)
            <|> N <$> numParser
            <|> V <$> strParser

This parser has another, more subtle problem. Let's walk through the steps, as above.

  1. Try try (Mult <$> arithParser <* tok "*" <*> arithParser). This parser begins with arithParser, so recursively call arithParser.
  2. Try try (Mult <$> arithParser <* tok "*" <*> arithParser). This parser begins with arithParser, so recursively call arithParser.
  3. Try try (Mult <$> arithParser <* tok "*" <*> arithParser). This parser begins with arithParser, so recursively call arithParser.
  4. ...

It's an infinite loop. Parsec can't handle left-recursive grammars. You have to design your parser so that it consumes at least one token before a recursive call. One common way of doing this is to "flatten out" your grammar:

expr, term :: Parser AExp
expr = do
    n <- term
    rest <- optional $ tok "*" *> expr
    return $ maybe n (Mult n) rest
term = N <$> numParser
    <|> V <$> strParser
    <|> parenthesised expr

parenthesised = between (char '(') (char ')')

Here I've split up the parser into one which parses an arbitrary expr - a term optionally followed by a multiplication symbol and a multiplicand expr - and one which parses single terms - numbers, strings, and parenthesised expressions. The recursive calls to expr are OK now - the one inside expr happens only after you've parsed a term (which always consumes input) and the one inside term happens only after you've parsed an opening parenthesis.

Note that expr has a list-like structure: it parses a single thing possibly followed by many things. In general you should think of parsers consuming a linear input stream of input tokens, so it's not surprising that list-shaped parsers tend to be more effective than tree-shaped ones.

The Text.Megaparsec.Expr module contains functions which package up this pattern and parse expressions with arbitrary precedence and fixity rules.

expr = makeExprParser term [[InfixR $ tok "*" $> Mult]]

这篇关于Megaparsec:无法解析算术字符串的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆