Full parser examples with parsec?


Problem description

I'm trying to make a parser for a simple functional language, a bit like Caml, but I seem to be stuck with the simplest things.

So I'd like to know if there are some more complete examples of parsec parsers, something that goes beyond "this is how you parse 2 + 3". Especially function calls in terms and suchlike.

And I've read "Write Yourself a Scheme in 48 Hours", but the syntax of Scheme is quite simple, so it doesn't really help with learning this.

My biggest problem is how to use try, <|> and choice properly, because I really don't get why parsec never seems to parse a(6) as a function call using this parser:

expr = choice [number, call, ident]

number = liftM Number float <?> "Number"

ident = liftM Identifier identifier <?> "Identifier"

call = do
    name <- identifier
    args <- parens $ commaSep expr
    return $ FuncCall name args
    <?> "Function call"

EDIT: Added some code for completeness, though this is not actually what I asked about:

AST.hs

module AST where

data AST
    = Number Double
    | Identifier String
    | Operation BinOp AST AST
    | FuncCall String [AST]
    deriving (Show, Eq)

data BinOp = Plus | Minus | Mul | Div
    deriving (Show, Eq, Enum)

Lexer.hs

module Lexer (
            identifier, reserved, operator, reservedOp, charLiteral, stringLiteral,
            natural, integer, float, naturalOrFloat, decimal, hexadecimal, octal,
            symbol, lexeme, whiteSpace, parens, braces, angles, brackets, semi,
            comma, colon, dot, semiSep, semiSep1, commaSep, commaSep1
    ) where

import Text.Parsec
import qualified Text.Parsec.Token as P
import Text.Parsec.Language (haskellStyle)

lexer = P.makeTokenParser haskellStyle

identifier = P.identifier lexer
reserved = P.reserved lexer
operator = P.operator lexer
reservedOp = P.reservedOp lexer
charLiteral = P.charLiteral lexer
stringLiteral = P.stringLiteral lexer
natural = P.natural lexer
integer = P.integer lexer
float = P.float lexer
naturalOrFloat = P.naturalOrFloat lexer
decimal = P.decimal lexer
hexadecimal = P.hexadecimal lexer
octal = P.octal lexer
symbol = P.symbol lexer
lexeme = P.lexeme lexer
whiteSpace = P.whiteSpace lexer
parens = P.parens lexer
braces = P.braces lexer
angles = P.angles lexer
brackets = P.brackets lexer
semi = P.semi lexer
comma = P.comma lexer
colon = P.colon lexer
dot = P.dot lexer
semiSep = P.semiSep lexer
semiSep1 = P.semiSep1 lexer
commaSep = P.commaSep lexer
commaSep1 = P.commaSep1 lexer

Parser.hs

module Parser where

import Control.Monad (liftM)
import Text.Parsec
import Text.Parsec.String (Parser)
import Lexer
import AST

expr = number <|> callOrIdent

number = liftM Number float <?> "Number"

callOrIdent = do
    name <- identifier
    liftM (FuncCall name) (parens $ commaSep expr) <|> return (Identifier name)

Solution

Hmm,

*Expr> parse expr "" "a(6)"
Right (FuncCall "a" [Number 6.0])

that part works for me after filling out the missing pieces.

Edit: I filled out the missing pieces by writing my own float parser, which could parse integer literals. The float parser from Text.Parsec.Token, on the other hand, only parses literals with a fraction part or an exponent, so it fails to parse the "6".
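The answer does not show its own float parser, so the following is only a minimal sketch of one way to get that behaviour, assuming it is acceptable to treat integer literals as doubles. It builds on naturalOrFloat from Text.Parsec.Token, which returns Left for an integer literal and Right for a floating-point one. The name number here is hypothetical and stands in for whatever replaces Lexer.float.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)
import qualified Text.Parsec.Token as P
import Text.Parsec.Language (haskellStyle)

lexer :: P.TokenParser ()
lexer = P.makeTokenParser haskellStyle

-- Hypothetical replacement for Lexer.float (not the answerer's code):
-- accepts "6" as well as "6.5" by mapping both results of
-- naturalOrFloat to a Double.
number :: Parser Double
number = either fromInteger id <$> P.naturalOrFloat lexer

main :: IO ()
main = do
    print (parse number "" "6")    -- Right 6.0
    print (parse number "" "6.5")  -- Right 6.5
    -- The stock float parser rejects a bare integer literal:
    print (either (const "float failed") show (parse (P.float lexer) "" "6"))
```

With a parser like this plugged into the question's number, the argument in a(6) is accepted as Number 6.0.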

However,

*Expr> parse expr "" "variable"
Left (line 1, column 9):
unexpected end of input
expecting "("

When call fails after having parsed an identifier, that part of the input has already been consumed, so <|> (and therefore choice) never tries ident, and the overall parse fails. You can either a) wrap call in try in the choice list of expr, so that call fails without consuming input, or b) write a parser callOrIdent to use in expr, e.g.

callOrIdent = do
    name <- identifier
    liftM (FuncCall name) (parens $ commaSep expr) <|> return (Identifier name)

which avoids try and thus may perform better.
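For comparison, option a) could look roughly like the sketch below. It is self-contained rather than split across AST.hs and Lexer.hs, it drops the Operation constructor for brevity, and it assumes an integer-tolerant number parser built on naturalOrFloat, which is my choice and not something the question specifies.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)
import qualified Text.Parsec.Token as P
import Text.Parsec.Language (haskellStyle)

-- Trimmed-down version of the question's AST (no Operation).
data AST
    = Number Double
    | Identifier String
    | FuncCall String [AST]
    deriving (Show, Eq)

lexer :: P.TokenParser ()
lexer = P.makeTokenParser haskellStyle

-- Assumption: integer literals such as "6" should count as numbers.
number :: Parser AST
number = Number . either fromInteger id <$> P.naturalOrFloat lexer <?> "Number"

ident :: Parser AST
ident = Identifier <$> P.identifier lexer <?> "Identifier"

call :: Parser AST
call = (do
    name <- P.identifier lexer
    args <- P.parens lexer (P.commaSep lexer expr)
    return (FuncCall name args))
    <?> "Function call"

-- try rewinds the input when call fails after reading the identifier,
-- so the next alternative, ident, still gets a chance.
expr :: Parser AST
expr = choice [number, try call, ident]

main :: IO ()
main = do
    print (parse expr "" "a(6)")      -- Right (FuncCall "a" [Number 6.0])
    print (parse expr "" "variable")  -- Right (Identifier "variable")
```

The cost of this version is that a plain variable is scanned twice, once inside try call and once by ident, which is the overhead callOrIdent avoids.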
