Haskell Parsec - error messages are less helpful while using custom tokens


Problem description

I'm working on separating the lexing and parsing stages of a parser. After some tests, I realized that error messages are less helpful when I use tokens other than Parsec's Char tokens.

Here are some examples of Parsec's error messages while using Char tokens:

ghci> P.parseTest (string "asdf" >> spaces >> string "ok") "asdf  wrong"
parse error at (line 1, column 7):
unexpected "w"
expecting space or "ok"


ghci> P.parseTest (choice [string "ok", string "nop"]) "wrong"
parse error at (line 1, column 1):
unexpected "w"
expecting "ok" or "nop"

So the string parser shows which string was expected when it finds unexpected input, and the choice parser shows what the alternatives are.
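
The labels in these messages can also be inspected programmatically. The following is a minimal sketch, assuming Parsec 3's Text.Parsec.Error API; expectedLabels is a hypothetical helper name, not something from the original code. It collects the Expect messages of a failed parse, which is where the "expecting ..." part of the output comes from.

```haskell
import Text.Parsec
import Text.Parsec.Error (Message (..), errorMessages, messageString)
import Text.Parsec.String (Parser)

-- Hypothetical helper: return the "expecting ..." labels of a failed parse,
-- or an empty list if the parse succeeds.
expectedLabels :: Parser a -> String -> [String]
expectedLabels p input =
  case parse p "" input of
    Right _  -> []
    -- Keep only the Expect messages; these are set by <?> and by
    -- primitives such as string that attach their own label.
    Left err -> [messageString m | m@(Expect _) <- errorMessages err]
```

In ghci, expectedLabels (choice [string "ok", string "nop"]) "wrong" should list the two quoted strings, matching the "expecting" line printed by P.parseTest above.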

But when I use the same combinators with my own tokens:

ghci> Parser.parseTest ((tok $ Ide "asdf") >> (tok $ Ide "ok")) "asdf  "
parse error at "test" (line 1, column 1):
unexpected end of input

In this case, it doesn't print what was expected.

ghci> Parser.parseTest (choice [tok $ Ide "ok", tok $ Ide "nop"]) "asdf  "
parse error at (line 1, column 1):
unexpected (Ide "asdf","test" (line 1, column 1))

And when I use choice, it doesn't print the alternatives.

I expected this behavior to depend on the combinator functions rather than on the tokens, but it seems I'm wrong. How can I fix this?

Here's the full lexer + parser code:

Lexer:

module Lexer
    ( Token(..)
    , TokenPos(..)
    , tokenize
    ) where

import Text.ParserCombinators.Parsec hiding (token, tokens)
import Control.Applicative ((<*), (*>), (<$>), (<*>))

data Token = Ide String
           | Number String
           | Bool String
           | LBrack
           | RBrack
           | LBrace
           | RBrace
           | Keyword String
    deriving (Show, Eq)

type TokenPos = (Token, SourcePos)

ide :: Parser TokenPos
ide = do
    pos <- getPosition
    fc  <- oneOf firstChar
    r   <- optionMaybe (many $ oneOf rest)
    spaces
    return $ flip (,) pos $ case r of
                 Nothing -> Ide [fc]
                 Just s  -> Ide $ [fc] ++ s
  where firstChar = ['A'..'Z'] ++ ['a'..'z'] ++ "_"
        rest      = firstChar ++ ['0'..'9']

parsePos p = (,) <$> p <*> getPosition

lbrack = parsePos $ char '[' >> return LBrack
rbrack = parsePos $ char ']' >> return RBrack
lbrace = parsePos $ char '{' >> return LBrace
rbrace = parsePos $ char '}' >> return RBrace


token = choice
    [ ide
    , lbrack
    , rbrack
    , lbrace
    , rbrace
    ]

tokens = spaces *> many (token <* spaces)

tokenize :: SourceName -> String -> Either ParseError [TokenPos]
tokenize = runParser tokens ()

Parser:

module Parser where

import Text.Parsec as P
import Control.Monad.Identity
import Lexer

parseTest  :: Show a => Parsec [TokenPos] () a -> String -> IO ()
parseTest p s =
    case tokenize "test" s of
        Left e -> putStrLn $ show e
        Right ts' -> P.parseTest p ts'

tok :: Token -> ParsecT [TokenPos] () Identity Token
tok t = token show snd test
  where test (t', _) = case t == t' of
                           False -> Nothing
                           True  -> Just t

SOLUTION:

Ok, after fp4me's answer and reading Parsec's Char source more carefully, I ended up with this:

{-# LANGUAGE FlexibleContexts #-}
module Parser where

import Text.Parsec as P
import Control.Monad.Identity
import Lexer

parseTest  :: Show a => Parsec [TokenPos] () a -> String -> IO ()
parseTest p s =
    case tokenize "test" s of
        Left e    -> putStrLn $ show e
        Right ts' -> P.parseTest p ts'


type Parser a = Parsec [TokenPos] () a

advance :: SourcePos -> t -> [TokenPos] -> SourcePos
advance _ _ ((_, pos) : _) = pos
advance pos _ [] = pos

satisfy :: (TokenPos -> Bool) -> Parser Token
satisfy f = tokenPrim show
                      advance
                      (\c -> if f c then Just (fst c) else Nothing)

tok :: Token -> ParsecT [TokenPos] () Identity Token
tok t = (Parser.satisfy $ (== t) . fst) <?> show t

Now I'm getting the same kind of error messages:

ghci> Parser.parseTest (choice [tok $ Ide "ok", tok $ Ide "nop"]) " asdf"
parse error at (line 1, column 1):
unexpected (Ide "asdf","test" (line 1, column 3))
expecting Ide "ok" or Ide "nop"
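
Why the <?> label makes the difference can be seen in isolation. The sketch below is a simplified illustration, not the code above: it assumes a cut-down Tok type without source positions, and rawTok, labelledTok and errorLines are hypothetical names. A token parser built directly on tokenPrim reports only what it found; wrapping it in <?> adds the "expecting" entry that choice then merges across alternatives.

```haskell
import Text.Parsec

-- Simplified token type (assumption: no source positions, for brevity).
data Tok = Ide String | LBrack | RBrack
  deriving (Show, Eq)

type TokParser a = Parsec [Tok] () a

-- Token parser without a label: on failure it can only say what it saw.
rawTok :: Tok -> TokParser Tok
rawTok t = tokenPrim show keepPos match
  where
    keepPos pos _ _ = pos                      -- positions are ignored here
    match t' = if t == t' then Just t' else Nothing

-- The same parser with a label, so failures also say what was expected.
labelledTok :: Tok -> TokParser Tok
labelledTok t = rawTok t <?> show t

-- Hypothetical helper: the rendered error without its position line.
errorLines :: TokParser a -> [Tok] -> [String]
errorLines p ts =
  case parse p "" ts of
    Right _  -> []
    Left err -> drop 1 (lines (show err))
```

With the input [Ide "asdf"], a choice over rawTok parsers should yield only an "unexpected" line, while the same choice over labelledTok parsers should also yield an "expecting" line naming both alternatives.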

Solution

A starting point is to define your own choice function in the Parser module, use a dedicated function built on unexpected to override the "unexpected" message, and finally use the <?> operator to override the "expecting" message:

mychoice [] = mzero
mychoice [x] = (tok x <|> myUnexpected) <?> show x
mychoice (x:xs) = ((tok x <|> mychoice xs) <|> myUnexpected) <?> show (x:xs)

myUnexpected = do
    input <- getInput
    unexpected (first input)
  where
    first []      = "eof"
    first (x : _) = show (fst x)

Then call your parser like this:

ghci> Parser.parseTest (mychoice [Ide "ok", Ide "nop"]) "asdf  "
parse error at (line 1, column 1):
unexpected Ide "asdf"
expecting [Ide "ok",Ide "nop"]
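
A possible alternative for the "unexpected" part, offered as a sketch rather than as the approach taken in either post: the first argument of tokenPrim is the function used to render the offending token, so passing show . fst instead of show should print the token without its position tuple. The types mirror the question's Token and TokenPos in reduced form; renderError and sampleTokens are hypothetical names, and the positions are built by hand with newPos instead of coming from the lexer.

```haskell
import Text.Parsec
import Text.Parsec.Pos (newPos)

-- Reduced versions of the question's types (assumption: fewer constructors).
data Token = Ide String | LBrack | RBrack
  deriving (Show, Eq)

type TokenPos = (Token, SourcePos)
type TokParser a = Parsec [TokenPos] () a

-- Move to the position of the next token, if there is one.
advance :: SourcePos -> t -> [TokenPos] -> SourcePos
advance _ _ ((_, pos) : _) = pos
advance pos _ []           = pos

-- Render only the token (show . fst) in "unexpected" messages,
-- and label the parser so "expecting" messages are produced too.
tok :: Token -> TokParser Token
tok t = tokenPrim (show . fst) advance match <?> show t
  where
    match (t', _) = if t == t' then Just t' else Nothing

-- Hypothetical helper: the rendered parse error, or a note on success.
renderError :: TokParser a -> [TokenPos] -> String
renderError p ts = either show (const "no error") (parse p "test" ts)

-- Hand-built input standing in for the lexer's output.
sampleTokens :: [TokenPos]
sampleTokens = [(Ide "asdf", newPos "test" 1 1)]
```

With this variant the standard choice combinator can be kept; renderError (choice [tok (Ide "ok"), tok (Ide "nop")]) sampleTokens should report the unexpected token as Ide "asdf" alongside both expected alternatives.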
