在Haskell中从头开始编写解析器 [英] Writing a parser from scratch in Haskell

查看:74
本文介绍了在Haskell中从头开始编写解析器的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述



我发现:


$ b $有没有很好的教程,可以从头开始为Haskell中的给定语法编写解析器? b


  1. 解析表达式和语句(HaskellWiki) 解析一个简单的命令式语言(HaskellWiki )


  2. 使用parsec(真实世界Haskell)


但所有这些都使用parsec库,并且虽然这对工业应用程序可能很有意思,但我特别寻找不使用复杂库的'自下而上'的例子。



我发现了唯一一个使用'基本'Haskell是这样的:用Haskell解析
使用一些非常外语的语法(很难区分程序的什么部分或什么是'伪代码')和没有明确的语法定义。



任何建议都非常感谢!

从零开始构建Parsec实际上非常容易。实际的库代码本身是大量泛化和优化的,它扭曲了核心抽象,但如果你只是从头开始构建一些东西来了解更多关于发生的事情,你可以用几行代码来编写它。我将在下面构建一个稍微弱一点的 Applicative 解析器。



实际上,我们想要生成一个 Applicative Parser 以及一个原始解析器值

  satisfy ::(Char  - > Bool) - >解析器Char 

和一些组合器,如 try ,如果失败则解除解析器

  try :: Parser a  - >解析器a 

orElse 这可以让我们如果第一个解析器失败,则继续使用第二个解析器。通常这实际上是用中缀组合器(< |>)

  orElse,(< |>):: Parser a  - >解析器a  - >解析器a 

由于我们的 Applicative 需要保留跟踪当前流状态并能够失败,我们将通过组合状态 Applicative Either 应用。

  type Error = String 
newtype解析器a = P {unP :: String - > (字符串,或者错误a)}

实例Functor解析器其中
fmap f(P st)= P $ \stream - >案例st
(res,Left err) - > (res,Left err)
(res,Right a) - > (res,Right(fa))

实例适用解析器其中
pure a = P(\ stream - >(stream,Right a))
P ff< * GT; P xx = P $ \stream0 - >产生f
(stream1,Left err) - >的情况。 (stream1,Left err)
(stream1,Right f) - > case xx stream1 of - 产生一个x
(stream2,Left err) - > (stream2,Left err)
(stream2,Right x) - > (stream2,Right(fx)) - return(fx)

如果我们遵循<$ c在 Applicative 实例中的$ c>(< *>)方法中,我们仔细地看到它只是将流传入 f --producing Parser ,接受结果流,如果成功,将它传递给 x -producing Parser ,如果它们都成功,它将返回它们的应用程序(fx)。这意味着如果我们有函数生成分析器和参数生成分析器,我们可以用(<>)

   - 给出
parseChar :: Char - >解析器Char
$ b parseHi :: Parser(Char,Char) - 解析'h'然后解析'i'
parseHi = pure(,)< $> parseChar'h'*> parseChar'i'

我们可以使用 Applicative 来构建所需的组合器。这是满足

   -  |查看下一个字符并成功返回,如果它满足谓词
满足::(Char - > Bool) - >解析器Char
满足f = P $ \stream - >
[]的案例流 - >> ([],左边流)
(c:cs)| f c - > (cs,Right c)
|否则 - > (cs,Left不满足)

这里是 try

   -  |运行解析器,但如果失败则将流恢复为原始状态
try :: Parser a - >解析器a
try(P f)= P $ \stream0 - > case $ stream $
(_,Left err) - > (stream0,Left err)
(stream1,Right a) - > (stream1,Right a)

以下是 orElse

  orElse :: Parser a  - >解析器a  - >解析器a 
orElse(P f1)(P f2)= P $ \stream0 - >
(stream1,Left err)的情况f1 stream0 - > f2 stream1
(stream1,Right a) - > (stream1,Right a)

通常在这一点上我们还注意到 Parser如果我们还提供了一个立即失败的解析器,就会用或者形成一个替代 c $ c> empty

  instance Alternative Parser其中
empty = P $ \流 - > (stream,Leftempty)
(< |>)= orElse

many = manyParser
some = someParser

我们可以写 manyParser someParser 作为重复运行解析器的组合器。

   -  | 0或更多
manyParser :: Parser a - > Parser [a]
manyParser(P f)= P go其中
go stream = case f stream of
(_,Left err) - > (stream,Right []) - 抛出错误
(stream',Right a) - >案例转到'
(streamFin,Left err) - > (streamFin,Left err)
(streamFin,Right as) - > (streamFin,Right(a:as))

- | 1或更多
someParser :: Parser a - >解析器[a]
someParser(P f)= P $ \ stream - >案例f
(stream',Left err) - > (stream',Left err)
(stream',Right a) - >
let(P fmany)= manyParser(P f)

(stream'',Left err) - > (stream'',Left err)
(stream'',Right as) - > (stream'',Right(a:as))

从这里我们可以开始工作更高级别的抽象。

  char :: Char  - >解析器char 
char c =满足(== c)

string :: String - >解析器字符串
string [] = pure []
string(c:cs)=(:)< $> char c *字符串cs

oneOf :: [Char] - >解析器Char
oneOf cs =满足(`elem` cs)

parens ::解析器a - >解析器a
parens parseA = dropFirstAndLast< $> char'('*'parseA *'char')'
其中
dropFirstAndLast _ a _ = a


Is there any good tutorial for writing a parser for a given grammar in Haskell from scratch?

I found:

  1. parsing expressions and statements (HaskellWiki)

  2. Parsing a simple imperative language (HaskellWiki)

  3. Using parsec (Real World Haskell)

but all of these use the parsec library, and while that may be interesting for industrial applications I'm specifically looking for examples that are 'bottom up' without the use of sophisticated libraries.

The only one I found which uses 'basic' Haskell is this: Parsing with Haskell which uses some very foreign syntax (it's very hard to distinguish whats part of the program or whats only 'pseudocode') and there's no explicit grammar definition.

Any suggestions are highly appreciated!

解决方案

It's actually surprisingly easy to build Parsec-from-scratch. The actual library code itself is heavily generalized and optimized which contorts the core abstraction, but if you're just building things from scratch to understand more about what's going on you can write it in just a few lines of code. I'll build a slightly weaker Applicative parser below.

Essentially, we want to produce a Applicative, Parser along with a primitive parser value

satisfy :: (Char -> Bool) -> Parser Char

and a few combinators such as try, which "undoes" a parser if it fails

try :: Parser a -> Parser a

and orElse which allows us to continue with a second parser if the first parser fails. Usually this is actually written with the infix combinator (<|>)

orElse, (<|>) :: Parser a -> Parser a -> Parser a

Since our Applicative needs to keep track of the current stream state and be able to fail, we'll build it by combining the state Applicative and the Either applicative.

type Error = String
newtype Parser a = P { unP :: String -> (String, Either Error a) }

instance Functor Parser where
  fmap f (P st) = P $ \stream -> case st stream of
    (res, Left err) -> (res, Left err)
    (res, Right a ) -> (res, Right (f a))

instance Applicative Parser where
  pure a = P (\stream -> (stream, Right a))
  P ff <*> P xx = P $ \stream0 -> case ff stream0 of   -- produce an f
    (stream1, Left err) -> (stream1, Left err)
    (stream1, Right f ) -> case xx stream1 of          -- produce an x
      (stream2, Left err) -> (stream2, Left err)
      (stream2, Right x ) -> (stream2, Right (f x))    -- return (f x)

If we follow the (<*>) method in the Applicative instance carefully we see that it just passes the stream into the f-producing Parser, takes the result stream and, if successful, passes it to the x-producing Parser and if they both succeed it returns their application (f x). This means that if we have a function-producing parser and an argument-producing parser we can sequence them with (<*>)

-- given
parseChar :: Char -> Parser Char

parseHi   :: Parser (Char, Char) -- parses 'h' then 'i'
parseHi = pure (,) <$> parseChar 'h' <*> parseChar 'i'

We can use the mechanics of this Applicative to build the needed combinators as well. Here's satisfy

-- | Peek at the next character and return successfully if it satisfies a predicate
satisfy :: (Char -> Bool) -> Parser Char
satisfy f = P $ \stream -> case stream of
  []                 -> ([], Left "end of stream")
  (c:cs) | f c       -> (cs, Right c)
         | otherwise -> (cs, Left "did not satisfy")

And here's try

-- | Run a parser but if it fails revert the stream to it's original state
try :: Parser a -> Parser a
try (P f) = P $ \stream0 -> case f stream0 of
  (_      , Left err) -> (stream0, Left err)
  (stream1, Right a ) -> (stream1, Right a )

And here's orElse

orElse :: Parser a -> Parser a -> Parser a
orElse (P f1) (P f2) = P $ \stream0 -> case f1 stream0 of
  (stream1, Left err) -> f2 stream1
  (stream1, Right a ) -> (stream1, Right a)

Typically at this point we also note that Parser forms an Alternative instance with orElse if we also provide an immediately failing parser, empty

instance Alternative Parser where
  empty = P $ \stream -> (stream, Left "empty")
  (<|>) = orElse

  many = manyParser
  some = someParser

And we can write manyParser and someParser as combinators which run a parser repeatedly.

-- | 0 or more
manyParser :: Parser a -> Parser [a]
manyParser (P f) = P go where
  go stream = case f stream of
    (_      , Left err) -> (stream, Right [])  -- throws away the error
    (stream', Right a ) -> case go stream' of 
      (streamFin, Left err) -> (streamFin, Left err)
      (streamFin, Right as) -> (streamFin, Right (a : as))

-- | 1 or more
someParser :: Parser a -> Parser [a]
someParser (P f) = P $ \stream -> case f stream of
  (stream', Left err) -> (stream', Left err)
  (stream', Right a ) -> 
    let (P fmany) = manyParser (P f)
    in case fmany stream' of
      (stream'', Left err) -> (stream'', Left err)
      (stream'', Right as) -> (stream'', Right (a:as))

And from here we can begin to work at much higher levels of abstraction.

char :: Char -> Parser Char
char c = satisfy (== c)

string :: String -> Parser String
string []     = pure []
string (c:cs) = (:) <$> char c <*> string cs

oneOf :: [Char] -> Parser Char
oneOf cs = satisfy (`elem` cs)

parens :: Parser a -> Parser a
parens parseA = dropFirstAndLast <$> char '(' <*> parseA <*> char ')'
  where
    dropFirstAndLast _ a _ = a

这篇关于在Haskell中从头开始编写解析器的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆