Haskell的parsec中CPS与非CPS解析器的堆使用情况 [英] heap usage for CPS vs non-CPS parsers in Haskell's parsec

查看:131
本文介绍了Haskell的parsec中CPS与非CPS解析器的堆使用情况的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试使用 parsec 编写以下解析器:

I'm trying to write the following parser using parsec:

manyLength
  :: forall s u m a.
     Monad m
  => ParsecT s u m a -> ParsecT s u m Int
manyLength p = go 0
  where
    go :: Int -> ParsecT s u m Int
    go !i = (p *> go (i + 1)) <|> pure i

这就像 many 函数,但是它没有返回[a] 返回Parser a成功的次数.

This is like the many function, but instead of returning [a], it returns the number of times Parser a succeeds.

这可行,但是我似乎无法使其在恒定的堆空间中运行.这使得 在某种意义上说,因为对go的递归调用不在尾部调用位置.

This works, but I can't seem to make it run in constant heap space. This makes sense, since the recursive call to go is not in the tail-call position.

如果parsec将构造函数导出到 manyAccum 功能:

If parsec would export the constructor to ParsecT, it would be possible to rewrite manyLength in CPS'ed form. This is very similar to the manyAccum function:

manyLengthCPS :: forall s u m a. ParsecT s u m a -> ParsecT s u m Int
manyLengthCPS p = ParsecT f
  where
    f
      :: forall b.
         State s u
      -> (Int -> State s u -> ParseError -> m b) -- consumed ok
      -> (ParseError -> m b)                     -- consumed err
      -> (Int -> State s u -> ParseError -> m b) -- empty ok
      -> (ParseError -> m b)                     -- empty err
      -> m b
    f s cok cerr eok _ =
      let walk :: Int -> a -> State s u -> ParseError -> m b
          walk !i _ s' _ =
            unParser p s'
              (walk $ i + 1)            -- consumed-ok
              cerr                      -- consumed-err
              manyLengthCPSErr          -- empty-ok
              (\e -> cok (i + 1) s' e)  -- empty-err
      in unParser p s (walk 0) cerr manyLengthCPSErr (\e -> eok 0 s e)
    {-# INLINE f #-}

manyLengthCPSErr :: Monad m => m a
manyLengthCPSErr =
  fail "manyLengthCPS can't be used on parser that accepts empty input"

manyLengthCPS函数确实在恒定的堆空间中运行.

This manyLengthCPS function does run in constant heap space.

这里是ParsecT构造函数,仅出于完整性考虑:

Here is the ParsecT constructor just for completeness:

newtype ParsecT s u m a = ParsecT
  { unParser
      :: forall b .
         State s u
      -> (a -> State s u -> ParseError -> m b) -- consumed ok
      -> (ParseError -> m b)                   -- consumed err
      -> (a -> State s u -> ParseError -> m b) -- empty ok
      -> (ParseError -> m b)                   -- empty err
      -> m b
  }

我还尝试使用manyLengthCPS直接将其转换为非CPS功能 低级 功能:

I also tried to turn manyLengthCPS directly into a non-CPS'ed function using the low-level mkPT function:

manyLengthLowLevel
  :: forall s u m a.
     Monad m
  => ParsecT s u m a -> ParsecT s u m Int
manyLengthLowLevel p = mkPT f
  where
    f :: State s u -> m (Consumed (m (Reply s u Int)))
    f parseState = do
      consumed <- runParsecT p parseState
      case consumed of
        Empty mReply -> do
          reply <- mReply
          case reply of
            Ok _ _ _ -> manyLengthErr
            Error parseErr -> pure . Empty . pure $ Ok 0 parseState parseErr
        Consumed mReply -> do
          reply <- mReply
          case reply of
            Ok a newState parseErr -> walk 0 a newState parseErr
            Error parseErr -> pure . Consumed . pure $ Error parseErr
      where
        walk
          :: Int
          -> a
          -> State s u
          -> ParseError
          -> m (Consumed (m (Reply s u Int)))
        walk !i _ parseState' _ = do
          consumed <- runParsecT p parseState'
          case consumed of
            Empty mReply -> do
              reply <- mReply
              case reply of
                Ok _ _ _ -> manyLengthErr
                Error parseErr ->
                  pure . Consumed . pure $ Ok (i + 1) parseState' parseErr
            Consumed mReply -> do
              reply <- mReply
              case reply of
                Ok a newState parseErr -> walk (i + 1) a newState parseErr
                Error parseErr -> pure . Consumed . pure $ Error parseErr

manyLengthErr :: Monad m => m a
manyLengthErr =
  fail "manyLengthLowLevel can't be used on parser that accepts empty input"

就像manyLength一样,manyLengthLowLevel不在恒定的堆空间中运行.

Just like manyLength, manyLengthLowLevel doesn't run in constant heap space.

是否可以编写manyLength,使其即使在恒定的堆空间中也可以运行 没有以CPS风格编写?如果没有,为什么不呢?有一些基本的 为什么可以使用CPS样式而不是非CPS样式?

Is it possible to write manyLength so it runs in constant heap space even without writing it in CPS-style? If not, why not? Is there some fundamental reason that it is possible in CPS-style but not in non-CPS-style?

推荐答案

这在恒定的堆空间中运行.这个想法是先尝试p,然后对成功执行的结果进行明确的案例分析,以决定是否运行go,从而使go最终处于尾声调用位置.

This runs in constant heap space. The idea is to first try p, and explicitly perform case analysis on the result of its success to decide whether to run go or not, so that go ends up in tail call position.

manyLength
  :: Monad m
  => ParsecT s u m a -> ParsecT s u m Int
manyLength p = go 0
  where
    go :: Int -> ParsecT s u m Int
    go !i = do
      success <- (p *> pure True) <|> pure False
      if success then go (i+1) else pure i

这篇关于Haskell的parsec中CPS与非CPS解析器的堆使用情况的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆