Haskell的parsec中CPS与非CPS解析器的堆使用情况 [英] heap usage for CPS vs non-CPS parsers in Haskell's parsec

查看：131 发布时间：2020/5/25 1:16:51 parsing haskell heap-memory parsec continuation-passing

本文介绍了Haskell的parsec中CPS与非CPS解析器的堆使用情况的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我正在尝试使用 parsec 编写以下解析器:

I'm trying to write the following parser using parsec:

manyLength
  :: forall s u m a.
     Monad m
  => ParsecT s u m a -> ParsecT s u m Int
manyLength p = go 0
  where
    go :: Int -> ParsecT s u m Int
    go !i = (p *> go (i + 1)) <|> pure i

这就像 many 函数，但是它没有返回[a] 返回Parser a成功的次数.

This is like the many function, but instead of returning [a], it returns the number of times Parser a succeeds.

这可行，但是我似乎无法使其在恒定的堆空间中运行.这使得在某种意义上说，因为对go的递归调用不在尾部调用位置.

This works, but I can't seem to make it run in constant heap space. This makes sense, since the recursive call to go is not in the tail-call position.

如果parsec将构造函数导出到 manyAccum 功能:

If parsec would export the constructor to ParsecT, it would be possible to rewrite manyLength in CPS'ed form. This is very similar to the manyAccum function:

manyLengthCPS :: forall s u m a. ParsecT s u m a -> ParsecT s u m Int
manyLengthCPS p = ParsecT f
  where
    f
      :: forall b.
         State s u
      -> (Int -> State s u -> ParseError -> m b) -- consumed ok
      -> (ParseError -> m b)                     -- consumed err
      -> (Int -> State s u -> ParseError -> m b) -- empty ok
      -> (ParseError -> m b)                     -- empty err
      -> m b
    f s cok cerr eok _ =
      let walk :: Int -> a -> State s u -> ParseError -> m b
          walk !i _ s' _ =
            unParser p s'
              (walk $ i + 1)            -- consumed-ok
              cerr                      -- consumed-err
              manyLengthCPSErr          -- empty-ok
              (\e -> cok (i + 1) s' e)  -- empty-err
      in unParser p s (walk 0) cerr manyLengthCPSErr (\e -> eok 0 s e)
    {-# INLINE f #-}

manyLengthCPSErr :: Monad m => m a
manyLengthCPSErr =
  fail "manyLengthCPS can't be used on parser that accepts empty input"

此manyLengthCPS函数确实在恒定的堆空间中运行.

This manyLengthCPS function does run in constant heap space.

这里是ParsecT构造函数，仅出于完整性考虑:

Here is the ParsecT constructor just for completeness:

newtype ParsecT s u m a = ParsecT
  { unParser
      :: forall b .
         State s u
      -> (a -> State s u -> ParseError -> m b) -- consumed ok
      -> (ParseError -> m b)                   -- consumed err
      -> (a -> State s u -> ParseError -> m b) -- empty ok
      -> (ParseError -> m b)                   -- empty err
      -> m b
  }

我还尝试使用manyLengthCPS直接将其转换为非CPS功能低级功能:

I also tried to turn manyLengthCPS directly into a non-CPS'ed function using the low-level mkPT function:

manyLengthLowLevel
  :: forall s u m a.
     Monad m
  => ParsecT s u m a -> ParsecT s u m Int
manyLengthLowLevel p = mkPT f
  where
    f :: State s u -> m (Consumed (m (Reply s u Int)))
    f parseState = do
      consumed <- runParsecT p parseState
      case consumed of
        Empty mReply -> do
          reply <- mReply
          case reply of
            Ok _ _ _ -> manyLengthErr
            Error parseErr -> pure . Empty . pure $ Ok 0 parseState parseErr
        Consumed mReply -> do
          reply <- mReply
          case reply of
            Ok a newState parseErr -> walk 0 a newState parseErr
            Error parseErr -> pure . Consumed . pure $ Error parseErr
      where
        walk
          :: Int
          -> a
          -> State s u
          -> ParseError
          -> m (Consumed (m (Reply s u Int)))
        walk !i _ parseState' _ = do
          consumed <- runParsecT p parseState'
          case consumed of
            Empty mReply -> do
              reply <- mReply
              case reply of
                Ok _ _ _ -> manyLengthErr
                Error parseErr ->
                  pure . Consumed . pure $ Ok (i + 1) parseState' parseErr
            Consumed mReply -> do
              reply <- mReply
              case reply of
                Ok a newState parseErr -> walk (i + 1) a newState parseErr
                Error parseErr -> pure . Consumed . pure $ Error parseErr

manyLengthErr :: Monad m => m a
manyLengthErr =
  fail "manyLengthLowLevel can't be used on parser that accepts empty input"

就像manyLength一样，manyLengthLowLevel不在恒定的堆空间中运行.

Just like manyLength, manyLengthLowLevel doesn't run in constant heap space.

是否可以编写manyLength，使其即使在恒定的堆空间中也可以运行没有以CPS风格编写?如果没有，为什么不呢?有一些基本的为什么可以使用CPS样式而不是非CPS样式?

Is it possible to write manyLength so it runs in constant heap space even without writing it in CPS-style? If not, why not? Is there some fundamental reason that it is possible in CPS-style but not in non-CPS-style?

Haskell的parsec中CPS与非CPS解析器的堆使用情况 [英] heap usage for CPS vs non-CPS parsers in Haskell's parsec

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录关闭

Haskell的parsec中CPS与非CPS解析器的堆使用情况 [英] heap usage for CPS vs non-CPS parsers in Haskell&#39;s parsec

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录 关闭

Haskell的parsec中CPS与非CPS解析器的堆使用情况 [英] heap usage for CPS vs non-CPS parsers in Haskell's parsec

登录关闭