Irrefutable pattern does not leak memory in recursion, but why?


Problem description

    The mapAndSum function in the code block all the way below combines map and sum (never mind that another sum is applied in the main function, it just serves to make the output compact). The map is computed lazily, while the sum is computed using an accumulating parameter. The idea is that the result of map can be consumed without ever having the complete list in memory, and (only) afterwards the sum is available "for free". The main function indicates that we had a problem with irrefutable patterns when calling mapAndSum. Let me explain this problem.

    According to the Haskell standard, the irrefutable pattern example let (xs, s) = mapAndSum ... in print xs >> print s is translated into

    (\ v ->    print (case v of { (xs, s) -> xs })
            >> print (case v of { (xs, s) -> s }))
    $ mapAndSum ...
    

    And hence, both print calls carry a reference to the whole pair, which leads to the whole list being kept in memory.

    We (my colleague Toni Dietze and I) solved this using an explicit case statement (compare "bad" vs "good2"). BTW, finding this out took us a considerable amount of time..!

    Now, what we do not understand is two-fold:

    • Why does mapAndSum work in the first place? It also uses an irrefutable pattern, so it should also keep the whole list in memory, but it obviously doesn't. And converting the let into a case would make the function behave completely unlazy (to the point that the stack overflows; no pun intended); see the sketch after this list.

      We had a look at the "core" code generated by GHC, but as far as we could interpret it, it actually contained the same translation of let as the one above. So no clue here, and more confusion instead.

    • Why does "bad?" work when optimization is turned off, but not when it is turned on?

    One remark concerning our actual application: we want to achieve stream processing (format conversion) of a large text file while also accumulating some value that is then written to a separate file. As indicated, we were successful, but the two questions remain, and answers may improve our understanding of GHC for upcoming tasks and problems.

    Thank you!

    {-# LANGUAGE BangPatterns #-}
    
    -- Tested with The Glorious Glasgow Haskell Compilation System, version 7.4.2.
    
    module Main where
    
    
    import Control.Arrow (first)
    import Data.List (foldl')
    import System.Environment (getArgs)
    
    
    mapAndSum :: Num a => (a -> b) -> [a] -> ([b], a)
    mapAndSum f = go 0
      where go !s (x : xs) = let (xs', s') = go (s + x) xs in (f x : xs', s')
                           -- ^ I have no idea why it works here. (TD)
            go !s []       = ([], s)
    
    
    main :: IO ()
    main = do
      let foo  = mapAndSum (^ (2 :: Integer)) [1 .. 1000000 :: Integer]
      let sum' = foldl' (+) 0
      args <- getArgs
      case args of
        ["bad" ]  -> let (xs, s) = foo in print (sum xs) >> print s
        ["bad?"]  -> print $ first sum' $ foo
                  -- ^ Without ghc's optimizer, this version is as memory
                  -- efficient as the "good" versions
                  -- With optimization "bad?" is as bad as "bad". Why? (TD)
        ["good1"] -> print $ first' sum' $ foo
                       where first' g (x, y) = (g x, y)
        ["good2"] -> case foo of
                        (xs, s) -> print (sum' xs) >> print s
        ["good3"] -> (\ (xs, s) -> print (sum' xs) >> print s) $ foo
        _ -> error "Sorry, I do not understand."
    

    Solution

    Let me first answer why mapAndSum can work well at all: What you see is (very likely) the effect of an optimization described by Philip Wadler in "Fixing some space leaks with a garbage collector". Short summary: If the garbage collector sees a thunk of the form fst x, and x is already evaluated to the tuple constructor, e.g. (y,z), it will replace fst x by y, possibly freeing up z if it is not referenced anywhere else.

    In your code, once the result of go has been evaluated to a tuple and one round of GC has run, s' will no longer keep a reference to the tuple but will have been replaced by the accumulated parameter.
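
    To make that concrete, here is a hand-written paraphrase (not the original code and not GHC output) of mapAndSum's loop with the lazy pattern desugared into the two selector thunks that this optimization targets:

    {-# LANGUAGE BangPatterns #-}

    -- Paraphrase of the loop in mapAndSum; xs' and s' are selector thunks
    -- over `pair`. Once `pair` has been evaluated to a real tuple, the GC
    -- can overwrite each selector thunk with the selected component, so s'
    -- no longer retains the list that xs' points to.
    goDesugared :: Num a => (a -> b) -> a -> [a] -> ([b], a)
    goDesugared f !s (x : xs) =
      let pair = goDesugared f (s + x) xs
          xs'  = case pair of (ys, _  ) -> ys   -- selector for fst
          s'   = case pair of (_ , acc) -> acc  -- selector for snd
      in (f x : xs', s')
    goDesugared _ !s [] = ([], s)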

    Now let’s look at the other patterns by investigating the Core. The foo binding is compiled to:

    foo_r2eT :: ([Type.Integer], Type.Integer)
    
    foo_r2eT =
      case $wgo_r2eP mapAndSum1 lvl2_r2eS
      of _ { (# ww1_s2d7, ww2_s2d8 #) ->
      (ww1_s2d7, ww2_s2d8)
      }
    

    Here is the code in the "bad" case (lvl18_r2fd is "bad"):

    case eqString ds_dyA lvl18_r2fd of _ {
      False -> $wa_s2da new_s_a14o;
      True ->
        case ds1_dyB of _ {
          [] ->
            case Handle.Text.hPutStr2
                   Handle.FD.stdout lvl17_r2fc True new_s_a14o
            of _ { (# new_s1_X15h, _ #) ->
            Handle.Text.hPutStr2
              Handle.FD.stdout lvl16_r2fb True new_s1_X15h
            };
          : ipv_sIs ipv1_sIt -> $wa_s2da new_s_a14o
        }
    

    We can see that two values at the module level are printed, lvl17_r2fc and lvl16_r2fb, here is their code:

    lvl17_r2fc :: String
    [GblId]
    lvl17_r2fc =
      case foo_r2eT of _ { (xs_Xqp, s_Xq9) ->
      $w$cshowsPrec
        0
        (Data.List.sum_sum' xs_Xqp Data.List.genericDrop2)
        ([] @ Char)
      }
    
    lvl16_r2fb :: String
    [GblId]
    lvl16_r2fb =
      case foo_r2eT of _ { (xs_apS, s_Xqp) ->
      $w$cshowsPrec 0 s_Xqp ([] @ Char)
      }
    

    Why are they bound at the module level, and not inside the expression? This is an effect of lazy lifting, another optimization that increases sharing and hence sometimes has an adverse effect on space performance. See GHC ticket 719 for another occurrence of this effect.
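
    Written back at the source level, the floated bindings amount to roughly the following (an illustration with made-up names, assuming mapAndSum and foldl' from the question are in scope; the real Core bindings are lvl17_r2fc and lvl16_r2fb). While the first string is being computed and the list is streamed, the second binding still holds on to the whole pair:

    -- Illustrative source-level picture of the lazily lifted bindings.
    fooTop :: ([Integer], Integer)
    fooTop = mapAndSum (^ (2 :: Integer)) [1 .. 1000000]

    lvlSumStr :: String   -- plays the role of lvl17_r2fc
    lvlSumStr = case fooTop of (xs, _) -> show (foldl' (+) 0 xs)

    lvlSndStr :: String   -- plays the role of lvl16_r2fb; retains fooTop
    lvlSndStr = case fooTop of (_, s) -> show s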

    So what happens is that the evaluation of lvl17_r2fc causes foo to be evaluated, and the left entry to be lazily printed. Unfortunately, lvl16_r2fb is still live and retains the full tuple. And because the garbage collector seems to fail to see that this is a selector thunk, Wadler's optimization does not kick in.

    In contrast, here is the code for "good1" a.k.a. lvl8_r2f1:

                case eqString ds_dyA lvl8_r2f1 of _ {
                  False -> $wa2_s2dI w3_s2cF;
                  True ->
                    case ds1_dyB of _ {
                      [] ->
                        Handle.Text.hPutStr2
                          Handle.FD.stdout lvl7_r2f0 True w3_s2cF;
                      : ipv_sHg ipv1_sHh -> $wa2_s2dI w3_s2cF
                    }
                } } in
    

    where the printed value is this string:

    lvl7_r2f0 :: String
    [GblId]
    lvl7_r2f0 =
      case foo_r2eT of _ { (x_af6, y_af7) ->
      show_tuple
        (:
           @ ShowS
           (let {
              w2_a2bY [Dmd=Just L] :: Type.Integer
    
              w2_a2bY = lgo_r2eU mapAndSum1 x_af6 } in
            \ (w3_a2bZ :: String) ->
              $w$cshowsPrec 0 w2_a2bY w3_a2bZ)
           (:
              @ ShowS
              (\ (w2_a2bZ :: String) ->
                 $w$cshowsPrec 0 y_af7 w2_a2bZ)
              ([] @ ShowS)))
        ([] @ Char)
      }
    

    As you can see, the tuple is taken apart only once, and both values are used. So nothing refers to the tuple as a whole and it can be garbage collected. The same holds for "good2" and "good3".

    Now to "bad?": In the unoptimized case, we get this code:

     case eqString ds_dyA (unpackCString# "bad?")
                     of _ {
                       False -> fail2_dyN realWorld#;
                       True ->
                         case ds1_dyB of _ {
                           [] ->
                             $
                               @ (Type.Integer, Type.Integer)
                               @ (IO ())
                               (System.IO.print
                                  @ (Type.Integer, Type.Integer) $dShow_rzk)
                               ($
                                  @ ([Type.Integer], Type.Integer)
                                  @ (Type.Integer, Type.Integer)
                                  (Control.Arrow.first
                                     @ (->)
                                     Control.Arrow.$fArrow(->)
                                     @ [Type.Integer]
                                     @ Type.Integer
                                     @ Type.Integer
                                     sum'_rzm)
                                  foo_rzl);
                           : ipv_szd ipv1_sze -> fail2_dyN realWorld#
                         }
                     } } in
    

    The implementation of first via *** uses irrefutable patterns, so the kind of selector thunks that the garbage collector handles well are generated.
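
    For reference, the (->) instance behind first behaves roughly like the following paraphrase (standalone made-up names, not the actual Control.Arrow definitions); the irrefutable ~(x, y) is what turns each component access into a selector thunk over the pair:

    -- Paraphrase of first/(***) for plain functions; names are made up.
    crossLazy :: (b -> c) -> (b' -> c') -> (b, b') -> (c, c')
    crossLazy f g ~(x, y) = (f x, g y)   -- lazy match: x and y become selectors

    firstLazy :: (b -> c) -> (b, d) -> (c, d)
    firstLazy f = f `crossLazy` id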

    In the optimized case, things are a bit scattered, but anyway, here is the relevant code (the last value is the one being printed):

    w_r2f2 :: Type.Integer
    
    w_r2f2 =
      case foo_r2eT of _ { (x_aI1, y_aI2) ->
      lgo_r2eU mapAndSum1 x_aI1
      }
    
    lvl9_r2f3 :: String -> String
    [GblId, Arity=1]
    lvl9_r2f3 =
      \ (w2_a2bZ :: String) ->
        $w$cshowsPrec 0 w_r2f2 w2_a2bZ
    
    w1_r2f4 :: Type.Integer
    
    w1_r2f4 = case foo_r2eT of _ { (x_aI6, y_aI7) -> y_aI7 }
    
    lvl10_r2f5 :: String -> String
    [GblId, Arity=1]
    lvl10_r2f5 =
      \ (w2_a2bZ :: String) ->
        $w$cshowsPrec 0 w1_r2f4 w2_a2bZ
    
    lvl11_r2f6 :: [ShowS]
    [GblId]
    lvl11_r2f6 =
      :
        @ ShowS lvl10_r2f5 ([] @ ShowS)
    
    lvl12_r2f7 :: [ShowS]
    [GblId]
    lvl12_r2f7 = : @ ShowS lvl9_r2f3 lvl11_r2f6
    
    lvl13_r2f8 :: ShowS
    [GblId]
    lvl13_r2f8 = show_tuple lvl12_r2f7
    
    lvl14_r2f9 :: String
    [GblId]
    lvl14_r2f9 = lvl13_r2f8 ([] @ Char)
    

    The use of first has been inlined. We see two calls to case foo_r2eT, so this is prone to a space leak, despite w1_r2f4 looking like a selector (so I’d expect the runtime to apply Wadler’s optimization). Maybe it does not work well for static thunks? Indeed the GHC commentary, if up to date, only talks about dynamically allocated selector thunks. So if your foo were not a module-level value (or rather lazy-liftable into one) but rather dependent on some input, w1_r2f4 might be dynamically allocated and hence eligible for the special treatment. But then the code might look very different anyway.
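
    A possible follow-up experiment (my own sketch, not something the answer verified): make the input depend on a command-line argument so that foo cannot be lifted to a module-level constant; the selector thunks over it should then be allocated dynamically, and one could check with the heap profiler whether the special treatment kicks in.

    -- Hypothetical variant of main; assumes the imports and mapAndSum from
    -- the question. foo now depends on runtime input, so it is not a CAF.
    mainDynamic :: IO ()
    mainDynamic = do
      [nStr] <- getArgs
      let n   = read nStr :: Integer
          foo = mapAndSum (^ (2 :: Integer)) [1 .. n]
      print $ first (foldl' (+) 0) foo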
