Preventing caching of computation in Criterion benchmark

Problem description

The following code (suggested by Reid Barton in "Criterion causing memory consumption to explode, no CAFs in sight") has a benchmark time that scales proportionally with num when compiled with -O0. With -O3, however, the benchmark time appears to be independent of num. Where in the Core is the result being cached, and what can I do to prevent it from being cached?

The code is:

{-# OPTIONS_GHC -fno-cse #-}
{-# LANGUAGE BangPatterns #-}
module Main where
import Criterion.Main
import Data.List

num :: Int
num = 100000000

lst :: a -> [Int]
lst _ = [1,2..num]

myadd :: Int -> Int -> Int
myadd !x !y = let !result = x + y in
  result

mysum = foldl' myadd 0

main :: IO ()
main = defaultMain [
  bgroup "summation" 
    [bench "mysum" $ whnf (mysum . lst) ()]
  ]

and the Core is:

main7
main7 = unpackCString# "mysum"#

main8
main8 = unpackCString# "summation"#

Rec {
$wlgo
$wlgo =
  \ ww_s6vW w_s6vT ->
    case w_s6vT of _ {
      [] -> ww_s6vW;
      : x_a4dz xs_a4dA ->
    case x_a4dz of _ { I# ipv_s4d4 ->
    $wlgo (+# ww_s6vW ipv_s4d4) xs_a4dA
    }
    }
end Rec }

lst1
lst1 = efdtInt 1 2 100000000

lvl_r6yu
lvl_r6yu = case $wlgo 0 lst1 of ww_s6w5 { __DEFAULT -> I# ww_s6w5 }

Rec {
main_$s$wa
main_$s$wa =
  \ sc_s6xB sc1_s6xC sc2_s6xD ->
    case tagToEnum# (<=# sc1_s6xC 0) of _ {
      False ->
    case seq# lvl_r6yu sc2_s6xD of _ { (# ipv_a4BO, ipv1_a4BP #) ->
    main_$s$wa sc_s6xB (-# sc1_s6xC 1) ipv_a4BO
    };
      True -> (# sc2_s6xD, () #)
    }
end Rec }

main6
main6 =
  \ w_s6w9 w1_s6wa ->
    case w_s6w9 of _ { I64# ww1_s6wd ->
    main_$s$wa () ww1_s6wd w1_s6wa
    }

main5
main5 = Benchmark main7 (main6 `cast` ...)

main4
main4 = : main5 ([])

main3
main3 = BenchGroup main8 main4

main2
main2 = : main3 ([])

main1
main1 = \ eta_B1 -> defaultMain2 defaultConfig main2 eta_B1

main9
main9 = \ eta_B1 -> runMainIO1 (main1 `cast` ...) eta_B1

lst
lst = \ @ a_a40V _ -> lst1

main
main = main1 `cast` ...

myadd
myadd =
  \ x_a3Io y_a3Ip ->
    case x_a3Io of _ { I# ipv_s4d1 ->
    case y_a3Ip of _ { I# ipv1_s4d4 -> I# (+# ipv_s4d1 ipv1_s4d4) }
    }

mysum
mysum =
  \ w_s6w2 ->
    case $wlgo 0 w_s6w2 of ww_s6w5 { __DEFAULT -> I# ww_s6w5 }

num
num = I# 100000000

main
main = main9 `cast` ...

where I appended -ddump-simpl -fforce-recomp -O3 -dsuppress-all to the end of the ghc --make -no-link ... command invoked by cabal build. I am using criterion 1.1.0.0 and GHC version 7.8.3.

Solution

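The result is being cached in your lvl_r6yu. You can see that lst1 is [1,2..num] lifted out to the top level, and from $wlgo 0 lst1 in the definition of lvl_r6yu it can be seen that the result of the summation is lifted out too. Because lvl_r6yu is a top-level constant, it is evaluated once, and every later iteration of the benchmark loop (the seq# lvl_r6yu in main_$s$wa) simply reuses the already computed value.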

It's easier to see what's happening if we add the top level definition foo = mysum . lst, and then look at the core for foo. You can see there that foo is a constant function returning the result of the summation.
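To make that experiment concrete, here is a minimal sketch of a standalone module containing only the definitions needed to inspect foo. The module name Foo and the use of (+) in place of myadd are assumptions made for brevity; the flags in the comment are the ones from the question, with the optimization level written as -O2.

```haskell
{-# OPTIONS_GHC -fno-cse #-}
-- Hypothetical standalone module for inspecting the Core of foo.
-- Compile with, for example:
--   ghc -O2 -fforce-recomp -ddump-simpl -dsuppress-all Foo.hs
module Foo where

import Data.List (foldl')

num :: Int
num = 100000000

-- Ignores its argument, so the list does not depend on it.
lst :: a -> [Int]
lst _ = [1,2..num]

-- Strict left fold; (+) stands in for myadd from the question.
mysum :: [Int] -> Int
mysum = foldl' (+) 0

-- The extra top-level definition suggested above. Nothing in its body
-- depends on the argument, which is what allows the optimizer to float
-- the whole summation out to a top-level constant.
foo :: a -> Int
foo = mysum . lst
```

In the dumped Core, look for a top-level lvl-style binding that foo returns regardless of its argument; the exact binder names will differ between GHC versions.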

If we add {-# OPTIONS_GHC -fno-full-laziness #-}, then subexpressions will not be lifted out to the top level, and therefore the benchmark will work as intended.
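As a sketch of where the flag goes, the question's module with full laziness disabled might look like the following. The only intended change is the extra flag in the pragma; the explicit import lists, the type signature for mysum and the shortened myadd are tidying assumptions, and whether the flag is sufficient may depend on the GHC version.

```haskell
{-# OPTIONS_GHC -fno-cse -fno-full-laziness #-}
{-# LANGUAGE BangPatterns #-}
module Main (main) where

import Criterion.Main (bench, bgroup, defaultMain, whnf)
import Data.List (foldl')

num :: Int
num = 100000000

lst :: a -> [Int]
lst _ = [1,2..num]

-- Strict addition, as in the question.
myadd :: Int -> Int -> Int
myadd !x !y = x + y

mysum :: [Int] -> Int
mysum = foldl' myadd 0

main :: IO ()
main = defaultMain
  [ bgroup "summation"
      -- With full laziness off, mysum (lst ()) should stay inside the
      -- function that whnf applies on each iteration.
      [ bench "mysum" $ whnf (mysum . lst) () ]
  ]
```

Re-running the -ddump-simpl command from the question is a reasonable way to check that no top-level binding holding the finished sum remains.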

In general, when using criterion it is a good idea to control evaluation through the argument supplied to whnf. In our case:

bench "mysum" $ whnf (\size -> mysum [1..size]) num

This works fine regardless of optimization or lifting.
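A complete benchmark built around this idea could look like the sketch below. The helper name sumTo and the three sizes are illustrative assumptions, chosen so that the report shows whether the time actually scales with the input.

```haskell
module Main (main) where

import Criterion.Main (bench, bgroup, defaultMain, whnf)
import Data.List (foldl')

mysum :: [Int] -> Int
mysum = foldl' (+) 0

-- Hypothetical helper: the list is built from the argument, so the work
-- depends on the value that whnf passes in on every iteration and cannot
-- be floated out as a constant.
sumTo :: Int -> Int
sumTo size = mysum [1 .. size]

main :: IO ()
main = defaultMain
  [ bgroup "summation"
      -- One benchmark per size, to check that time grows with size.
      [ bench ("mysum/" ++ show n) $ whnf sumTo n
      | n <- [1000000, 10000000, 100000000]
      ]
  ]
```

If the reported times grow roughly tenfold from one size to the next, the summation is being recomputed on each iteration rather than cached.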
