Why do these fixpoint cata / ana morphism definitions outperform the recursive ones?


Question

Consider these definitions from an earlier question:

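import Data.Function (fix)

-- Assumed: the usual fixed point of a functor, carried over from the earlier question.
newtype Fix f = Fix { unFix :: f (Fix f) }

type Algebra f a = f a -> a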

cata :: Functor f => Algebra f b -> Fix f -> b
cata alg = alg . fmap (cata alg) . unFix

fixcata :: Functor f => Algebra f b -> Fix f -> b
fixcata alg = fix $ \f -> alg . fmap f . unFix

type CoAlgebra f a = a -> f a

ana :: Functor f => CoAlgebra f a -> a -> Fix f
ana coalg = Fix . fmap (ana coalg) . coalg

fixana :: Functor f => CoAlgebra f a -> a -> Fix f
fixana coalg = fix $ \f -> Fix . fmap f . coalg

I ran some benchmarks and the results surprised me. criterion reports something like a tenfold speedup for the fix-based versions, specifically when O2 is enabled. I wonder what causes such a massive improvement, and I am beginning to seriously doubt my benchmarking abilities.

This is the exact criterion code I use:

smallWord, largeWord :: Word
smallWord = 2^10
largeWord = 2^20

shortEnv, longEnv :: Fix Maybe
shortEnv = ana coAlg smallWord
longEnv = ana coAlg largeWord

benchCata = nf (cata alg)
benchFixcata = nf (fixcata alg)

benchAna = nf (ana coAlg)
benchFixana = nf (fixana coAlg)

main = defaultMain
    [ bgroup "cata"
        [ bgroup "short input"
            [ env (return shortEnv) $ \x -> bench "cata"    (benchCata x)
            , env (return shortEnv) $ \x -> bench "fixcata" (benchFixcata x)
            ]
        , bgroup "long input"
            [ env (return longEnv) $ \x -> bench "cata"    (benchCata x)
            , env (return longEnv) $ \x -> bench "fixcata" (benchFixcata x)
            ]
        ]
    , bgroup "ana"
        [ bgroup "small word"
            [ bench "ana" $ benchAna smallWord
            , bench "fixana" $ benchFixana smallWord
            ]
        , bgroup "large word"
            [ bench "ana" $ benchAna largeWord
            , bench "fixana" $ benchFixana largeWord
            ]
        ]
    ]

And some auxiliary code:

alg :: Algebra Maybe Word
alg Nothing = 0
alg (Just x) = succ x

coAlg :: CoAlgebra Maybe Word
coAlg 0 = Nothing
coAlg x = Just (pred x)

Compiled with O0, the numbers are roughly even. With O2, the fix-based functions seem to outperform the plain ones:

benchmarking cata/short input/cata
time                 31.67 μs   (31.10 μs .. 32.26 μs)
                     0.999 R²   (0.998 R² .. 1.000 R²)
mean                 31.20 μs   (31.05 μs .. 31.46 μs)
std dev              633.9 ns   (385.3 ns .. 1.029 μs)
variance introduced by outliers: 18% (moderately inflated)

benchmarking cata/short input/fixcata
time                 2.422 μs   (2.407 μs .. 2.440 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 2.399 μs   (2.388 μs .. 2.410 μs)
std dev              37.12 ns   (31.44 ns .. 47.06 ns)
variance introduced by outliers: 14% (moderately inflated)

I would appreciate it if someone could confirm this or spot a flaw.

(I compiled everything with GHC 8.2.2 on this occasion.)

Postscript

This post from back in 2012 elaborates on the performance of fix in fine detail. (Thanks to @chi for the link.)

Recommended answer

This is due to how fix computes the fixed point. This was pointed out by @duplode above (and by myself in a related question). Anyway, we can summarize the issue as follows.

Defining

fix f = f (fix f)

works, but makes a new call to fix f at every recursion step. Instead,

fix f = go
   where go = f go

computes the same fixed point while avoiding that call. In the libraries, fix is implemented in this more efficient way.
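To make the contrast concrete, here is a minimal sketch that puts the two definitions side by side and runs them on the same open-recursive function. The names fixNaive, fixShared and factStep are hypothetical and chosen only for illustration; only fix from Data.Function is a library function.

```haskell
import Data.Function (fix)

-- Naive definition: each unfolding re-enters fixNaive with f.
fixNaive :: (a -> a) -> a
fixNaive f = f (fixNaive f)

-- Sharing definition: go is a single value that refers to itself,
-- so the recursion goes through go rather than through a fresh call.
fixShared :: (a -> a) -> a
fixShared f = go
  where
    go = f go

-- Hypothetical example: factorial in open-recursive style, where
-- `self` stands for the recursive call supplied by the fixed point.
factStep :: (Integer -> Integer) -> Integer -> Integer
factStep self n = if n <= 0 then 1 else n * self (n - 1)

main :: IO ()
main = do
  print (fixNaive factStep 10)
  print (fixShared factStep 10)
  print (fix factStep 10)
```

All three calls should print the same number. Any difference between them is in how the recursion is tied, not in the result.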

Back to the question, consider the following three implementations of cata:

cata :: Functor f => Algebra f b -> Fix f -> b
cata alg' = alg' . fmap (cata alg') . unFix

cata2 :: Functor f => Algebra f b -> Fix f -> b
cata2 alg' = go
   where
   go = alg' . fmap go . unFix

fixcata :: Functor f => Algebra f b -> Fix f -> b
fixcata alg' = fix $ \f -> alg' . fmap f . unFix

The first one makes a call to cata alg' at every recursion step. The second one does not. The third one does not either, since the library fix is efficient.
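As a sanity check that the three variants differ only in how the recursion is tied, the following self-contained sketch runs them on the same input. It assumes the usual newtype definition of Fix, which the post does not show. The helper fromWord is hypothetical and builds the Fix Maybe input directly instead of going through ana.

```haskell
import Data.Function (fix)

-- Assumed definition of Fix; the post does not show it.
newtype Fix f = Fix { unFix :: f (Fix f) }

type Algebra f a = f a -> a

-- Direct recursion: the recursive call is `cata alg'` itself.
cata :: Functor f => Algebra f b -> Fix f -> b
cata alg' = alg' . fmap (cata alg') . unFix

-- Local worker: the recursion goes through the shared binding `go`.
cata2 :: Functor f => Algebra f b -> Fix f -> b
cata2 alg' = go
  where
    go = alg' . fmap go . unFix

-- Same shape as cata2, with the knot tied by the library fix.
fixcata :: Functor f => Algebra f b -> Fix f -> b
fixcata alg' = fix $ \f -> alg' . fmap f . unFix

-- The algebra from the question: count the Just layers.
alg :: Algebra Maybe Word
alg Nothing = 0
alg (Just x) = succ x

-- Hypothetical helper: a chain of n Just layers ending in Nothing.
fromWord :: Word -> Fix Maybe
fromWord 0 = Fix Nothing
fromWord n = Fix (Just (fromWord (n - 1)))

main :: IO ()
main = do
  let input = fromWord 1000
  print (cata alg input, cata2 alg input, fixcata alg input)
```

The three components of the printed tuple should be equal. Timing differences of the kind reported in this post would only show up under a benchmark harness such as criterion with optimization enabled.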

And indeed, we can use Criterion to confirm this, even using the same test used by the OP:

benchmarking cata/short input/cata
time                 16.58 us   (16.54 us .. 16.62 us)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 16.62 us   (16.58 us .. 16.65 us)
std dev              111.6 ns   (89.76 ns .. 144.0 ns)

benchmarking cata/short input/cata2
time                 1.746 us   (1.742 us .. 1.749 us)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 1.741 us   (1.736 us .. 1.744 us)
std dev              12.69 ns   (10.50 ns .. 17.31 ns)

benchmarking cata/short input/fixcata
time                 2.010 us   (2.003 us .. 2.016 us)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 2.006 us   (2.001 us .. 2.011 us)
std dev              16.40 ns   (14.05 ns .. 19.27 ns)

Long inputs also show the improvement.

benchmarking cata/long input/cata
time                 119.3 ms   (113.4 ms .. 125.8 ms)
                     0.996 R²   (0.992 R² .. 1.000 R²)
mean                 119.8 ms   (117.7 ms .. 121.7 ms)
std dev              2.924 ms   (2.073 ms .. 4.064 ms)
variance introduced by outliers: 11% (moderately inflated)

benchmarking cata/long input/cata2
time                 17.89 ms   (17.43 ms .. 18.36 ms)
                     0.996 R²   (0.992 R² .. 0.999 R²)
mean                 18.02 ms   (17.49 ms .. 18.62 ms)
std dev              1.362 ms   (853.9 us .. 2.022 ms)
variance introduced by outliers: 33% (moderately inflated)

benchmarking cata/long input/fixcata
time                 18.03 ms   (17.56 ms .. 18.50 ms)
                     0.996 R²   (0.992 R² .. 0.999 R²)
mean                 18.17 ms   (17.57 ms .. 18.72 ms)
std dev              1.365 ms   (852.1 us .. 2.045 ms)
variance introduced by outliers: 33% (moderately inflated)

I also experimented with ana, and observed that the performance of a similarly improved ana2 agrees with fixana. No surprises there either.
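The answer does not show ana2. Presumably it applies the same local-worker rewrite used for cata2. The sketch below is an assumption along those lines, not the author's actual code, and it again assumes the usual definition of Fix. The helper depth is hypothetical and only serves to inspect the result.

```haskell
-- Assumed definition of Fix; the post does not show it.
newtype Fix f = Fix { unFix :: f (Fix f) }

type CoAlgebra f a = a -> f a

-- Assumed shape of ana2: the recursion goes through the local `go`
-- instead of calling `ana2 coalg` again at every step.
ana2 :: Functor f => CoAlgebra f a -> a -> Fix f
ana2 coalg = go
  where
    go = Fix . fmap go . coalg

-- The coalgebra from the question: peel off one layer per step.
coAlg :: CoAlgebra Maybe Word
coAlg 0 = Nothing
coAlg x = Just (pred x)

-- Hypothetical helper: count the Just layers of the unfolded value.
depth :: Fix Maybe -> Word
depth (Fix Nothing) = 0
depth (Fix (Just t)) = 1 + depth t

main :: IO ()
main = print (depth (ana2 coAlg 1024))
```

Unfolding a seed n with coAlg yields n Just layers, so depth recovers the seed.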
