Euler 问题和 Int64 类型递归的性能问题 [英] Performance problem with Euler problem and recursion on Int64 types

查看:28
本文介绍了Euler 问题和 Int64 类型递归的性能问题的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我目前正在使用 Euler 问题项目作为我的游乐场来学习 Haskell.我惊讶于我的 Haskell 程序与类似的程序相比有多慢用其他语言编写的程序.我想知道我是否已经预见到了什么,或者这是否是人们在使用 Haskell 时必须期望的性能损失.

I'm currently learning Haskell using the project Euler problems as my playground. I was astound by how slow my Haskell programs turned out to be compared to similar programs written in other languages. I'm wondering if I've forseen something, or if this is the kind of performance penalties one has to expect when using Haskell.

以下程序的灵感来自问题 331,但我在发布之前对其进行了更改,因此我不会对其他人造成任何破坏.它计算在 2^30 x 2^30 网格上绘制的离散圆的弧长.这是一个简单的尾递归实现,我确保跟踪弧长的累积变量的更新是严格的.然而,它几乎需要一分半钟才能完成(使用 -O 标志和 ghc 编译).

The following program in inspired by Problem 331, but I've changed it before posting so I don't spoil anything for other people. It computes the arc length of a discrete circle drawn on a 2^30 x 2^30 grid. It is a simple tail recursive implementation and I make sure that the updates of the accumulation variable keeping track of the arc length is strict. Yet it takes almost one and a half minute to complete (compiled with the -O flag with ghc).

import Data.Int

arcLength :: Int64->Int64
arcLength n = arcLength' 0 (n-1) 0 0 where
    arcLength' x y norm2 acc
        | x > y = acc
        | norm2 < 0 = arcLength' (x + 1) y (norm2 + 2*x +1) acc
        | norm2 > 2*(n-1) = arcLength' (x - 1) (y-1) (norm2 - 2*(x + y) + 2) acc
        | otherwise = arcLength' (x + 1) y (norm2 + 2*x + 1) $! (acc + 1)

main = print $ arcLength (2^30)

这里是对应的Java实现.大约需要 4.5 秒才能完成.

Here is a corresponding implementation in Java. It takes about 4.5 seconds to complete.

public class ArcLength {
public static void main(String args[]) {
    long n = 1 << 30;
    long x = 0;
    long y = n-1;
    long acc = 0;
    long norm2 = 0;
    long time = System.currentTimeMillis();

    while(x <= y) {
        if (norm2 < 0) {
            norm2 += 2*x + 1;
            x++;
        } else if (norm2 > 2*(n-1)) {
            norm2 += 2 - 2*(x+y);
            x--;
            y--;
        } else {
            norm2 += 2*x + 1;
            x++;
            acc++;
        }
    }

    time = System.currentTimeMillis() - time;
    System.err.println(acc);
    System.err.println(time);
}

}

在评论中的讨论之后,我对 Haskell 代码进行了一些修改并进行了一些性能测试.首先,我将 n 更改为 2^29 以避免溢出.然后我尝试了 6 个不同的版本:在声明 arcLength' x y !norm2 !acc 中使用 Int64 或 Int 并在 norm2 或两者之前使用刘海和 norm2 和 acc.都是用

After the discussions in the comments I made som modifications in the Haskell code and did some performance tests. First I changed n to 2^29 to avoid overflows. Then I tried 6 different version: With Int64 or Int and with bangs before either norm2 or both and norm2 and acc in the declaration arcLength' x y !norm2 !acc. All are compiled with

ghc -O3 -prof -rtsopts -fforce-recomp -XBangPatterns arctest.hs

结果如下:

(Int !norm2 !acc)
total time  =        3.00 secs   (150 ticks @ 20 ms)
total alloc =       2,892 bytes  (excludes profiling overheads)

(Int norm2 !acc)
total time  =        3.56 secs   (178 ticks @ 20 ms)
total alloc =       2,892 bytes  (excludes profiling overheads)

(Int norm2 acc)
total time  =        3.56 secs   (178 ticks @ 20 ms)
total alloc =       2,892 bytes  (excludes profiling overheads)

(Int64 norm2 acc)
arctest.exe: out of memory

(Int64 norm2 !acc)
total time  =       48.46 secs   (2423 ticks @ 20 ms)
total alloc = 26,246,173,228 bytes  (excludes profiling overheads)

(Int64 !norm2 !acc)
total time  =       31.46 secs   (1573 ticks @ 20 ms)
total alloc =       3,032 bytes  (excludes profiling overheads)

我在 64 位 Windows 7(Haskell 平台二进制分发版)下使用 GHC 7.0.2.根据评论,在其他配置下编译没有出现这个问题.这让我觉得 Int64 类型在 Windows 版本中被破坏了.

I'm using GHC 7.0.2 under a 64-bit Windows 7 (The Haskell platform binary distribution). According to the comments, the problem does not occur when compiling under other configurations. This makes me think that the Int64 type is broken in the Windows release.

推荐答案

嗯,我用 7.0.3 安装了一个全新的 Haskell 平台,并为您的程序大致获得以下核心(-ddump-simpl):

Hm, I installed a fresh Haskell platform with 7.0.3, and get roughly the following core for your program (-ddump-simpl):

Main.$warcLength' =
   (ww_s1my :: GHC.Prim.Int64#) (ww1_s1mC :: GHC.Prim.Int64#)
    (ww2_s1mG :: GHC.Prim.Int64#) (ww3_s1mK :: GHC.Prim.Int64#) ->
    case {__pkg_ccall ghc-prim hs_gtInt64 [...]
           ww_s1my ww1_s1mC GHC.Prim.realWorld#
[...]

所以 GHC 意识到它可以解包你的整数,这很好.但是这个 hs_getInt64 调用看起来很像 C 调用.查看汇编器输出 (-ddump-asm),我们看到如下内容:

So GHC has realized that it can unpack your integers, which is good. But this hs_getInt64 call looks suspiciously like a C call. Looking at the assembler output (-ddump-asm), we see stuff like:

pushl %eax
movl 76(%esp),%eax
pushl %eax
call _hs_gtInt64
addl $16,%esp

所以这看起来很像 Int64 上的每个操作都在后端变成了一个完整的 C 调用.很明显,.

So this looks very much like every operation on the Int64 get turned into a full-blown C call in the backend. Which is slow, obviously.

GHC.IntWord64 的 >source code 似乎证实了这一点:在 32 位版本中(如当前随平台提供的版本),您将只能通过 FFI 接口进行仿真.

The source code of GHC.IntWord64 seems to verify that: In a 32-bit build (like the one currently shipped with the platform), you will have only emulation via the FFI interface.

这篇关于Euler 问题和 Int64 类型递归的性能问题的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆