Getting GHC to produce "Add With Carry (ADC)" instructions

Problem description

Here is code that adds two triples of unboxed Words, each representing a 192-bit number, into a new triple of unboxed Words, and also returns any overflow:

{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnboxedTuples #-}

import GHC.Prim (plusWord2#, plusWord#, Word#)

longAdd :: 
  (# Word#, Word#, Word# #) -> 
  (# Word#, Word#, Word# #) -> 
  (# Word#, (# Word#, Word#, Word# #) #)

longAdd (# xl, xm, xh #) (# yl, ym, yh #) =     
  let
    plusWord3 x y c = 
      let 
        (# c1, r1 #) = plusWord2# x y
        (# c2, r2 #) = plusWord2# r1 c
      in
        (# plusWord# c1 c2, r2 #)
    (# cl, rl #) = plusWord2# xl yl
    (# cm, rm #) = plusWord3 xm ym cl
    (# ch, rh #) = plusWord3 xh yh cm     
  in
    (# ch, (# rl, rm, rh #) #)

The issue is the "plusWord3" definition. Ideally, this is just like an "adc" function, which takes two words and the carry bit and returns the result and a new carry, so the resulting assembly is like the following:

add x1 y1
adc x2 y2
adc x3 y3

Unfortunately GHC, whether using the native code generator or LLVM, produces ugly assembly code that saves the carry bit to a register and then reads it back via a separate extra add, instead of just using adc. I don't want to call an external C function to achieve this, because once you add the call overhead it's probably not worth it; I'd like to stay in Haskell so the code can be inlined where possible. But I also want to be able to coax the compiler into producing the adc instruction appropriately. Is there any way I can achieve that?

Solution

The most reliable and efficient way would be to call a primop directly in your program.

Using an FFI call is the easiest way, but as you also noted it won't be the most efficient one because of the FFI overhead.

Even if the compiler supported the instruction you want and used it in some programs, relying on that would be fragile. Some seemingly innocent changes in your program may end up producing different assembly that doesn't use the instruction you want.

So my proposal is:

  1. Add the instruction you need to the x86 code generator backend, if it isn't there already.
  2. Add a primop that translates directly to the instruction you want to run. First make sure no such primop exists. Then follow these steps: https://ghc.haskell.org/trac/ghc/wiki/AddingNewPrimitiveOperations
  3. Your primop should then be visible in GHC.Prim (http://hackage.haskell.org/package/ghc-prim/docs/GHC-Prim.html); use it in your programs.
  4. Add tests, submit your patch :)
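
On step 2's check for an existing primop: recent versions of GHC.Prim (re-exported by GHC.Exts) include addWordC#, which returns the sum together with a carry flag. The sketch below builds the question's three-word addition on top of it. The names adcWord#, longAddC and longAddBoxed are hypothetical helpers introduced only for illustration, and nothing here guarantees that the backend actually emits adc; whether it does depends on the GHC version and code generator, so the generated assembly would still need to be inspected.

```haskell
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnboxedTuples #-}

import GHC.Exts (Word (W#), Word#, addWordC#, int2Word#, orI#, (/=#))

-- Hypothetical helper: add two words plus an incoming carry (0 or 1).
-- Returns (# carryOut, sum #), the same order plusWord2# uses.
adcWord# :: Word# -> Word# -> Word# -> (# Word#, Word# #)
adcWord# x y cin =
  case addWordC# x y of
    (# s1, c1 #) -> case addWordC# s1 cin of
      -- At most one of the two additions can overflow, so combining the
      -- flags with "or" is enough; (/=# 0#) normalises the flag to 0 or 1.
      (# s2, c2 #) -> (# int2Word# (orI# c1 c2 /=# 0#), s2 #)

-- Hypothetical variant of the question's longAdd built on adcWord#.
longAddC ::
  (# Word#, Word#, Word# #) ->
  (# Word#, Word#, Word# #) ->
  (# Word#, (# Word#, Word#, Word# #) #)
longAddC (# xl, xm, xh #) (# yl, ym, yh #) =
  case adcWord# xl yl 0## of
    (# cl, rl #) -> case adcWord# xm ym cl of
      (# cm, rm #) -> case adcWord# xh yh cm of
        (# ch, rh #) -> (# ch, (# rl, rm, rh #) #)

-- Boxed wrapper (low, middle, high) so the function is easy to test.
longAddBoxed :: (Word, Word, Word) -> (Word, Word, Word) -> (Word, (Word, Word, Word))
longAddBoxed (W# xl, W# xm, W# xh) (W# yl, W# ym, W# yh) =
  case longAddC (# xl, xm, xh #) (# yl, ym, yh #) of
    (# c, (# rl, rm, rh #) #) -> (W# c, (W# rl, W# rm, W# rh))
```

For example, longAddBoxed (maxBound, 0, 0) (1, 0, 0) carries out of the low word into the middle one, while adding (1, 0, 0) to three maxBound words overflows all 192 bits and reports a final carry of 1. A boxed wrapper like this is also a convenient place to hang the tests mentioned in step 4.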
