Converting IEEE 754 floating point in Haskell Word32/64 to and from Haskell Float/Double


Question

In Haskell, the base libraries and Hackage packages provide several means of converting binary IEEE-754 floating point data to and from the lifted Float and Double types. However, the accuracy, performance, and portability of these methods are unclear.

For a GHC-targeted library intended to (de)serialize a binary format across platforms, what is the best approach for handling IEEE-754 floating point data?

Approaches

These are the methods I've encountered in existing libraries and online resources.

FFI Marshaling

This is the approach used by the data-binary-ieee754 package. Since Float, Double, Word32 and Word64 are each instances of Storable, one can poke a value of the source type into an external buffer, and then peek a value of the target type:

toFloat :: (F.Storable word, F.Storable float) => word -> float
toFloat word = F.unsafePerformIO $ F.alloca $ \buf -> do
    F.poke (F.castPtr buf) word
    F.peek buf

On my machine this works, but I cringe to see allocation being performed just to accomplish the coercion. Also, although not unique to this solution, there's an implicit assumption here that IEEE-754 is actually the in-memory representation. The tests accompanying the package give it the "works on my machine" seal of approval, but this is not ideal.
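For reference, here is a minimal sketch of the same poke/peek idea written against current module names in base. The helper `reinterpret` and the wrapper names are hypothetical and not taken from data-binary-ieee754. The sketch only exposes monomorphic wrappers, because the temporary buffer is sized for the target type and a polymorphic version would write past it if the source type were larger. It still assumes that the in-memory representation is IEEE-754.

```haskell
import Data.Word (Word32, Word64)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (castPtr)
import Foreign.Storable (Storable, peek, poke)
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical helper: write a value into a temporary buffer sized for the
-- target type, then read the same bytes back as the target type.
-- Only safe when both types have the same size, hence the wrappers below.
reinterpret :: (Storable a, Storable b) => a -> b
reinterpret x = unsafePerformIO $ alloca $ \buf -> do
    poke (castPtr buf) x
    peek buf

word32ToFloat :: Word32 -> Float
word32ToFloat = reinterpret

floatToWord32 :: Float -> Word32
floatToWord32 = reinterpret

word64ToDouble :: Word64 -> Double
word64ToDouble = reinterpret

doubleToWord64 :: Double -> Word64
doubleToWord64 = reinterpret
```

Loaded in GHCi, `word32ToFloat 0x3F800000` should evaluate to `1.0` on a platform where `Float` is stored as an IEEE-754 single.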

unsafeCoerce

With the same implicit assumption of in-memory IEEE-754 representation, the following code gets the "works on my machine" seal as well:

toFloat :: Word32 -> Float
toFloat = unsafeCoerce

This has the benefit of not performing explicit allocation like the approach above, but the documentation says "it is your responsibility to ensure that the old and new types have identical internal representations". That implicit assumption is still doing all the work, and is even more tenuous when dealing with lifted types.

unsafeCoerce#

Stretching the limits of what might be considered "portable":

toFloat :: Word -> Float
toFloat (W# w) = F# (unsafeCoerce# w)

This seems to work, but doesn't seem practical at all since it's limited to the GHC.Exts types. It's nice to bypass the lifted types, but that's about all that can be said.

encodeFloat and decodeFloat

This approach has the nice property of bypassing anything with unsafe in the name, but doesn't seem to get IEEE-754 quite right. A previous SO answer to a similar question offers a concise approach, and the ieee754-parser package used a more general approach before being deprecated in favor of data-binary-ieee754.

There's quite a bit of appeal to having code that needs no implicit assumptions about underlying representation, but these solutions rely on encodeFloat and decodeFloat, which are apparently fraught with inconsistencies. I've not yet found a way around these problems.
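To make the trade-off concrete, here is a rough sketch of decoding a single-precision value from its bit pattern with encodeFloat. It is not the code from the linked answer or from ieee754-parser; the function name is hypothetical. It shows where the special cases come from: infinities, NaNs and subnormals each need their own branch, NaN payloads are discarded, and the sign of zero depends on how negate treats 0.0 on the target.

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Word (Word32)

-- Hypothetical decoder: interpret a Word32 as an IEEE-754 binary32 value
-- without relying on the in-memory representation of Float.
word32ToFloatPortable :: Word32 -> Float
word32ToFloatPortable w
  | e == 0xFF && m /= 0 = 0 / 0                                   -- NaN (payload is lost)
  | e == 0xFF           = sign (1 / 0)                            -- +/- infinity
  | e == 0              = sign (encodeFloat m (-149))             -- zero and subnormals
  | otherwise           = sign (encodeFloat (m + 0x800000) (e - 150))  -- normal numbers
  where
    negative = w `shiftR` 31 == 1
    e = fromIntegral ((w `shiftR` 23) .&. 0xFF) :: Int   -- biased exponent
    m = fromIntegral (w .&. 0x7FFFFF) :: Integer          -- 23-bit fraction
    sign x = if negative then negate x else x
```

For normal numbers the value is (2^23 + m) * 2^(e - 150), which is the usual (1 + m/2^23) * 2^(e - 127) with the fraction scaled to an integer. The result is only bit-exact when the target's Float really is a binary32, which is the same assumption the other approaches make implicitly.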

Solution

Simon Marlow mentions another approach in GHC bug 2209 (also linked to from Bryan O'Sullivan's answer):

"You can achieve the desired effect using castSTUArray, incidentally (this is the way we do it in GHC)."

I've used this option in some of my libraries in order to avoid the unsafePerformIO required for the FFI marshalling method.

{-# LANGUAGE FlexibleContexts #-}

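import Control.Monad.ST (runST, ST)
import Data.Array.ST (newArray, readArray, MArray, STUArray)
-- Older versions of the array package exported castSTUArray from Data.Array.ST.
import Data.Array.Unsafe (castSTUArray)
import Data.Word (Word32, Word64)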

wordToFloat :: Word32 -> Float
wordToFloat x = runST (cast x)

floatToWord :: Float -> Word32
floatToWord x = runST (cast x)

wordToDouble :: Word64 -> Double
wordToDouble x = runST (cast x)

doubleToWord :: Double -> Word64
doubleToWord x = runST (cast x)

{-# INLINE cast #-}
cast :: (MArray (STUArray s) a (ST s),
         MArray (STUArray s) b (ST s)) => a -> ST s b
cast x = newArray (0 :: Int, 0) x >>= castSTUArray >>= flip readArray 0

I inlined the cast function because doing so causes GHC to generate much tighter core. After inlining, wordToFloat is translated to a call to runSTRep and three primops (newByteArray#, writeWord32Array#, readFloatArray#).
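As an illustration of how these conversions might slot into a (de)serializer, the sketch below turns a Float into four big-endian bytes and back. The helpers `floatToBytesBE` and `bytesToFloatBE` are hypothetical names invented for this example, and castSTUArray is imported from Data.Array.Unsafe, which is where recent versions of the array package export it. Byte order is handled with shifts on the Word32 value, so it does not depend on the host's endianness. The assumption that Float is stored as an IEEE-754 single remains.

```haskell
{-# LANGUAGE FlexibleContexts #-}

import Control.Monad.ST (ST, runST)
import Data.Array.ST (MArray, STUArray, newArray, readArray)
import Data.Array.Unsafe (castSTUArray)
import Data.Bits (shiftL, shiftR, (.|.))
import Data.Word (Word32, Word8)

-- Store one element in an unboxed array, then read it back at another type.
{-# INLINE cast #-}
cast :: (MArray (STUArray s) a (ST s),
         MArray (STUArray s) b (ST s)) => a -> ST s b
cast x = newArray (0 :: Int, 0) x >>= castSTUArray >>= flip readArray 0

floatToWord :: Float -> Word32
floatToWord x = runST (cast x)

wordToFloat :: Word32 -> Float
wordToFloat x = runST (cast x)

-- Hypothetical helper: encode a Float as four bytes, most significant first.
floatToBytesBE :: Float -> [Word8]
floatToBytesBE x = [fromIntegral (w `shiftR` s) | s <- [24, 16, 8, 0]]
  where
    w = floatToWord x

-- Hypothetical helper: decode exactly four big-endian bytes.
-- Any other input length yields Nothing.
bytesToFloatBE :: [Word8] -> Maybe Float
bytesToFloatBE bs@[_, _, _, _] =
    Just (wordToFloat (foldl (\acc b -> (acc `shiftL` 8) .|. fromIntegral b) 0 bs))
bytesToFloatBE _ = Nothing
```

A real library would more likely write into a ByteString builder than a list; the list keeps the sketch free of extra dependencies.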

I'm not sure what performance is like compared to the FFI marshalling method, but just for fun I compared the core generated by both options.

Doing FFI marshalling is a fair bit more complicated in this regard. It calls unsafeDupablePerformIO and 7 primops (noDuplicate#, newAlignedPinnedByteArray#, unsafeFreezeByteArray#, byteArrayContents#, writeWord32OffAddr#, readFloatOffAddr#, touch#).

I've only just started learning how to analyse core; perhaps someone with more experience can comment on the cost of these operations?
