How to find out GHC's memory representations of data types?
Problem description
Recently, blog entries such as "Computing the Size of a Hashmap" explained how to reason about the space complexity of commonly used container types. Now I'm facing the question of how to actually "see" which memory layout my GHC version chooses (depending on compile flags and target architecture) for weird data types (constructors) such as
data BitVec257 = BitVec257 {-# UNPACK #-} !Word64
                           {-# UNPACK #-} !Word64
                           {-# UNPACK #-} !Bool
                           {-# UNPACK #-} !Word64
                           {-# UNPACK #-} !Word64

data BitVec514 = BitVec514 {-# UNPACK #-} !BitVec257
                           {-# UNPACK #-} !BitVec257
In C there are the sizeof and offsetof operators, which allow me to "see" what size and alignment were chosen for the fields of a C struct.
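For comparison, the closest direct analogue of sizeof in Haskell is the Storable class from Foreign.Storable. The sketch below only illustrates that API: it reports the size and alignment of the marshalled (C-side) representation of a value, not the layout GHC picks for a constructor on the heap, so it does not by itself answer the question. The names sizeAndAlignment and word64Layout are made up for this illustration.

```haskell
import Data.Word (Word64)
import Foreign.Storable (Storable, alignment, sizeOf)

-- Size and alignment, in bytes, of the marshalled representation of a
-- Storable type. The argument is only a type proxy and is never evaluated.
sizeAndAlignment :: Storable a => a -> (Int, Int)
sizeAndAlignment x = (sizeOf x, alignment x)

-- Word64 is always 8 bytes wide; its alignment depends on the platform.
word64Layout :: (Int, Int)
word64Layout = sizeAndAlignment (undefined :: Word64)
```

Loading this in GHCi and evaluating word64Layout shows the pair for the current platform. BitVec257 has no Storable instance unless one is written by hand, which is why this approach does not extend to the constructors above.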
I've tried to look at GHC Core in the hope of finding some hint there, but I didn't know what to look for. Can somebody point me in the right direction?
My first idea was to use this neat little function, due to Simon Marlow (ghcmutterings.wordpress.com/2009/02/12/53/):
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module Size where

import GHC.Exts
import Foreign

unsafeSizeof :: a -> Int
unsafeSizeof a =
  case unpackClosure# a of
    (# x, ptrs, nptrs #) ->
      sizeOf (undefined::Int) + -- one word for the header
        I# (sizeofByteArray# (unsafeCoerce# ptrs)
             +# sizeofByteArray# nptrs)
Using it:
Prelude> :!ghc -c Size.hs

Size.hs:15:18:
    Warning: Ignoring unusable UNPACK pragma on the
             third argument of `BitVec257'
    In the definition of data constructor `BitVec257'
    In the data type declaration for `BitVec257'

Prelude Size> unsafeSizeof $! BitVec514 (BitVec257 1 2 True 3 4) (BitVec257 1 2 True 3 4)
74
(Note that GHC is telling you that it cannot unbox Bool since it's a sum type.)
The above function claims that your data type uses 74 bytes on a 64-bit machine. I find that hard to believe. I'd expect the data type to use 11 words = 88 bytes: one word for the header plus one word per field. Even Bools take one word, as they are pointers to (statically allocated) constructors. I'm not quite sure what's going on here.
As for alignment, I believe every field should be word-aligned.
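The arithmetic behind the two numbers can be written out as a small sketch. The expected figure follows directly from the text: one header word, two pointer fields (the Bools that could not be unpacked) and eight unpacked Word64 fields. The second function is only a guess at where 74 might come from, not something the source confirms: it assumes that sizeofByteArray# applied to the coerced pointer array returns the number of pointers rather than their size in bytes. All names below are made up for this illustration.

```haskell
-- Word size in bytes on the 64-bit machine mentioned in the text.
wordBytes :: Int
wordBytes = 8

-- Expected closure size: one header word plus one word per field.
expectedBytes :: Int -> Int -> Int
expectedBytes ptrFields nonPtrFields =
  wordBytes * (1 + ptrFields + nonPtrFields)

-- Hypothetical reading of the reported figure (an assumption): the pointer
-- fields are counted once each instead of once per byte, while the header
-- and the non-pointer fields are counted in bytes.
reportedBytes :: Int -> Int -> Int
reportedBytes ptrFields nonPtrFields =
  wordBytes + ptrFields + wordBytes * nonPtrFields
```

For BitVec514, with 2 pointer fields and 8 non-pointer fields, expectedBytes 2 8 evaluates to 88 and reportedBytes 2 8 evaluates to 74. The match with the observed output is suggestive, but it remains a conjecture about the helper function rather than a statement about GHC's layout.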