How to find out GHC's memory representations of data types?


Problem description


Recently, blog entries such as Computing the Size of a Hashmap have explained how to reason about the space complexity of commonly used container types. Now I'm facing the question of how to actually "see" which memory layout my GHC version chooses (depending on compile flags and target architecture) for unusual data types (constructors) such as:

data BitVec257 = BitVec257 {-# UNPACK #-} !Word64
                           {-# UNPACK #-} !Word64
                           {-# UNPACK #-} !Bool
                           {-# UNPACK #-} !Word64
                           {-# UNPACK #-} !Word64

data BitVec514 = BitVec514 {-# UNPACK #-} !BitVec257
                           {-# UNPACK #-} !BitVec257

In C there are the sizeof and offsetof operators, which let me "see" what size and alignment were chosen for the fields of a C struct.
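
For comparison, the closest thing the standard libraries offer to sizeof is Foreign.Storable, sketched below. Note the important caveat: sizeOf and alignment describe the marshalled (C-side) representation of a type, not the layout of the GHC heap object, so this does not answer the question for BitVec257. The names sizeAndAlign and word64Layout are made up for this illustration.

```haskell
import Data.Word (Word64)
import Foreign.Storable (Storable, alignment, sizeOf)

-- | (size, alignment) in bytes of a type's marshalled representation.
-- The argument only selects the type; the Storable instances for the
-- standard numeric types never evaluate it.
sizeAndAlign :: Storable a => a -> (Int, Int)
sizeAndAlign x = (sizeOf x, alignment x)

-- | Layout of Word64 as seen by the FFI (hypothetical helper name).
word64Layout :: (Int, Int)
word64Layout = sizeAndAlign (undefined :: Word64)
```

Loading this into GHCi and evaluating word64Layout shows the size and alignment used when a Word64 is poked into foreign memory; it says nothing about how a constructor holding a Word64 is laid out on the heap.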

I've tried looking at GHC Core in the hope of finding some hint there, but I didn't know what to look for. Can somebody point me in the right direction?

Solution

My first idea was to use this neat little function, due to Simon Marlow:

{-# LANGUAGE MagicHash,UnboxedTuples #-}
module Size where

import GHC.Exts
import Foreign

unsafeSizeof :: a -> Int
unsafeSizeof a =
  case unpackClosure# a of
    (# x, ptrs, nptrs #) ->
      sizeOf (undefined::Int) + -- one word for the header
        I# (sizeofByteArray# (unsafeCoerce# ptrs)
             +# sizeofByteArray# nptrs)

Using it:

Prelude> :!ghc -c Size.hs

Size.hs:15:18:
    Warning: Ignoring unusable UNPACK pragma on the
             third argument of `BitVec257'
    In the definition of data constructor `BitVec257'
    In the data type declaration for `BitVec257'
Prelude Size> unsafeSizeof $! BitVec514 (BitVec257 1 2 True 3 4) (BitVec257 1 2 True 3 4)
74

(Note that GHC is telling you that it cannot unbox Bool since it's a sum type.)

The above function claims that your data type uses 74 bytes on a 64-bit machine. I find that hard to believe. I'd expect the data type to use 11 words = 88 bytes: one header word plus one word for each of the ten fields (five per unpacked BitVec257). Even Bools take one word, as they are pointers to (statically allocated) constructors. I'm not quite sure what's going on here.
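
The 88-byte estimate can be written out as arithmetic. The sketch below assumes a non-profiled build (a one-word header), that both BitVec257 values are unpacked into BitVec514, and that each field, including the pointer to Bool, occupies exactly one word. All names are invented for this illustration.

```haskell
import Foreign.Storable (sizeOf)

-- | Bytes in one machine word on the current platform (8 on a 64-bit machine).
wordBytes :: Int
wordBytes = sizeOf (undefined :: Int)

-- | Words taken by a constructor closure: one header word plus one word
-- per field. Assumption: non-profiled build, no field wider than a word.
constructorWords :: Int -> Int
constructorWords fields = 1 + fields

-- | BitVec514 holds two unpacked BitVec257 values of five fields each.
expectedBitVec514Words :: Int
expectedBitVec514Words = constructorWords (2 * 5)

-- | The same estimate in bytes for the current platform.
expectedBitVec514Bytes :: Int
expectedBitVec514Bytes = expectedBitVec514Words * wordBytes
```

On a 64-bit machine this gives 11 words, i.e. 88 bytes, which is the figure the reported 74 should be compared against.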

As for alignment, I believe every field should be word-aligned.
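
One way to cross-check a suspicious figure, assuming GHC 8.10 or later, is the closureSize# primop exported from GHC.Exts, which reports the size of a single closure in machine words (header included, pointers not followed). This is a minimal sketch, not the method used above; Pair, closureWords and closureBytes are names made up for the example.

```haskell
{-# LANGUAGE MagicHash #-}

import Data.Word (Word64)
import Foreign.Storable (sizeOf)
import GHC.Exts (Int (I#), closureSize#)

-- | Hypothetical example type with two strict fields.
data Pair = Pair !Word64 !Word64

-- | Size of the top-level closure in words. The value is forced to weak
-- head normal form first, so that the constructor is measured rather
-- than an unevaluated thunk.
closureWords :: a -> Int
closureWords x = x `seq` I# (closureSize# x)

-- | The same size in bytes for the current platform.
closureBytes :: a -> Int
closureBytes x = closureWords x * sizeOf (undefined :: Int)
```

In a non-profiled build one would expect closureWords (Pair 1 2) to report a header word plus two field words. Applying the same function to a BitVec514 value gives a number to compare with the 11-word estimate.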
