Memory footprint of Haskell data types

Question

How can I find the actual amount of memory required to store a value of some data type in Haskell (mostly with GHC)? Is it possible to evaluate it at runtime (e.g. in GHCi) or is it possible to estimate memory requirements of a compound data type from its components?

In general, if memory requirements of types a and b are known, what is the memory overhead of algebraic data types such as:

data Uno = Uno a
data Due = Due a b

For example, how many bytes in memory do these values occupy?

1 :: Int8
1 :: Integer
2^100 :: Integer
\x -> x + 1
(1 :: Int8, 2 :: Int8)
[1] :: [Int8]
Just (1 :: Int8)
Nothing

I understand that actual memory allocation is higher due to delayed garbage collection. It may be significantly different due to lazy evaluation (and thunk size is not related to the size of the value). The question is, given a data type, how much memory does its value take when fully evaluated?

I found there is a :set +s option in GHCi to see memory stats, but it is not clear how to estimate the memory footprint of a single value.

Solution

(The following applies to GHC; other compilers may use different storage conventions.)

Rule of thumb: a constructor costs one word for a header, and one word for each field. Exception: a constructor with no fields (like Nothing or True) takes no space, because GHC creates a single instance of these constructors and shares it amongst all uses.

A word is 4 bytes on a 32-bit machine, and 8 bytes on a 64-bit machine.

For example, given

data Uno = Uno a
data Due = Due a b

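a Uno takes 2 words (one header plus one field) and a Due takes 3 (one header plus two fields). These figures cover only the constructor itself, not the values its fields point to.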
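The rule of thumb can be written down as a small helper. This is only a sketch: constructorWords, constructorBytes and wordBytes are hypothetical names, not part of GHC, and wordBytes assumes GHC's convention that Int is one machine word wide.

```haskell
import Data.Bits (finiteBitSize)

-- Words occupied by one constructor application with the given number of
-- fields: one header word plus one word per field. Constructors without
-- fields are shared, so they are counted as zero.
constructorWords :: Int -> Int
constructorWords 0 = 0
constructorWords n = 1 + n

-- Bytes per machine word (assumption: GHC, where Int is one word wide).
wordBytes :: Int
wordBytes = finiteBitSize (0 :: Int) `div` 8

-- The same estimate expressed in bytes on the current machine.
constructorBytes :: Int -> Int
constructorBytes n = constructorWords n * wordBytes
```

In GHCi, constructorWords 1 gives 2 for Uno and constructorWords 2 gives 3 for Due, while constructorWords 0 gives 0 for a constructor such as Nothing.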

The Int type is defined as

data Int = I# Int#

Now, Int# takes one word, so Int takes 2 in total. Most unboxed types take one word, the exceptions being Int64#, Word64#, and Double# (on a 32-bit machine), which take 2. GHC actually has a cache of small values of type Int and Char, so in many cases these take no heap space at all. A String therefore only requires space for the list cells (3 words each: header, pointer to the Char, pointer to the rest of the list), unless you use Chars > 255.
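To apply these rules to nested values, one option is a toy model of a fully evaluated value. Everything below (Value, Con, Prim, sizeWords and the example names) is hypothetical and only mirrors the rules above; it is not a GHC API, and it counts a value twice if two fields share it.

```haskell
-- Hypothetical toy model of a fully evaluated value; not a GHC API.
data Value
  = Con [Value]  -- constructor whose fields point to other values
  | Prim Int     -- constructor wrapping unboxed data of this many words, e.g. I# Int#

-- One header word plus one word per field, plus whatever the fields point to.
-- A constructor without fields is shared and costs nothing.
sizeWords :: Value -> Int
sizeWords (Con [])     = 0
sizeWords (Con fields) = 1 + length fields + sum (map sizeWords fields)
sizeWords (Prim n)     = 1 + n

-- An Int that is not served from the small-value cache: I# plus one Int#.
boxedInt :: Value
boxedInt = Prim 1

-- (x, y) :: (Int, Int) with two separately allocated Ints.
pairOfInts :: Value
pairOfInts = Con [boxedInt, boxedInt]

-- Just x with a separately allocated Int.
justInt :: Value
justInt = Con [boxedInt]

-- A String of n cached characters: n cons cells ending in the shared [].
-- Cached Chars are modelled as Con [] because they cost no extra heap.
cachedString :: Int -> Value
cachedString n = foldr (\_ rest -> Con [Con [], rest]) (Con []) [1 .. n]
```

Under this model sizeWords pairOfInts is 7 (3 for the tuple plus 2 per Int), sizeWords justInt is 4, and sizeWords (cachedString 3) is 9. If the Ints come from the small-value cache, the pair shrinks to the 3 words of the tuple constructor.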

An Int8 has the same representation as an Int, so it also takes 2 words. Integer is defined like this:

data Integer
  = S# Int#                            -- small integers
  | J# Int# ByteArray#                 -- large integers

So a small Integer (S#) takes 2 words, but a large Integer (J#) takes a variable amount of space depending on its value: 3 words for the J# constructor (header plus two fields), plus the ByteArray# it points to, which takes 2 words (header + size) plus space for the array itself.
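The Integer estimate can be sketched as a function as well. It is a rough model under the S#/J# definition shown here, assuming the array payload is rounded up to whole words; integerWords and its wordBits parameter are hypothetical names, and other GHC versions may lay Integer out differently.

```haskell
-- Estimated words for a fully evaluated Integer under the S#/J# definition.
-- wordBits is the machine word width in bits (32 or 64).
integerWords :: Int -> Integer -> Int
integerWords wordBits n
  | fitsInWord = 2              -- S#: header + Int#
  | otherwise  = 3 + 2 + limbs  -- J# (header + Int# + pointer)
                                -- + ByteArray# (header + size) + payload
  where
    bound      = 2 ^ (wordBits - 1) :: Integer
    fitsInWord = n >= negate bound && n < bound
    -- Number of bits needed for the magnitude of n.
    bitLength  = length (takeWhile (> 0) (iterate (`div` 2) (abs n)))
    -- Payload size in whole words.
    limbs      = (bitLength + wordBits - 1) `div` wordBits
```

For example, integerWords 64 1 is 2, and integerWords 64 (2 ^ 100) is 7, since 2^100 needs 101 bits and therefore two 64-bit words of payload. That is 56 bytes on a 64-bit machine.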

Note that a constructor defined with newtype is free. newtype is purely a compile-time idea: it takes up no space and costs no instructions at run time.
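One way to see that a newtype shares its representation with the wrapped type is coerce from Data.Coerce, which converts between the two without touching the data. The Age type and the two helper functions below are hypothetical and exist only for illustration.

```haskell
import Data.Coerce (coerce)

-- Hypothetical newtype used only for illustration.
newtype Age = Age Int
  deriving (Eq, Show)

-- No traversal of the list is needed: Age and Int have the same run-time
-- representation, so coerce simply reinterprets the existing list.
toAges :: [Int] -> [Age]
toAges = coerce

fromAges :: [Age] -> [Int]
fromAges = coerce
```

So a value of type Age costs the same 2 words as the Int it wraps, with nothing added for the Age constructor.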

More details can be found in "The Layout of Heap Objects" in the GHC Commentary.
