新类型比枚举更快吗? [英] Are newtypes faster than enumerations?

查看:96
本文介绍了新类型比枚举更快吗?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

根据这篇文章


就GHC而言,枚举不算作单一构造函数类型,因此在作为严格构造函数字段或严格函数参数使用时,它们不会受益于解包。这是GHC的一个缺陷,但它可以解决。

建议使用newtypes。然而,我无法使用以下代码来验证它:

  { - #LANGUAGE MagicHash,BangPatterns# - } 
{ - #OPTIONS_GHC -O2 -funbox-strict-fields -rtsopts -fllvm -optlc -x86-asm-syntax = intel# - }
模块Main(main,f,g)
其中
import GHC.Base
import Criterion.Main

data D = A | B | C
newtype E = E Int推导(Eq)

f :: D - > Int#
f z | z`seq` False = 3422#
f z =
的情况z A - > 1234#
B - > 5678#
C - > 9012#

g :: E - > Int#
g z | z`seq` False = 7432#
g z =
(E 0) - > 2345#
(E 1) - > 6789#
(E 2) - > 3535#

f'x = I#(fx)
g'x = I#(gx)

main :: IO()
main = defaultMain [bench'f'(whnf f'A)
,bench'g'(whnf g'(E 0))
]
pre>

查看程序集,枚举D的每个构造函数的标记实际上是解压缩的,并直接在指令中进行硬编码。此外,函数 f 缺少错误处理代码,比 g 快10%以上。在更现实的情况下,我将枚举转换为新类型后也经历了放缓。任何人都可以给我一些关于此的见解吗?感谢。

解决方案

这取决于用例。对于你有的功能,预计枚举性能会更好。基本上, D 的三个构造函数变成了 Int s和resp。 Int#当严格分析允许时,GHC知道它静态检查了参数只能有三个值之一 0#,1 #,2#,所以它不需要为 f 插入错误处理代码。对于 E ,没有给出三个值中只有一个可能的静态保证,所以它需要为 g ,这会显着降低速度。如果您更改 g 的定义,以便最后一个案例变为

  E _  - > 3535#

差异完全消失或几乎完全消失(我获得1% - 2%更好的基准 f 仍然存在,但我没有做足够多的测试来确定这是真正的差异还是基准测试工具。)

但这不是维基页面讨论的用例。它所谈论的是在类型是其他数据的组件时将构造函数解压缩到其他构造函数中,例如

  data FooD = FD !D!D!D 

data FooE = FE!E!E!E



然后,如果使用 -funbox-strict-fields 进行编译,则可以将三个 Int# s解压缩到 FooE 的构造函数,所以你基本上可以得到相当于

  struct FooE {
long x,y,z;
};

FooD 的字段有多个-constructor type D 并且不能被解压到构造函数 FD (1)中,所以这将基本上给你

  struct FooD {
long * px,* py,* pz;
}

这显然可以产生重大影响。

我不确定单构造函数参数的情况。这对于包含数据的类型(比如元组)来说有明显的优势,但我不明白这是如何应用于普通枚举的,在这些普通枚举中只有 case worker和wrapper是没有意义的(对我来说)。

无论如何,worker / wrapper转换并不是单个构造函数,构造函数专门化可以给与具有较少构造函数的类型相同的好处。 (创建多少构造函数将取决于 -fspec-constr-count 的值。)






(1)这可能已经改变,但我怀疑它。我还没有检查过,所以有可能页面已过期。


According to this article,

Enumerations don't count as single-constructor types as far as GHC is concerned, so they don't benefit from unpacking when used as strict constructor fields, or strict function arguments. This is a deficiency in GHC, but it can be worked around.

And instead the use of newtypes is recommended. However, I cannot verify this with the following code:

{-# LANGUAGE MagicHash,BangPatterns #-}
{-# OPTIONS_GHC  -O2 -funbox-strict-fields -rtsopts -fllvm -optlc --x86-asm-syntax=intel #-}
module Main(main,f,g)
where       
import GHC.Base  
import Criterion.Main

data D = A | B | C
newtype E = E Int deriving(Eq)

f :: D -> Int#
f z | z `seq` False = 3422#
f z = case z of
  A -> 1234#
  B -> 5678#
  C -> 9012#

g :: E -> Int#
g z | z `seq` False = 7432#
g z = case z of
  (E 0) -> 2345#
  (E 1) -> 6789#
  (E 2) -> 3535#

f' x = I# (f x)
g' x = I# (g x)

main :: IO ()
main = defaultMain [ bench "f" (whnf f' A) 
                   , bench "g" (whnf g' (E 0)) 
                   ]

Looking at the assembly, the tags for each constructor of the enumeration D is actually unpacked and directly hard-coded in the instruction. Furthermore, the function f lacks error-handling code, and more than 10% faster than g. In a more realistic case I have also experienced a slowdown after converting a enumeration to a newtype. Can anyone give me some insight about this? Thanks.

解决方案

It depends on the use case. For the functions you have, it's expected that the enumeration performs better. Basically, the three constructors of D become Ints resp. Int#s when the strictness analysis allows that, and GHC knows it's statically checked that the argument can only have one of the three values 0#, 1#, 2#, so it needs not insert error handling code for f. For E, the static guarantee of only one of three values being possible isn't given, so it needs to add error handling code for g, that slows things down significantly. If you change the definition of g so that the last case becomes

E _ -> 3535#

the difference vanishes completely or almost completely (I get a 1% - 2% better benchmark for f still, but I haven't done enough testing to be sure whether that's a real difference or an artifact of benchmarking).

But this is not the use case the wiki page is talking about. What it's talking about is unpacking the constructors into other constructors when the type is a component of other data, e.g.

data FooD = FD !D !D !D

data FooE = FE !E !E !E

Then, if compiled with -funbox-strict-fields, the three Int#s can be unpacked into the constructor of FooE, so you'd basically get the equivalent of

struct FooE {
    long x, y, z;
};

while the fields of FooD have the multi-constructor type D and cannot be unpacked into the constructor FD(1), so that would basically give you

struct FooD {
    long *px, *py, *pz;
}

That can obviously have significant impact.

I'm not sure about the case of single-constructor function arguments. That has obvious advantages for types with contained data, like tuples, but I don't see how that would apply to plain enumerations, where you just have a case and splitting off a worker and a wrapper makes no sense (to me).

Anyway, the worker/wrapper transformation isn't so much a single-constructor thing, constructor specialisation can give the same benefit to types with few constructors. (For how many constructors specialisations would be created depends on the value of -fspec-constr-count.)


(1) That might have changed, but I doubt it. I haven't checked it though, so it's possible the page is out of date.

这篇关于新类型比枚举更快吗?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆