Why is hash() slower under Python 3.4 vs Python 2.7


Question

I was doing some performance evaluation using timeit and discovered a performance degradation between Python 2.7.10 and Python 3.4.3. I narrowed it down to the hash() function:

python 2.7.10:

>>> import timeit
>>> timeit.timeit('for x in xrange(100): hash(x)', number=100000)
0.4529099464416504
>>> timeit.timeit('hash(1000)')
0.044638872146606445

python 3.4.3:

>>> import timeit
>>> timeit.timeit('for x in range(100): hash(x)', number=100000)
0.6459149940637872
>>> timeit.timeit('hash(1000)')
0.07708719989750534

That's roughly a 40% degradation! It doesn't seem to matter whether integers, floats, strings (unicodes or bytearrays), etc. are being hashed; the degradation is about the same. In both cases hash() returns a 64-bit integer. The above was run on my Mac; on an Ubuntu box the degradation was smaller (about 20%).

I've also used PYTHONHASHSEED=random for the Python 2.7 tests, in some cases restarting Python for each "case". hash() performance got a bit worse, but it was never as slow as under Python 3.4.

Does anyone know what's going on here? Was a more secure, but slower, hash function chosen for Python 3?

Solution

There are two changes to the hash() function between Python 2.7 and Python 3.4:

  1. Adoption of SipHash (PEP 456, new in Python 3.4) as the hash algorithm for str and bytes
  2. Hash randomization enabled by default (since Python 3.3)
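To check which of these two changes applies to a given interpreter, the settings can be read from the `sys` module. The following is only a minimal sketch for Python 3: `describe_hash_config` and `time_hash` are hypothetical helper names, not part of any library, and `getattr` is used because `sys.hash_info.algorithm` and `sys.hash_info.hash_bits` only exist from Python 3.4 onwards.

```python
import sys
import timeit


def describe_hash_config():
    """Return the interpreter's hash settings as a plain dict (hypothetical helper)."""
    info = sys.hash_info
    return {
        # Algorithm used for str/bytes hashing; attribute exists from 3.4 on.
        "algorithm": getattr(info, "algorithm", "unknown"),
        # Internal output size of that algorithm; also 3.4+.
        "hash_bits": getattr(info, "hash_bits", None),
        # Width in bits of the values returned by hash().
        "width": info.width,
        # True when hash randomization is active for this process.
        "randomized": bool(sys.flags.hash_randomization),
    }


def time_hash(stmt, number=100000):
    """Time a statement with timeit, as in the question (hypothetical helper)."""
    return timeit.timeit(stmt, number=number)


if __name__ == "__main__":
    print(describe_hash_config())
    # Same statements as in the question; absolute numbers depend on the machine.
    print(time_hash("for x in range(100): hash(x)"))
    print(time_hash("hash(1000)", number=1000000))
```

Running the script under each interpreter of interest, with and without PYTHONHASHSEED set, shows the configuration next to the timings, which makes it easier to see whether a slowdown coincides with a different algorithm or with randomization.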


