Why are operations with dtype np.int64 much slower than the same operations with np.int16?


Question

Here is what I mean - a is a vector of 1,000,000 np.int64 elements, b is a vector of 1,000,000 np.int16 elements:

In [19]: a = np.random.randint(100, size=(10**6), dtype="int64")

In [20]: b = np.random.randint(100, size=(10**6), dtype="int16")

不同操作的时间:

In [23]: %timeit a + 1
4.48 ms ± 253 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [24]: %timeit b + 1
1.37 ms ± 14.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [25]: %timeit a / 10
5.77 ms ± 31.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [26]: %timeit b / 10
6.09 ms ± 70.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [27]: %timeit a * 10
4.52 ms ± 198 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [28]: %timeit b * 10
1.52 ms ± 12.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

I can understand such a difference when NumPy has to create a new temporary result in memory - the underlying C code has to copy/fill much more data in memory.

But I can't understand such a difference for assigning values in place, like the following:

In [21]: %timeit a[::2] = 111
409 µs ± 19 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [22]: %timeit b[::2] = 111
203 µs ± 112 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Do you have any idea why it is slower even for those operations where NumPy doesn't have to create a copy/view?

Answer

Reading from memory costs something. Writing to memory costs something. You're reading four times as much data in and writing four times as much data out, and the arithmetic itself is so much faster than the reads/writes that the operation is effectively I/O bound. CPUs are simply faster than memory (and the speed ratio has been getting more extreme over time), so if you're doing memory-intensive work, smaller variables run faster.
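This memory-bound explanation can be sanity-checked with a plain timeit comparison (a sketch, assuming the same 10**6-element arrays as above; absolute numbers will vary by machine):

```python
import timeit
import numpy as np

a = np.random.randint(100, size=10**6, dtype="int64")
b = np.random.randint(100, size=10**6, dtype="int16")

# Same arithmetic either way, but the int64 version reads and writes
# roughly 8 MB per pass while the int16 version moves only about 2 MB.
t64 = timeit.timeit(lambda: a + 1, number=100)
t16 = timeit.timeit(lambda: b + 1, number=100)
print(f"int64: {t64:.3f}s  int16: {t16:.3f}s")
```

On a typical machine the int16 timing comes out several times lower, consistent with the 4x difference in bytes moved.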
