Which operator (+ vs +=) should be used for performance? (In-place vs. not-in-place)
Question
Let's say I have these two snippets of code in Python:
1 --------------------------
import numpy as np
x = np.array([1,2,3,4])
y = x
x = x + np.array([1,1,1,1])
print(y)
2 --------------------------
import numpy as np
x = np.array([1,2,3,4])
y = x
x += np.array([1,1,1,1])
print(y)
I thought the result of y would be the same in both examples, since y points to x and x becomes (2,3,4,5), but it wasn't.
The results were (1,2,3,4) for snippet 1 and (2,3,4,5) for snippet 2.
After some research, I found out what happens in each example:
#-First example---------------------------------------
x = np.array([1,2,3,4]) # create x --> [1,2,3,4]
y = x # make y point to the same array as x
# at this point: x --> [1,2,3,4] <-- y
x = x + np.array([1,1,1,1])
# however, this operation **creates a new array** [2,3,4,5]
# and makes x point to it instead of the first one
# so we have y --> [1,2,3,4] and x --> [2,3,4,5]
#-Second example--------------------------------------
x = np.array([1,2,3,4]) # create x --> [1,2,3,4]
y = x # make y point to the same array as x
# at this point: x --> [1,2,3,4] <-- y
x += np.array([1,1,1,1])
# this operation **modifies the existing array**
# so x and y still point to the same array:
# x --> [2,3,4,5] <-- y
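The difference between rebinding and mutating can be checked directly with the `is` operator. The following is a minimal sketch of the two snippets above; the variable names `a`, `b` and the `*_same_object` flags are only illustrative and do not come from the original question.

```python
import numpy as np

# Snippet 1: `x = x + ...` builds a new array and rebinds the name x.
x = np.array([1, 2, 3, 4])
y = x
x = x + np.array([1, 1, 1, 1])
rebinding_same_object = x is y      # x now names a different array
y_after_rebinding = y.tolist()      # y still sees the original data

# Snippet 2: `x += ...` writes into the array that both names refer to.
a = np.array([1, 2, 3, 4])
b = a
a += np.array([1, 1, 1, 1])
inplace_same_object = a is b        # both names still refer to one array
b_after_inplace = b.tolist()        # b sees the updated data

print(rebinding_same_object, y_after_rebinding)  # False [1, 2, 3, 4]
print(inplace_same_object, b_after_inplace)      # True [2, 3, 4, 5]
```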
You can find out more about this behavior (not only for this example) at this link: In-place algorithm
My question is: being aware of this behavior, why should I use an in-place algorithm in terms of performance? (Faster execution time? Less memory allocation? ...)
Clarification
The (+, +=) example was only meant to explain in-place algorithms simply to those who don't know them. The question is about the use of in-place algorithms in general, not only in this case.
As another, more complex example: loading a CSV file (say, 10 million rows) into a variable and then sorting the result. Is the idea of an in-place algorithm to produce the output in the same memory space that contains the input, by successively transforming that data until the output is produced? This would avoid the need for twice the storage: one area for the input and an equal-sized area for the output (using the minimum amount of RAM, hard disk, ...).
Recommended answer
x = x + 1 vs x += 1
Performance
It seems that you understand the semantic difference between x += 1 and x = x + 1.
For benchmarking, you can use timeit in IPython.
After defining these functions:
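import numpy as np

def in_place(n):
    x = np.arange(n)
    x += 1  # modifies x's existing buffer

def not_in_place(n):
    x = np.arange(n)
    x = x + 1  # allocates a new array for the result

def in_place_no_broadcast(n):
    x = np.arange(n)
    # np.int was removed from recent NumPy releases; the builtin int is equivalent here
    x += np.ones(n, dtype=int)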
You can simply use the %timeit syntax to compare performance:
%timeit in_place(10**7)
20.3 ms ± 81.4 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit not_in_place(10**7)
30.4 ms ± 253 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit in_place_no_broadcast(10**7)
35.4 ms ± 101 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
not_in_place is about 50% slower than in_place.
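Outside IPython, a similar comparison can be sketched with the standard-library `timeit` module. This is only a rough sketch: the array size `n`, the repeat counts and the `results` dictionary are arbitrary choices made here, and the absolute numbers will differ from the ones quoted above depending on the machine and the NumPy version.

```python
import timeit

import numpy as np


def in_place(n):
    x = np.arange(n)
    x += 1


def not_in_place(n):
    x = np.arange(n)
    x = x + 1


n = 10**6  # smaller than in the answer, to keep the run short
results = {}
for func in (in_place, not_in_place):
    # Take the best of 5 repeats, each repeat timing 10 calls.
    total = min(timeit.repeat(lambda: func(n), number=10, repeat=5))
    results[func.__name__] = total / 10
    print(f"{func.__name__}: {results[func.__name__] * 1e3:.2f} ms per call")
```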
Note that broadcasting also makes a huge difference: NumPy understands x += 1 as adding 1 to every single element of x, without having to create yet another array.
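For NumPy arrays, the in-place form can also be written explicitly with the `out` argument of the ufunc. A small sketch, where the name `alias` is only illustrative:

```python
import numpy as np

x = np.arange(5)
alias = x                      # second name for the same array

# Explicit in-place addition: the result is written into x's own buffer.
# The scalar 1 is broadcast, so no temporary array of ones is built.
np.add(x, 1, out=x)

print(x)           # [1 2 3 4 5]
print(alias is x)  # True: no new array was created
```

The same `out=` pattern works for other ufuncs such as `np.multiply`, which can help when an expression is more complex than a single `+=`.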
in_place should be the preferred function: it's faster and uses less memory. You might run into bugs if you use and mutate this object in different places in your code, though. The typical example would be:
x = np.arange(5)
y = [x, x]
y[0][0] = 10
y
# [array([10, 1, 2, 3, 4]), array([10, 1, 2, 3, 4])]
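If independent entries are actually wanted, one way to avoid this aliasing is to store explicit copies, at the cost of the extra memory that the in-place approach was saving. A minimal sketch:

```python
import numpy as np

x = np.arange(5)
y = [x.copy(), x.copy()]   # each entry owns its own data
y[0][0] = 10

print(y[0])  # [10  1  2  3  4]
print(y[1])  # [0 1 2 3 4]
print(x)     # [0 1 2 3 4]
```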
Sorting
Your understanding of the advantages of in-place sorting is correct. It can make a huge difference in memory requirements when sorting large data sets.
There are other desirable features for a sorting algorithm (stability, acceptable worst-case complexity, ...), and it looks like the standard Python algorithm (Timsort) has many of them.
Timsort is a hybrid algorithm. Some parts of it are in-place and some require extra memory. It will never need temporary space for more than n/2 elements, though.
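In Python, the in-place and copying variants of sorting are exposed side by side, both for lists and for NumPy arrays. A short sketch with made-up data (the variable names are illustrative):

```python
import numpy as np

data = [5, 2, 9, 1]
sorted_copy = sorted(data)     # returns a new list, data is untouched
data_before_sort = list(data)  # snapshot for comparison
result = data.sort()           # sorts in place and returns None

arr = np.array([5, 2, 9, 1])
arr_copy = np.sort(arr)        # returns a sorted copy
arr_before_sort = arr.tolist()
arr.sort()                     # sorts the array's own data in place

print(sorted_copy, data_before_sort, result, data)
# [1, 2, 5, 9] [5, 2, 9, 1] None [1, 2, 5, 9]
print(arr_copy.tolist(), arr_before_sort, arr.tolist())
# [1, 2, 5, 9] [5, 2, 9, 1] [1, 2, 5, 9]
```

For a large data set, the copying variants hold the input and the output in memory at the same time, which is the doubling of storage described in the question.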