Python 3.5 vs. 3.6 what made "map" slower compared to comprehensions


Problem description


I sometimes used map if there was a function/method that was written in C, to get a bit of extra performance. However, I recently revisited some of my benchmarks and noticed that the relative performance (compared to a similar list comprehension) drastically changed between Python 3.5 and 3.6.

That's not the actual code but just a minimal sample that illustrates the difference:

import random

lst = [random.randint(0, 10) for _ in range(100000)]
assert list(map((5).__lt__, lst)) == [5 < i for i in lst]
%timeit list(map((5).__lt__, lst))
%timeit [5 < i for i in lst]

I realize that it's not a good idea to use (5).__lt__ but I couldn't come up with a useful example right now.

The timings on Python-3.5 were in favor of the map approach:

15.1 ms ± 5.64 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
16.7 ms ± 35.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

While the Python-3.6 timings actually show that the comprehension is faster:

17.9 ms ± 755 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
14.3 ms ± 128 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

My question is: what happened in this case that made the list comprehension faster and the map solution slower? I realize the difference isn't that much; it just made me curious because that was one of the "tricks" I sometimes (actually seldom) used in performance-critical code.

Solution

I think a fair comparison involves using the same function and same testing conditions in Python 3.5 and 3.6 as well as when comparing map to list comprehension in a chosen Python version.

In my initial answer I performed multiple tests that showed that map was still faster by a factor of about two in both versions of Python when compared to list comprehension. However, some results were not conclusive, so I performed some more tests.

First let me cite some of your points stated in the question:

"... [I] noticed that the relative performance [of map] (compared to a similar list comprehension) drastically changed between Python 3.5 and 3.6"

You also ask:

"My question is what happened in this case that made the list-comprehension faster and the map solution slower?"

It is not very clear whether you mean that map is slower than list comprehension in Python 3.6, or that map is slower in Python 3.6 than in 3.5 while list comprehension's performance has increased (albeit not necessarily to the level of beating map).

Based on more extensive tests that I have performed after my first answer to this question, I think I have an idea of what is going on.

However, first let's create conditions for "fair" comparisons. For this we need to:

  1. Compare performance of map in different Python versions using the same function;

  2. Compare performance of map to list comprehension in the same version using the same function;

  3. Run the tests on the same data;

  4. Minimize contribution from timing functions.

Here is version information about my system:

Python 3.5.3 |Continuum Analytics, Inc.| (default, Mar  6 2017, 12:15:08) 
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)] on darwin
IPython 5.3.0 -- An enhanced Interactive Python.

and

Python 3.6.2 |Continuum Analytics, Inc.| (default, Jul 20 2017, 13:14:59) 
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)] on darwin
IPython 6.1.0 -- An enhanced Interactive Python. Type '?' for help.

Let's first address the issue of "same data". Unfortunately, because you are effectively using seed(None), the data set lst is different on each of the two versions of Python. This probably contributes to the difference in performance seen between the two Python versions. One fix would be to set, e.g., random.seed(0) (or something like that). I chose to create the list once, save it using numpy.save(), and then load it in each version. This is especially important because I chose to modify your tests slightly (number of "loops" and "repeats") and I increased the length of your dataset to 100,000,000:

import numpy as np
import random
lst = [random.randint(0, 10) for _ in range(100000000)]
np.save('lst', lst, allow_pickle=False)
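
The random.seed(0) alternative mentioned above could look like the following minimal sketch. The helper name make_data is hypothetical and the list length is reduced for illustration. Note that the Python documentation only guarantees that random.random() reproduces the same sequence across versions for a given seed; whether randint does so between 3.5 and 3.6 is something to verify rather than assume, which is one more argument for saving the list to a file as done here.

```python
import random

def make_data(n, seed=0):
    """Return n pseudo-random integers in [0, 10] from a fixed seed."""
    rng = random.Random(seed)  # private generator; leaves the global state alone
    return [rng.randint(0, 10) for _ in range(n)]

lst_a = make_data(100000)
lst_b = make_data(100000)
print(lst_a == lst_b)  # same seed in the same interpreter: the lists match
```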

Second, let's use the timeit module instead of IPython's magic command %timeit. The reason for doing this comes from the following test performed in Python 3.5:

In [11]: f = (5).__lt__
In [12]: %timeit -n1 -r20 [f(i) for i in lst]
1 loop, best of 20: 9.01 s per loop

Compare this to the result of timeit in the same version of Python:

>>> import timeit
>>> import numpy as np
>>> t = timeit.repeat('[f(i) for i in lst]',
...     setup="f = (5).__lt__; import numpy; lst = numpy.load('lst.npy').tolist()",
...     repeat=20, number=1); print(min(t), max(t), np.mean(t), np.std(t))
7.442819457995938 7.703615028003696 7.5105415405 0.0550515642854

For reasons unknown to me, IPython's magic %timeit is adding some time compared to the timeit module. Therefore, I will use timeit exclusively in my testing.

NOTE: In the discussion that follows I will use only the minimum timing (min(t)).
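
The measurement pattern used in the tests below can be wrapped in a small helper. This is only a sketch: the name time_stmt is hypothetical, the data set is a small in-memory list (an assumption, so that the example runs in well under a second), and statistics.pstdev stands in for np.std, which also computes the population standard deviation by default.

```python
import statistics
import timeit

def time_stmt(stmt, setup="pass", repeat=5, number=1):
    """Run timeit.repeat and return (min, max, mean, std) of the timings."""
    t = timeit.repeat(stmt, setup=setup, repeat=repeat, number=number)
    return min(t), max(t), statistics.mean(t), statistics.pstdev(t)

# Small seeded data set built in the setup string (an assumption; the tests
# in this answer load 100,000,000 items from 'lst.npy' instead).
setup = ("import random; random.seed(0); f = (5).__lt__; "
         "lst = [random.randint(0, 10) for _ in range(100000)]")

for stmt in ("list(map(f, lst))", "[f(i) for i in lst]", "[5 < i for i in lst]"):
    print(stmt, time_stmt(stmt, setup=setup))
```

The timeit documentation itself suggests that the minimum of the repeated timings is usually the most informative figure, since higher values tend to reflect interference from other processes.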

Tests in Python 3.5.3:

Group 1: map and list comprehension tests

>>> import numpy as np
>>> import timeit

>>> t = timeit.repeat('list(map(f, lst))', setup="f = (5).__lt__; import numpy; lst = numpy.load('lst.npy').tolist()", repeat=20, number=1); print(min(t), max(t), np.mean(t), np.std(t))
4.666553302988177 4.811194089008495 4.72791638025 0.041115884397

>>> t = timeit.repeat('[f(i) for i in lst]', setup="f = (5).__lt__; import numpy; lst = numpy.load('lst.npy').tolist()", repeat=20, number=1); print(min(t), max(t), np.mean(t), np.std(t))
7.442819457995938 7.703615028003696 7.5105415405 0.0550515642854

>>> t = timeit.repeat('[5 < i for i in lst]', setup="import numpy; lst = numpy.load('lst.npy').tolist()", repeat=20, number=1); print(min(t), max(t), np.mean(t), np.std(t))
4.94656751700677 5.07807950800634 5.00670203845 0.0340474956945

>>> t = timeit.repeat('list(map(abs, lst))', setup="import numpy; lst = numpy.load('lst.npy').tolist()", repeat=20, number=1); print(min(t), max(t), np.mean(t), np.std(t))
4.167273573024431 4.320013975986512 4.2408865186 0.0378852782878

>>> t = timeit.repeat('[abs(i) for i in lst]', setup="import numpy; lst = numpy.load('lst.npy').tolist()", repeat=20, number=1); print(min(t), max(t), np.mean(t), np.std(t))
5.664627838006709 5.837686392012984 5.71560354655 0.0456700607748

Notice how the second test (list comprehension using f(i)) is significantly slower than the third test (list comprehension using 5 < i), indicating that f = (5).__lt__ is not identical (or almost identical) to 5 < i from the code perspective.
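
One way to see why the two spellings are not equivalent is to look at the bytecode they compile to. The sketch below is an illustration added here, not part of the original measurements; the helper name opnames is hypothetical, and the exact instruction names depend on the Python version.

```python
import dis

def opnames(expr):
    """Return the bytecode instruction names for a single expression."""
    code = compile(expr, "<expr>", "eval")
    return [ins.opname for ins in dis.get_instructions(code)]

compare_ops = opnames("5 < i")
call_ops = opnames("f(i)")

print(compare_ops)  # a comparison instruction, no call
print(call_ops)     # a call instruction (exact name varies by version)

# The two spellings also differ in behaviour for mixed types:
print((5).__lt__(1.5))  # NotImplemented: int.__lt__ does not handle float
print(5 < 1.5)          # False: the operator falls back to float's reflected method
```

The operator form is handled by a dedicated comparison instruction, while f(i) goes through the general function-call machinery, which is consistent with the extra cost seen in the timings.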

Group 2: "individual" function tests

>>> t = timeit.repeat('f(1)', setup="f = (5).__lt__", repeat=20, number=1000000); print(min(t), max(t), np.mean(t), np.std(t))
0.052280781004810706 0.05500587198184803 0.0531139718529 0.000877649561967

>>> t = timeit.repeat('5 < 1', repeat=20, number=1000000); print(min(t), max(t), np.mean(t), np.std(t))
0.030931947025237605 0.033691533986711875 0.0314959864045 0.000633274658428

>>> t = timeit.repeat('abs(1)', repeat=20, number=1000000); print(min(t), max(t), np.mean(t), np.std(t))
0.04685414198320359 0.05405496899038553 0.0483296330043 0.00162837880358

Notice again how the first test (of f(1)) is significantly slower than the second test (of 5 < 1), further supporting that f = (5).__lt__ is not identical (or almost identical) to 5 < i from the code perspective.

Tests in Python 3.6.2:

Group 1: map and list comprehension tests

>>> import numpy as np
>>> import timeit

>>> t = timeit.repeat('list(map(f, lst))', setup="f = (5).__lt__; import numpy; lst = numpy.load('lst.npy').tolist()", repeat=20, number=1); print(min(t), max(t), np.mean(t), np.std(t))
4.599696700985078 4.743880658003036 4.6631793691 0.0425774678203

>>> t = timeit.repeat('[f(i) for i in lst]', setup="f = (5).__lt__; import numpy; lst = numpy.load('lst.npy').tolist()", repeat=20, number=1); print(min(t), max(t), np.mean(t), np.std(t))
7.316072431014618 7.572676292009419 7.3837024617 0.0574811241553

>>> t = timeit.repeat('[5 < i for i in lst]', setup="import numpy; lst = numpy.load('lst.npy').tolist()", repeat=20, number=1); print(min(t), max(t), np.mean(t), np.std(t))
4.570452399988426 4.679144663008628 4.61264215875 0.0265541828693

>>> t = timeit.repeat('list(map(abs, lst))', setup="import numpy; lst = numpy.load('lst.npy').tolist()", repeat=20, number=1); print(min(t), max(t), np.mean(t), np.std(t))
2.742673939006636 2.8282236389932223 2.78504617405 0.0260357089928

>>> t = timeit.repeat('[abs(i) for i in lst]', setup="import numpy; lst = numpy.load('lst.npy').tolist()", repeat=20, number=1); print(min(t), max(t), np.mean(t), np.std(t))
6.2177103200228885 6.428813881997485 6.28722427145 0.0493010620999

Group 2: "individual" function tests

>>> t = timeit.repeat('f(1)', setup="f = (5).__lt__", repeat=20, number=1000000); print(min(t), max(t), np.mean(t), np.std(t))
0.051936342992121354 0.05764096099301241 0.0532974587506 0.00117079475737

>>> t = timeit.repeat('5 < 1', repeat=20, number=1000000); print(min(t), max(t), np.mean(t), np.std(t))
0.02675032999832183 0.032919151999522 0.0285137565021 0.00156522182488

>>> t = timeit.repeat('abs(1)', repeat=20, number=1000000); print(min(t), max(t), np.mean(t), np.std(t))
0.047831349016632885 0.0531779529992491 0.0482893927969 0.00112825297875

Notice again how the first test (of f(1)) is significantly slower than the second test (of 5 < 1), further supporting that f = (5).__lt__ is not identical (or almost identical) to 5 < i from the code perspective.

Discussion

I do not know how reliable these timing tests are, and it is also difficult to separate all the factors that contribute to these timing results. However, we can notice from the "Group 2" tests that the only "individual" test that significantly changed its timing is the test of 5 < 1: it went down to 0.0268s in Python 3.6 from 0.0309s in Python 3.5. This makes the list comprehension test that uses 5 < i run faster in Python 3.6 than the similar test in Python 3.5. However, this does not mean that list comprehension became faster in Python 3.6.
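
The version-to-version change of each "Group 2" test can be worked out from the minimum timings reported above. The snippet is just this arithmetic written out; the names group2_min and percent_change are chosen for illustration.

```python
# Minimum timings in seconds, copied from the Group 2 outputs above.
group2_min = {
    "f(1)":   {"3.5": 0.052280781004810706, "3.6": 0.051936342992121354},
    "5 < 1":  {"3.5": 0.030931947025237605, "3.6": 0.02675032999832183},
    "abs(1)": {"3.5": 0.04685414198320359,  "3.6": 0.047831349016632885},
}

def percent_change(old, new):
    """Relative change from old to new, in percent (negative means faster)."""
    return 100.0 * (new - old) / old

changes = {name: percent_change(t["3.5"], t["3.6"])
           for name, t in group2_min.items()}

for name, pct in changes.items():
    print("{:7s} {:+.1f}%".format(name, pct))
```

With these inputs, f(1) and abs(1) move by about a percent or two in either direction, while 5 < 1 drops by roughly 13.5%.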

Let's compare the performance of map relative to list comprehension for the same function in the same Python version, taking r as the list comprehension time divided by the map time. In Python 3.5 we get r(f) = 7.4428/4.6666 = 1.595 and r(abs) = 5.665/4.167 = 1.359; in Python 3.6 we get r(f) = 7.316/4.5997 = 1.591 and r(abs) = 6.218/2.743 = 2.267. Based on these ratios, the performance of map relative to list comprehension in Python 3.6 is essentially the same as in Python 3.5 for the f = (5).__lt__ function, and the ratio has even improved for a function such as abs().
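
The ratios quoted here can be reproduced directly from the "Group 1" minimum timings. The names group1_min and ratios are chosen for illustration, and r is the list comprehension time divided by the map time, so a larger value means map is comparatively faster.

```python
# Minimum timings in seconds, copied from the Group 1 outputs above.
group1_min = {
    ("3.5", "f"):   {"map": 4.666553302988177, "listcomp": 7.442819457995938},
    ("3.5", "abs"): {"map": 4.167273573024431, "listcomp": 5.664627838006709},
    ("3.6", "f"):   {"map": 4.599696700985078, "listcomp": 7.316072431014618},
    ("3.6", "abs"): {"map": 2.742673939006636, "listcomp": 6.2177103200228885},
}

# r > 1 means the list comprehension took longer than map.
ratios = {key: t["listcomp"] / t["map"] for key, t in group1_min.items()}

for (version, func), r in sorted(ratios.items()):
    print("Python {}: r({}) = {:.3f}".format(version, func, r))
```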

In any case, I believe there is no evidence that list comprehension became faster in Python 3.6, in either the relative or the absolute sense. The only performance improvement is for the [5 < i for i in lst] test, but that is because 5 < i itself became faster in Python 3.6, not because the list comprehension itself is faster.
