为什么对小尺寸数据多次调用numpy.linalg.norm速度较慢? [英] why is numpy.linalg.norm slow when called many times for small size data?
问题描述
import numpy as np
from datetime import datetime
import math
def norm(l):
s = 0
for i in l:
s += i**2
return math.sqrt(s)
def foo(a, b, f):
l = range(a)
s = datetime.now()
for i in range(b):
f(l)
e = datetime.now()
return e-s
foo(10**4, 10**5, norm)
foo(10**4, 10**5, np.linalg.norm)
foo(10**2, 10**7, norm)
foo(10**2, 10**7, np.linalg.norm)
我得到以下输出:
0:00:43.156278
0:00:23.923239
0:00:44.184835
0:01:00.343875
似乎对于小型数据多次调用np.linalg.norm
时,它的运行速度比我的norm函数慢.
这是什么原因?
首先:datetime.now()
不适合用来衡量性能,它包含了空闲时间,您可能会选择不好的时间(对于您的计算机) ),当高优先级进程运行或Pythons GC启动时,...
Python提供了专用的计时功能/模块:内置的 timeit
模块或IPython中的 %timeit
/Jupyter和其他几个外部模块(例如 perf
,...)>
让我们看看如果我在您的数据上使用这些数据会发生什么情况
import numpy as np
import math
def norm(l):
s = 0
for i in l:
s += i**2
return math.sqrt(s)
r1 = range(10**4)
r2 = range(10**2)
%timeit norm(r1)
3.34 ms ± 150 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit np.linalg.norm(r1)
1.05 ms ± 3.92 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit norm(r2)
30.8 µs ± 1.53 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit np.linalg.norm(r2)
14.2 µs ± 313 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
对于短的可迭代对象,它并不慢,但它仍然更快.但是请注意,如果您已经拥有NumPy数组,那么NumPy函数的真正优势就在于:
a1 = np.arange(10**4)
a2 = np.arange(10**2)
%timeit np.linalg.norm(a1)
18.7 µs ± 539 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit np.linalg.norm(a2)
4.03 µs ± 157 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
是的,现在速度要快得多. 18.7us与1ms-10000个元素的速度几乎提高了100倍.这意味着在您的示例中,大部分时间np.linalg.norm
都用于将range
转换为np.array
.
import numpy as np
from datetime import datetime
import math
def norm(l):
s = 0
for i in l:
s += i**2
return math.sqrt(s)
def foo(a, b, f):
l = range(a)
s = datetime.now()
for i in range(b):
f(l)
e = datetime.now()
return e-s
foo(10**4, 10**5, norm)
foo(10**4, 10**5, np.linalg.norm)
foo(10**2, 10**7, norm)
foo(10**2, 10**7, np.linalg.norm)
I got the following output:
0:00:43.156278
0:00:23.923239
0:00:44.184835
0:01:00.343875
It seems like when np.linalg.norm
is called many times for small-sized data, it runs slower than my norm function.
What is the cause of that?
First of all: datetime.now()
isn't appropriate to measure performance, it includes the wall-time and you may just pick a bad time (for your computer) when a high-priority process runs or Pythons GC kicks in, ...
There are dedicated timing functions/modules available in Python: the built-in timeit
module or %timeit
in IPython/Jupyter and several other external modules (like perf
, ...)
Let's see what happens if I use these on your data:
import numpy as np
import math
def norm(l):
s = 0
for i in l:
s += i**2
return math.sqrt(s)
r1 = range(10**4)
r2 = range(10**2)
%timeit norm(r1)
3.34 ms ± 150 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit np.linalg.norm(r1)
1.05 ms ± 3.92 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit norm(r2)
30.8 µs ± 1.53 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit np.linalg.norm(r2)
14.2 µs ± 313 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
It isn't slower for short iterables it's still faster. However note that the real advantage from NumPy functions comes if you already have NumPy arrays:
a1 = np.arange(10**4)
a2 = np.arange(10**2)
%timeit np.linalg.norm(a1)
18.7 µs ± 539 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit np.linalg.norm(a2)
4.03 µs ± 157 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Yeah, it's quite a lot faster now. 18.7us vs. 1ms - almost 100 times faster for 10000 elements. That means most of the time of np.linalg.norm
in your examples was spent in converting the range
to a np.array
.
这篇关于为什么对小尺寸数据多次调用numpy.linalg.norm速度较慢?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!