What is numpy.core._multiarray_umath.implement_array_function and why does it cost so much time?
Question
I use NumPy for large-scale data analysis with many matrix operations (e.g., dot, count_nonzero, linalg.svd). After running %prun in a Jupyter notebook, I found that numpy.core._multiarray_umath.implement_array_function takes a lot of time: about 38 sec of the roughly 250 sec total cumtime, with a large ncalls count (67139/66979). I know the other functions should be optimized, but is it possible to suppress this one as well, and what is it used for?
Here is my %prun output:
ncalls tottime percall cumtime percall filename:lineno(function)
1848 203.845 0.110 242.582 0.131 stacking.py:130(_rda_cv)
67139/66979 27.980 0.000 38.901 0.001 {built-in method numpy.core._multiarray_umath.implement_array_function}
4 8.181 2.045 251.415 62.854 stacking.py:192(_model_selection)
14883 7.942 0.001 7.942 0.001 {method 'reduce' of 'numpy.ufunc' objects}
11096 2.107 0.000 2.353 0.000 linalg.py:1468(svd)
4 0.154 0.038 0.188 0.047 stacking.py:20(_get_qvalues)
1 0.149 0.149 251.887 251.887 stacking.py:255(fit)
16 0.149 0.009 0.508 0.032 stacking.py:70(_construct_cov)
26341 0.140 0.000 0.140 0.000 {built-in method numpy.array}
4 0.132 0.033 0.609 0.152 stacking.py:89(_construct_cov_cv)
11164 0.114 0.000 0.367 0.000 _methods.py:134(_mean)
1919 0.102 0.000 0.102 0.000 {built-in method numpy.empty}
36989 0.073 0.000 0.073 0.000 {method 'astype' of 'numpy.ndarray' objects}
11132 0.052 0.000 0.383 0.000 fromnumeric.py:3153(mean)
32 0.052 0.002 0.302 0.009 function_base.py:2245(cov)
38870 0.052 0.000 27.967 0.001 <__array_function__ internals>:2(dot)
11164 0.051 0.000 0.054 0.000 _methods.py:50(_count_reduce_items)
11096 0.043 0.000 0.070 0.000 linalg.py:144(_commonType)
13 0.036 0.003 0.036 0.003 {method 'argsort' of 'numpy.ndarray' objects}
3696 0.035 0.000 7.909 0.002 numeric.py:409(count_nonzero)
11096 0.033 0.000 0.064 0.000 linalg.py:116(_makearray)
66728 0.031 0.000 0.031 0.000 {built-in method builtins.issubclass}
11096 0.027 0.000 2.407 0.000 <__array_function__ internals>:2(svd)
11145 0.026 0.000 0.026 0.000 {method 'flatten' of 'numpy.ndarray' objects}
11096 0.024 0.000 0.024 0.000 linalg.py:111(get_linalg_error_extobj)
348583 0.023 0.000 0.023 0.000 {method 'append' of 'list' objects}
11132 0.021 0.000 0.421 0.000 <__array_function__ internals>:2(mean)
7408 0.018 0.000 0.034 0.000 numerictypes.py:293(issubclass_)
3696 0.017 0.000 7.940 0.002 <__array_function__ internals>:2(count_nonzero)
3704 0.017 0.000 0.053 0.000 numerictypes.py:365(issubdtype)
5544 0.017 0.000 0.017 0.000 stacking.py:146(<dictcomp>)
22192 0.016 0.000 0.025 0.000 linalg.py:134(_realType)
40 0.016 0.000 0.016 0.000 {method 'sort' of 'numpy.ndarray' objects}
3702 0.013 0.000 7.795 0.002 {method 'sum' of 'numpy.ndarray' objects}
15009 0.012 0.000 0.028 0.000 _asarray.py:88(asanyarray)
5 0.012 0.002 0.053 0.011 _split.py:628(_make_test_folds)
22192 0.010 0.000 0.013 0.000 linalg.py:121(isComplexType)
22602 0.010 0.000 0.010 0.000 {built-in method builtins.isinstance}
13199 0.010 0.000 0.010 0.000 {built-in method builtins.getattr}
11264 0.010 0.000 0.025 0.000 _asarray.py:16(asarray)
11096 0.009 0.000 0.009 0.000 linalg.py:203(_assertRankAtLeast2)
22196 0.009 0.000 0.009 0.000 {method 'get' of 'dict' objects}
1964 0.009 0.000 0.009 0.000 {method 'argmax' of 'numpy.ndarray' objects}
11132 0.008 0.000 0.008 0.000 {built-in method __new__ of type object at 0x00007FF847CE9BA0}
38870 0.008 0.000 0.008 0.000 multiarray.py:707(dot)
11625 0.008 0.000 0.008 0.000 {built-in method builtins.hasattr}
45 0.007 0.000 0.038 0.001 arraysetops.py:297(_unique1d)
60/20 0.006 0.000 0.059 0.003 _split.py:74(split)
1964 0.006 0.000 0.034 0.000 <__array_function__ internals>:2(argmax)
1964 0.006 0.000 0.023 0.000 fromnumeric.py:1091(argmax)
3702 0.005 0.000 7.782 0.002 _methods.py:36(_sum)
4 0.005 0.001 0.221 0.055 stacking.py:317(_normalizer)
1982 0.004 0.000 0.044 0.000 fromnumeric.py:55(_wrapfunc)
22192 0.004 0.000 0.004 0.000 {method '__array_prepare__' of 'numpy.ndarray' objects}
11096 0.004 0.000 0.004 0.000 linalg.py:1464(_svd_dispatcher)
40 0.003 0.000 0.004 0.000 _split.py:107(_iter_test_masks)
11132 0.003 0.000 0.003 0.000 fromnumeric.py:3149(_mean_dispatcher)
3696 0.003 0.000 0.003 0.000 numeric.py:405(_count_nonzero_dispatcher)
3 0.003 0.001 0.005 0.002 stacking.py:243(_rda_prediction)
20 0.002 0.000 0.055 0.003 _split.py:680(_iter_test_masks)
1 0.002 0.002 251.889 251.889 <string>:1(<module>)
48 0.002 0.000 0.002 0.000 {built-in method numpy.zeros}
25 0.002 0.000 0.002 0.000 {built-in method numpy.arange}
4 0.001 0.000 0.001 0.000 {method 'partition' of 'numpy.ndarray' objects}
5 0.001 0.000 0.001 0.000 {method 'cumsum' of 'numpy.ndarray' objects}
45 0.001 0.000 0.039 0.001 arraysetops.py:151(unique)
1964 0.001 0.000 0.001 0.000 fromnumeric.py:1087(_argmax_dispatcher)
5 0.001 0.000 0.011 0.002 multiclass.py:174(type_of_target)
116 0.001 0.000 0.002 0.000 fromnumeric.py:42(_wrapit)
32 0.001 0.000 0.001 0.000 stride_tricks.py:116(_broadcast_to)
32 0.000 0.000 0.038 0.001 function_base.py:293(average)
4 0.000 0.000 0.001 0.000 stacking.py:107(_calculate_weights)
120 0.000 0.000 0.001 0.000 <__array_function__ internals>:2(where)
115 0.000 0.000 0.001 0.000 validation.py:127(_num_samples)
40 0.000 0.000 0.001 0.000 _split.py:430(_iter_test_indices)
135 0.000 0.000 0.000 0.000 {built-in method _abc._abc_instancecheck}
60/20 0.000 0.000 0.060 0.003 _split.py:299(split)
30 0.000 0.000 0.001 0.000 validation.py:238(indexable)
5 0.000 0.000 0.001 0.000 validation.py:362(check_array)
1 0.000 0.000 251.889 251.889 {built-in method builtins.exec}
5 0.000 0.000 0.000 0.000 {method 'nonzero' of 'numpy.ndarray' objects}
4 0.000 0.000 0.002 0.001 function_base.py:3508(_median)
130 0.000 0.000 0.000 0.000 {built-in method _abc._abc_subclasscheck}
5 0.000 0.000 0.000 0.000 function_base.py:1147(diff)
1 0.000 0.000 0.003 0.003 stacking.py:350(_check_y)
32 0.000 0.000 0.321 0.010 <__array_function__ internals>:2(cov)
4 0.000 0.000 0.000 0.000 utils.py:1142(_median_nancheck)
5 0.000 0.000 0.001 0.000 _split.py:661(<listcomp>)
32 0.000 0.000 0.038 0.001 <__array_function__ internals>:2(average)
32 0.000 0.000 0.036 0.001 {method 'mean' of 'numpy.ndarray' objects}
30 0.000 0.000 0.001 0.000 validation.py:220(check_consistent_length)
32 0.000 0.000 0.000 0.000 {method 'copy' of 'numpy.ndarray' objects}
32 0.000 0.000 0.001 0.000 <__array_function__ internals>:2(broadcast_to)
15 0.000 0.000 0.000 0.000 fromnumeric.py:73(_wrapreduction)
5 0.000 0.000 0.001 0.000 validation.py:40(_assert_all_finite)
15 0.000 0.000 0.000 0.000 _split.py:277(__init__)
45 0.000 0.000 0.040 0.001 <__array_function__ internals>:2(unique)
32 0.000 0.000 0.001 0.000 stride_tricks.py:143(broadcast_to)
4 0.000 0.000 0.002 0.001 function_base.py:3359(_ureduce)
32 0.000 0.000 0.000 0.000 <__array_function__ internals>:2(result_type)
32 0.000 0.000 0.000 0.000 <string>:1(__new__)
135 0.000 0.000 0.000 0.000 abc.py:137(__instancecheck__)
8 0.000 0.000 0.000 0.000 numeric.py:1273(normalize_axis_tuple)
32 0.000 0.000 0.000 0.000 {built-in method builtins.any}
4 0.000 0.000 0.000 0.000 numeric.py:1336(moveaxis)
130 0.000 0.000 0.000 0.000 abc.py:141(__subclasscheck__)
32 0.000 0.000 0.000 0.000 function_base.py:257(iterable)
269 0.000 0.000 0.000 0.000 {built-in method builtins.len}
5 0.000 0.000 0.000 0.000 validation.py:153(_shape_repr)
120 0.000 0.000 0.000 0.000 multiarray.py:312(where)
18 0.000 0.000 0.000 0.000 <__array_function__ internals>:2(copyto)
32 0.000 0.000 0.000 0.000 {method 'conj' of 'numpy.ndarray' objects}
95 0.000 0.000 0.000 0.000 base.py:1189(isspmatrix)
45 0.000 0.000 0.000 0.000 arraysetops.py:138(_unpack_tuple)
5 0.000 0.000 0.000 0.000 _split.py:622(__init__)
5 0.000 0.000 0.000 0.000 warnings.py:474(__enter__)
32 0.000 0.000 0.000 0.000 {method 'squeeze' of 'numpy.ndarray' objects}
30 0.000 0.000 0.000 0.000 validation.py:231(<listcomp>)
10 0.000 0.000 0.000 0.000 numeric.py:290(full)
10 0.000 0.000 0.000 0.000 _split.py:423(__init__)
8 0.000 0.000 0.026 0.003 fromnumeric.py:978(argsort)
8 0.000 0.000 0.000 0.000 numeric.py:166(ones)
64 0.000 0.000 0.000 0.000 stride_tricks.py:121(<genexpr>)
32 0.000 0.000 0.000 0.000 stride_tricks.py:26(_maybe_view_as_subclass)
5 0.000 0.000 0.000 0.000 warnings.py:181(_add_filter)
4 0.000 0.000 0.000 0.000 {built-in method _bisect.bisect_left}
5 0.000 0.000 0.001 0.000 _split.py:685(split)
8 0.000 0.000 0.026 0.003 <__array_function__ internals>:2(argsort)
5 0.000 0.000 0.000 0.000 _internal.py:865(npy_ctypes_check)
5 0.000 0.000 0.000 0.000 fromnumeric.py:1648(ravel)
4 0.000 0.000 0.002 0.000 fromnumeric.py:657(partition)
10 0.000 0.000 0.000 0.000 validation.py:180(<genexpr>)
5 0.000 0.000 0.000 0.000 fromnumeric.py:2629(amin)
4 0.000 0.000 0.002 0.001 function_base.py:3419(median)
32 0.000 0.000 0.000 0.000 {built-in method builtins.iter}
10 0.000 0.000 0.000 0.000 {built-in method builtins.max}
5 0.000 0.000 0.000 0.000 warnings.py:453(__init__)
5 0.000 0.000 0.000 0.000 warnings.py:165(simplefilter)
32 0.000 0.000 0.000 0.000 function_base.py:2240(_cov_dispatcher)
5 0.000 0.000 0.000 0.000 <__array_function__ internals>:2(nonzero)
5 0.000 0.000 0.000 0.000 fromnumeric.py:2189(any)
5 0.000 0.000 0.000 0.000 validation.py:771(column_or_1d)
5 0.000 0.000 0.000 0.000 {method 'remove' of 'list' objects}
15 0.000 0.000 0.000 0.000 fromnumeric.py:74(<dictcomp>)
32 0.000 0.000 0.000 0.000 function_base.py:289(_average_dispatcher)
5 0.000 0.000 0.001 0.000 fromnumeric.py:2358(cumsum)
4 0.000 0.000 0.002 0.001 <__array_function__ internals>:2(median)
5 0.000 0.000 0.000 0.000 {method 'ravel' of 'numpy.ndarray' objects}
13 0.000 0.000 0.000 0.000 {built-in method numpy.core._multiarray_umath.normalize_axis_index}
4 0.000 0.000 0.002 0.000 <__array_function__ internals>:2(partition)
5 0.000 0.000 0.001 0.000 <__array_function__ internals>:2(bincount)
5 0.000 0.000 0.000 0.000 <__array_function__ internals>:2(concatenate)
4 0.000 0.000 0.000 0.000 core.py:6251(isMaskedArray)
5 0.000 0.000 0.000 0.000 <__array_function__ internals>:2(any)
9 0.000 0.000 0.000 0.000 {method 'insert' of 'list' objects}
5 0.000 0.000 0.000 0.000 {method 'join' of 'str' objects}
5 0.000 0.000 0.002 0.000 <__array_function__ internals>:2(cumsum)
5 0.000 0.000 0.000 0.000 <__array_function__ internals>:2(diff)
4 0.000 0.000 0.000 0.000 {built-in method builtins.sorted}
5 0.000 0.000 0.000 0.000 fromnumeric.py:1759(nonzero)
5 0.000 0.000 0.000 0.000 <__array_function__ internals>:2(amin)
32 0.000 0.000 0.000 0.000 stride_tricks.py:139(_broadcast_to_dispatcher)
45 0.000 0.000 0.000 0.000 arraysetops.py:146(_unique_dispatcher)
4 0.000 0.000 0.000 0.000 <__array_function__ internals>:2(moveaxis)
5 0.000 0.000 0.000 0.000 _config.py:12(get_config)
5 0.000 0.000 0.000 0.000 <__array_function__ internals>:2(shape)
5 0.000 0.000 0.000 0.000 multiclass.py:111(is_multilabel)
5 0.000 0.000 0.000 0.000 warnings.py:493(__exit__)
32 0.000 0.000 0.000 0.000 multiarray.py:635(result_type)
5 0.000 0.000 0.000 0.000 fromnumeric.py:2277(all)
5 0.000 0.000 0.000 0.000 validation.py:355(_ensure_no_complex_data)
5 0.000 0.000 0.000 0.000 <__array_function__ internals>:2(all)
5 0.000 0.000 0.000 0.000 <__array_function__ internals>:2(ravel)
18 0.000 0.000 0.000 0.000 multiarray.py:1043(copyto)
8 0.000 0.000 0.000 0.000 numeric.py:1323(<listcomp>)
5 0.000 0.000 0.000 0.000 fromnumeric.py:1755(_nonzero_dispatcher)
4 0.000 0.000 0.000 0.000 {method 'transpose' of 'numpy.ndarray' objects}
5 0.000 0.000 0.000 0.000 {method 'copy' of 'dict' objects}
15 0.000 0.000 0.000 0.000 {method 'items' of 'dict' objects}
8 0.000 0.000 0.000 0.000 fromnumeric.py:974(_argsort_dispatcher)
1 0.000 0.000 0.000 0.000 _methods.py:32(_amin)
8 0.000 0.000 0.000 0.000 {built-in method _operator.index}
15 0.000 0.000 0.000 0.000 {built-in method _warnings._filters_mutated}
5 0.000 0.000 0.000 0.000 fromnumeric.py:1856(shape)
5 0.000 0.000 0.000 0.000 multiarray.py:145(concatenate)
4 0.000 0.000 0.000 0.000 function_base.py:3414(_median_dispatcher)
1 0.000 0.000 0.000 0.000 {method 'min' of 'numpy.ndarray' objects}
5 0.000 0.000 0.000 0.000 fromnumeric.py:2185(_any_dispatcher)
5 0.000 0.000 0.000 0.000 multiarray.py:853(bincount)
5 0.000 0.000 0.000 0.000 fromnumeric.py:1852(_shape_dispatcher)
5 0.000 0.000 0.000 0.000 fromnumeric.py:2354(_cumsum_dispatcher)
5 0.000 0.000 0.000 0.000 function_base.py:1143(_diff_dispatcher)
1 0.000 0.000 0.000 0.000 {method 'max' of 'numpy.ndarray' objects}
4 0.000 0.000 0.000 0.000 numeric.py:1399(<listcomp>)
5 0.000 0.000 0.000 0.000 fromnumeric.py:2273(_all_dispatcher)
5 0.000 0.000 0.000 0.000 fromnumeric.py:2624(_amin_dispatcher)
4 0.000 0.000 0.000 0.000 fromnumeric.py:653(_partition_dispatcher)
5 0.000 0.000 0.000 0.000 fromnumeric.py:1644(_ravel_dispatcher)
4 0.000 0.000 0.000 0.000 numeric.py:1332(_moveaxis_dispatcher)
1 0.000 0.000 0.000 0.000 _methods.py:28(_amax)
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
Recommended answer
Recent versions of NumPy support an __array_function__ hook that objects can implement to customize what arbitrary NumPy callables do when called on them. Support is disabled by default in 1.16, enabled by default in 1.17, and expected to eventually be enabled unconditionally.
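As an illustration of the hook described above, here is a minimal sketch, assuming NumPy 1.17 or later. The class name Wrapped and its calls attribute are hypothetical and exist only for this example; the hook simply records which public function was called and then forwards to the normal implementation.

```python
import numpy as np


class Wrapped:
    """Hypothetical array-like that records which NumPy functions reach it."""

    def __init__(self, data):
        self.data = np.asarray(data)
        self.calls = []

    def __array_function__(self, func, types, args, kwargs):
        # NumPy passes the public function that was called (e.g. np.sum).
        self.calls.append(func.__name__)
        # Unwrap positional arguments and fall back to the default behaviour.
        # (Keyword arguments are passed through unchanged in this sketch.)
        unwrapped = tuple(a.data if isinstance(a, Wrapped) else a for a in args)
        return func(*unwrapped, **kwargs)


w = Wrapped([[1.0, 2.0], [3.0, 4.0]])
total = np.sum(w)        # routed through Wrapped.__array_function__
product = np.dot(w, w)   # same for np.dot

print(w.calls)  # ['sum', 'dot']
```

Plain ndarray inputs go through the same dispatch step, which is why the dispatcher appears in a profile even when no custom class is involved.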
implement_array_function is the dispatcher that calls either a default implementation or an __array_function__ hook, to implement __array_function__ support. As designed, it is intended to be called once for literally every single call to a public NumPy callable, including calls happening within NumPy, and it has to do a lot of method lookups. Hopefully future optimization work will reduce some of this overhead.
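To see how many small NumPy calls show up in a profile, here is a rough sketch using the standard cProfile module outside Jupyter. The workload function and its sizes are hypothetical stand-ins for the question's code, and the label of the dispatcher row depends on the installed NumPy version; it may not appear under the name implement_array_function at all.

```python
import cProfile
import io
import pstats

import numpy as np


def workload(n_iter=200, size=50):
    """Hypothetical stand-in for a loop that makes many small NumPy calls."""
    rng = np.random.default_rng(0)
    a = rng.standard_normal((size, size))
    total = 0
    for _ in range(n_iter):
        b = np.dot(a, a.T)
        total += np.count_nonzero(b > 0)
        np.linalg.svd(b, compute_uv=False)
    return total


profiler = cProfile.Profile()
profiler.enable()
result = workload()
profiler.disable()

# Collect the ten rows with the largest own time, as %prun does by default.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("tottime").print_stats(10)
report = buffer.getvalue()
print(report)
```

When reading such a report, keep in mind that cProfile does not see C code called from other C code, so the tottime of a built-in row can include the work of the C routines it invokes. A large value there is not necessarily pure dispatch overhead.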
You can see additional details in NEP 18, and you can check the function's docstring with help(numpy.core._multiarray_umath.implement_array_function):
Help on built-in function implement_array_function in module numpy.core._multiarray_umath:

implement_array_function(...)
    Implement a function with checks for __array_function__ overrides.

    All arguments are required, and can only be passed by position.

    Arguments
    ---------
    implementation : function
        Function that implements the operation on NumPy array without
        overrides when called like ``implementation(*args, **kwargs)``.
    public_api : function
        Function exposed by NumPy's public API originally called like
        ``public_api(*args, **kwargs)`` on which arguments are now being
        checked.
    relevant_args : iterable
        Iterable of arguments to check for __array_function__ methods.
    args : tuple
        Arbitrary positional arguments originally passed into ``public_api``.
    kwargs : dict
        Arbitrary keyword arguments originally passed into ``public_api``.

    Returns
    -------
    Result from calling ``implementation()`` or an ``__array_function__``
    method, as appropriate.

    Raises
    ------
    TypeError : if no implementation is found.