What is numpy.core._multiarray_umath.implement_array_function and why does it take so much time?


Question

I use NumPy for large-scale data analysis, with many matrix operations (e.g., dot, count_nonzero, linalg.svd). After running %prun in a Jupyter notebook, I found that numpy.core._multiarray_umath.implement_array_function takes a lot of time: about 38 seconds of cumulative time out of a total of roughly 250 seconds, with a large number of calls (67139/66979). I know the other functions should be optimized as well, but is it possible to reduce this cost too, and what is this function used for?

Here is my %prun output:

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 1848  203.845    0.110  242.582    0.131 stacking.py:130(_rda_cv)
 67139/66979   27.980    0.000   38.901    0.001 {built-in method numpy.core._multiarray_umath.implement_array_function}
    4    8.181    2.045  251.415   62.854 stacking.py:192(_model_selection)
14883    7.942    0.001    7.942    0.001 {method 'reduce' of 'numpy.ufunc' objects}
11096    2.107    0.000    2.353    0.000 linalg.py:1468(svd)
    4    0.154    0.038    0.188    0.047 stacking.py:20(_get_qvalues)
    1    0.149    0.149  251.887  251.887 stacking.py:255(fit)
   16    0.149    0.009    0.508    0.032 stacking.py:70(_construct_cov)
26341    0.140    0.000    0.140    0.000 {built-in method numpy.array}
    4    0.132    0.033    0.609    0.152 stacking.py:89(_construct_cov_cv)
11164    0.114    0.000    0.367    0.000 _methods.py:134(_mean)
 1919    0.102    0.000    0.102    0.000 {built-in method numpy.empty}
36989    0.073    0.000    0.073    0.000 {method 'astype' of 'numpy.ndarray' objects}
11132    0.052    0.000    0.383    0.000 fromnumeric.py:3153(mean)
   32    0.052    0.002    0.302    0.009 function_base.py:2245(cov)
38870    0.052    0.000   27.967    0.001 <__array_function__ internals>:2(dot)
11164    0.051    0.000    0.054    0.000 _methods.py:50(_count_reduce_items)
11096    0.043    0.000    0.070    0.000 linalg.py:144(_commonType)
   13    0.036    0.003    0.036    0.003 {method 'argsort' of 'numpy.ndarray' objects}
 3696    0.035    0.000    7.909    0.002 numeric.py:409(count_nonzero)
11096    0.033    0.000    0.064    0.000 linalg.py:116(_makearray)
66728    0.031    0.000    0.031    0.000 {built-in method builtins.issubclass}
11096    0.027    0.000    2.407    0.000 <__array_function__ internals>:2(svd)
11145    0.026    0.000    0.026    0.000 {method 'flatten' of 'numpy.ndarray' objects}
11096    0.024    0.000    0.024    0.000 linalg.py:111(get_linalg_error_extobj)
348583    0.023    0.000    0.023    0.000 {method 'append' of 'list' objects}
11132    0.021    0.000    0.421    0.000 <__array_function__ internals>:2(mean)
 7408    0.018    0.000    0.034    0.000 numerictypes.py:293(issubclass_)
 3696    0.017    0.000    7.940    0.002 <__array_function__ internals>:2(count_nonzero)
 3704    0.017    0.000    0.053    0.000 numerictypes.py:365(issubdtype)
 5544    0.017    0.000    0.017    0.000 stacking.py:146(<dictcomp>)
22192    0.016    0.000    0.025    0.000 linalg.py:134(_realType)
   40    0.016    0.000    0.016    0.000 {method 'sort' of 'numpy.ndarray' objects}
 3702    0.013    0.000    7.795    0.002 {method 'sum' of 'numpy.ndarray' objects}
15009    0.012    0.000    0.028    0.000 _asarray.py:88(asanyarray)
    5    0.012    0.002    0.053    0.011 _split.py:628(_make_test_folds)
22192    0.010    0.000    0.013    0.000 linalg.py:121(isComplexType)
22602    0.010    0.000    0.010    0.000 {built-in method builtins.isinstance}
13199    0.010    0.000    0.010    0.000 {built-in method builtins.getattr}
11264    0.010    0.000    0.025    0.000 _asarray.py:16(asarray)
11096    0.009    0.000    0.009    0.000 linalg.py:203(_assertRankAtLeast2)
22196    0.009    0.000    0.009    0.000 {method 'get' of 'dict' objects}
 1964    0.009    0.000    0.009    0.000 {method 'argmax' of 'numpy.ndarray' objects}
11132    0.008    0.000    0.008    0.000 {built-in method __new__ of type object at 0x00007FF847CE9BA0}
38870    0.008    0.000    0.008    0.000 multiarray.py:707(dot)
11625    0.008    0.000    0.008    0.000 {built-in method builtins.hasattr}
   45    0.007    0.000    0.038    0.001 arraysetops.py:297(_unique1d)
60/20    0.006    0.000    0.059    0.003 _split.py:74(split)
 1964    0.006    0.000    0.034    0.000 <__array_function__ internals>:2(argmax)
 1964    0.006    0.000    0.023    0.000 fromnumeric.py:1091(argmax)
 3702    0.005    0.000    7.782    0.002 _methods.py:36(_sum)
    4    0.005    0.001    0.221    0.055 stacking.py:317(_normalizer)
 1982    0.004    0.000    0.044    0.000 fromnumeric.py:55(_wrapfunc)
22192    0.004    0.000    0.004    0.000 {method '__array_prepare__' of 'numpy.ndarray' objects}
11096    0.004    0.000    0.004    0.000 linalg.py:1464(_svd_dispatcher)
   40    0.003    0.000    0.004    0.000 _split.py:107(_iter_test_masks)
11132    0.003    0.000    0.003    0.000 fromnumeric.py:3149(_mean_dispatcher)
 3696    0.003    0.000    0.003    0.000 numeric.py:405(_count_nonzero_dispatcher)
    3    0.003    0.001    0.005    0.002 stacking.py:243(_rda_prediction)
   20    0.002    0.000    0.055    0.003 _split.py:680(_iter_test_masks)
    1    0.002    0.002  251.889  251.889 <string>:1(<module>)
   48    0.002    0.000    0.002    0.000 {built-in method numpy.zeros}
   25    0.002    0.000    0.002    0.000 {built-in method numpy.arange}
    4    0.001    0.000    0.001    0.000 {method 'partition' of 'numpy.ndarray' objects}
    5    0.001    0.000    0.001    0.000 {method 'cumsum' of 'numpy.ndarray' objects}
   45    0.001    0.000    0.039    0.001 arraysetops.py:151(unique)
 1964    0.001    0.000    0.001    0.000 fromnumeric.py:1087(_argmax_dispatcher)
    5    0.001    0.000    0.011    0.002 multiclass.py:174(type_of_target)
  116    0.001    0.000    0.002    0.000 fromnumeric.py:42(_wrapit)
   32    0.001    0.000    0.001    0.000 stride_tricks.py:116(_broadcast_to)
   32    0.000    0.000    0.038    0.001 function_base.py:293(average)
    4    0.000    0.000    0.001    0.000 stacking.py:107(_calculate_weights)
  120    0.000    0.000    0.001    0.000 <__array_function__ internals>:2(where)
  115    0.000    0.000    0.001    0.000 validation.py:127(_num_samples)
   40    0.000    0.000    0.001    0.000 _split.py:430(_iter_test_indices)
  135    0.000    0.000    0.000    0.000 {built-in method _abc._abc_instancecheck}
60/20    0.000    0.000    0.060    0.003 _split.py:299(split)
   30    0.000    0.000    0.001    0.000 validation.py:238(indexable)
    5    0.000    0.000    0.001    0.000 validation.py:362(check_array)
    1    0.000    0.000  251.889  251.889 {built-in method builtins.exec}
    5    0.000    0.000    0.000    0.000 {method 'nonzero' of 'numpy.ndarray' objects}
    4    0.000    0.000    0.002    0.001 function_base.py:3508(_median)
  130    0.000    0.000    0.000    0.000 {built-in method _abc._abc_subclasscheck}
    5    0.000    0.000    0.000    0.000 function_base.py:1147(diff)
    1    0.000    0.000    0.003    0.003 stacking.py:350(_check_y)
   32    0.000    0.000    0.321    0.010 <__array_function__ internals>:2(cov)
    4    0.000    0.000    0.000    0.000 utils.py:1142(_median_nancheck)
    5    0.000    0.000    0.001    0.000 _split.py:661(<listcomp>)
   32    0.000    0.000    0.038    0.001 <__array_function__ internals>:2(average)
   32    0.000    0.000    0.036    0.001 {method 'mean' of 'numpy.ndarray' objects}
   30    0.000    0.000    0.001    0.000 validation.py:220(check_consistent_length)
   32    0.000    0.000    0.000    0.000 {method 'copy' of 'numpy.ndarray' objects}
   32    0.000    0.000    0.001    0.000 <__array_function__ internals>:2(broadcast_to)
   15    0.000    0.000    0.000    0.000 fromnumeric.py:73(_wrapreduction)
    5    0.000    0.000    0.001    0.000 validation.py:40(_assert_all_finite)
   15    0.000    0.000    0.000    0.000 _split.py:277(__init__)
   45    0.000    0.000    0.040    0.001 <__array_function__ internals>:2(unique)
   32    0.000    0.000    0.001    0.000 stride_tricks.py:143(broadcast_to)
    4    0.000    0.000    0.002    0.001 function_base.py:3359(_ureduce)
   32    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(result_type)
   32    0.000    0.000    0.000    0.000 <string>:1(__new__)
  135    0.000    0.000    0.000    0.000 abc.py:137(__instancecheck__)
    8    0.000    0.000    0.000    0.000 numeric.py:1273(normalize_axis_tuple)
   32    0.000    0.000    0.000    0.000 {built-in method builtins.any}
    4    0.000    0.000    0.000    0.000 numeric.py:1336(moveaxis)
  130    0.000    0.000    0.000    0.000 abc.py:141(__subclasscheck__)
   32    0.000    0.000    0.000    0.000 function_base.py:257(iterable)
  269    0.000    0.000    0.000    0.000 {built-in method builtins.len}
    5    0.000    0.000    0.000    0.000 validation.py:153(_shape_repr)
  120    0.000    0.000    0.000    0.000 multiarray.py:312(where)
   18    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(copyto)
   32    0.000    0.000    0.000    0.000 {method 'conj' of 'numpy.ndarray' objects}
   95    0.000    0.000    0.000    0.000 base.py:1189(isspmatrix)
   45    0.000    0.000    0.000    0.000 arraysetops.py:138(_unpack_tuple)
    5    0.000    0.000    0.000    0.000 _split.py:622(__init__)
    5    0.000    0.000    0.000    0.000 warnings.py:474(__enter__)
   32    0.000    0.000    0.000    0.000 {method 'squeeze' of 'numpy.ndarray' objects}
   30    0.000    0.000    0.000    0.000 validation.py:231(<listcomp>)
   10    0.000    0.000    0.000    0.000 numeric.py:290(full)
   10    0.000    0.000    0.000    0.000 _split.py:423(__init__)
    8    0.000    0.000    0.026    0.003 fromnumeric.py:978(argsort)
    8    0.000    0.000    0.000    0.000 numeric.py:166(ones)
   64    0.000    0.000    0.000    0.000 stride_tricks.py:121(<genexpr>)
   32    0.000    0.000    0.000    0.000 stride_tricks.py:26(_maybe_view_as_subclass)
    5    0.000    0.000    0.000    0.000 warnings.py:181(_add_filter)
    4    0.000    0.000    0.000    0.000 {built-in method _bisect.bisect_left}
    5    0.000    0.000    0.001    0.000 _split.py:685(split)
    8    0.000    0.000    0.026    0.003 <__array_function__ internals>:2(argsort)
    5    0.000    0.000    0.000    0.000 _internal.py:865(npy_ctypes_check)
    5    0.000    0.000    0.000    0.000 fromnumeric.py:1648(ravel)
    4    0.000    0.000    0.002    0.000 fromnumeric.py:657(partition)
   10    0.000    0.000    0.000    0.000 validation.py:180(<genexpr>)
    5    0.000    0.000    0.000    0.000 fromnumeric.py:2629(amin)
    4    0.000    0.000    0.002    0.001 function_base.py:3419(median)
   32    0.000    0.000    0.000    0.000 {built-in method builtins.iter}
   10    0.000    0.000    0.000    0.000 {built-in method builtins.max}
    5    0.000    0.000    0.000    0.000 warnings.py:453(__init__)
    5    0.000    0.000    0.000    0.000 warnings.py:165(simplefilter)
   32    0.000    0.000    0.000    0.000 function_base.py:2240(_cov_dispatcher)
    5    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(nonzero)
    5    0.000    0.000    0.000    0.000 fromnumeric.py:2189(any)
    5    0.000    0.000    0.000    0.000 validation.py:771(column_or_1d)
    5    0.000    0.000    0.000    0.000 {method 'remove' of 'list' objects}
   15    0.000    0.000    0.000    0.000 fromnumeric.py:74(<dictcomp>)
   32    0.000    0.000    0.000    0.000 function_base.py:289(_average_dispatcher)
    5    0.000    0.000    0.001    0.000 fromnumeric.py:2358(cumsum)
    4    0.000    0.000    0.002    0.001 <__array_function__ internals>:2(median)
    5    0.000    0.000    0.000    0.000 {method 'ravel' of 'numpy.ndarray' objects}
   13    0.000    0.000    0.000    0.000 {built-in method numpy.core._multiarray_umath.normalize_axis_index}
    4    0.000    0.000    0.002    0.000 <__array_function__ internals>:2(partition)
    5    0.000    0.000    0.001    0.000 <__array_function__ internals>:2(bincount)
    5    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(concatenate)
    4    0.000    0.000    0.000    0.000 core.py:6251(isMaskedArray)
    5    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(any)
    9    0.000    0.000    0.000    0.000 {method 'insert' of 'list' objects}
    5    0.000    0.000    0.000    0.000 {method 'join' of 'str' objects}
    5    0.000    0.000    0.002    0.000 <__array_function__ internals>:2(cumsum)
    5    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(diff)
    4    0.000    0.000    0.000    0.000 {built-in method builtins.sorted}
    5    0.000    0.000    0.000    0.000 fromnumeric.py:1759(nonzero)
    5    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(amin)
   32    0.000    0.000    0.000    0.000 stride_tricks.py:139(_broadcast_to_dispatcher)
   45    0.000    0.000    0.000    0.000 arraysetops.py:146(_unique_dispatcher)
    4    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(moveaxis)
    5    0.000    0.000    0.000    0.000 _config.py:12(get_config)
    5    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(shape)
    5    0.000    0.000    0.000    0.000 multiclass.py:111(is_multilabel)
    5    0.000    0.000    0.000    0.000 warnings.py:493(__exit__)
   32    0.000    0.000    0.000    0.000 multiarray.py:635(result_type)
    5    0.000    0.000    0.000    0.000 fromnumeric.py:2277(all)
    5    0.000    0.000    0.000    0.000 validation.py:355(_ensure_no_complex_data)
    5    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(all)
    5    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(ravel)
   18    0.000    0.000    0.000    0.000 multiarray.py:1043(copyto)
    8    0.000    0.000    0.000    0.000 numeric.py:1323(<listcomp>)
    5    0.000    0.000    0.000    0.000 fromnumeric.py:1755(_nonzero_dispatcher)
    4    0.000    0.000    0.000    0.000 {method 'transpose' of 'numpy.ndarray' objects}
    5    0.000    0.000    0.000    0.000 {method 'copy' of 'dict' objects}
   15    0.000    0.000    0.000    0.000 {method 'items' of 'dict' objects}
    8    0.000    0.000    0.000    0.000 fromnumeric.py:974(_argsort_dispatcher)
    1    0.000    0.000    0.000    0.000 _methods.py:32(_amin)
    8    0.000    0.000    0.000    0.000 {built-in method _operator.index}
   15    0.000    0.000    0.000    0.000 {built-in method _warnings._filters_mutated}
    5    0.000    0.000    0.000    0.000 fromnumeric.py:1856(shape)
    5    0.000    0.000    0.000    0.000 multiarray.py:145(concatenate)
    4    0.000    0.000    0.000    0.000 function_base.py:3414(_median_dispatcher)
    1    0.000    0.000    0.000    0.000 {method 'min' of 'numpy.ndarray' objects}
    5    0.000    0.000    0.000    0.000 fromnumeric.py:2185(_any_dispatcher)
    5    0.000    0.000    0.000    0.000 multiarray.py:853(bincount)
    5    0.000    0.000    0.000    0.000 fromnumeric.py:1852(_shape_dispatcher)
    5    0.000    0.000    0.000    0.000 fromnumeric.py:2354(_cumsum_dispatcher)
    5    0.000    0.000    0.000    0.000 function_base.py:1143(_diff_dispatcher)
    1    0.000    0.000    0.000    0.000 {method 'max' of 'numpy.ndarray' objects}
    4    0.000    0.000    0.000    0.000 numeric.py:1399(<listcomp>)
    5    0.000    0.000    0.000    0.000 fromnumeric.py:2273(_all_dispatcher)
    5    0.000    0.000    0.000    0.000 fromnumeric.py:2624(_amin_dispatcher)
    4    0.000    0.000    0.000    0.000 fromnumeric.py:653(_partition_dispatcher)
    5    0.000    0.000    0.000    0.000 fromnumeric.py:1644(_ravel_dispatcher)
    4    0.000    0.000    0.000    0.000 numeric.py:1332(_moveaxis_dispatcher)
    1    0.000    0.000    0.000    0.000 _methods.py:28(_amax)
    1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Recommended answer

Recent versions of NumPy support an __array_function__ hook that objects can implement to customize what arbitrary NumPy callables do when called on them. Support is disabled by default in NumPy 1.16, enabled by default in 1.17, and expected to eventually be enabled unconditionally.
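As a rough illustration of the hook described above, here is a minimal sketch. It assumes NumPy 1.17 or later (where the hook is enabled by default), and the class name Tagged is hypothetical, made up for this example only.

```python
import numpy as np


class Tagged:
    """Hypothetical wrapper type, used only to illustrate the hook."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_function__(self, func, types, args, kwargs):
        # NumPy passes the public function (e.g. np.sum) plus the original
        # arguments. Unwrap any Tagged arguments, then call the public
        # function again on plain arrays so the default implementation runs.
        unwrapped = tuple(a.data if isinstance(a, Tagged) else a for a in args)
        return ("intercepted", func.__name__, func(*unwrapped, **kwargs))


# A plain ndarray goes through the default implementation.
plain_result = np.sum(np.array([1, 2, 3]))

# A Tagged argument makes NumPy call Tagged.__array_function__ instead.
hooked_result = np.sum(Tagged([1, 2, 3]))

print(plain_result)
print(hooked_result)
```

Both calls pass through the same dispatch step; only the second one finds an override on its argument.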

implement_array_function is the dispatcher that provides __array_function__ support: it calls either the default implementation or an __array_function__ hook. By design, it is called once for every call to a public NumPy callable, including calls made inside NumPy itself, and it has to do a lot of method lookups. Hopefully future optimization work will reduce some of this overhead. Note that, because the dispatcher calls the implementation itself, a profiler can attribute the running time of C-implemented functions to this entry: in the output above, its tottime (27.980 s) is close to the cumtime of dot (27.967 s), which suggests that most of that time is the dot computation rather than dispatch overhead.

You can find more details in NEP 18, and you can check the function's docstring with help(numpy.core._multiarray_umath.implement_array_function):

Help on built-in function implement_array_function in module numpy.core._multiarray_umath:

implement_array_function(...)
    Implement a function with checks for __array_function__ overrides.

    All arguments are required, and can only be passed by position.

    Arguments
    ---------
    implementation : function
        Function that implements the operation on NumPy array without
        overrides when called like ``implementation(*args, **kwargs)``.
    public_api : function
        Function exposed by NumPy's public API originally called like
        ``public_api(*args, **kwargs)`` on which arguments are now being
        checked.
    relevant_args : iterable
        Iterable of arguments to check for __array_function__ methods.
    args : tuple
        Arbitrary positional arguments originally passed into ``public_api``.
    kwargs : dict
        Arbitrary keyword arguments originally passed into ``public_api``.

    Returns
    -------
    Result from calling ``implementation()`` or an ``__array_function__``
    method, as appropriate.

    Raises
    ------
    TypeError : if no implementation is found.
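
To make the docstring's arguments more concrete, the following is a simplified pure-Python model of the dispatch step. It is a sketch under stated assumptions, not NumPy's actual C implementation: it skips details such as subclass ordering, and the names implement_array_function_model, total, _total_impl, and Logged are hypothetical.

```python
def implement_array_function_model(implementation, public_api,
                                   relevant_args, args, kwargs):
    """Simplified model of the dispatch logic (not NumPy's real code)."""
    # Collect one overriding argument per distinct type, in argument order.
    overloaded_types = []
    overloaded_args = []
    for arg in relevant_args:
        arg_type = type(arg)
        if (arg_type not in overloaded_types
                and hasattr(arg_type, "__array_function__")):
            overloaded_types.append(arg_type)
            overloaded_args.append(arg)

    # No overrides found: fall back to the default implementation.
    if not overloaded_args:
        return implementation(*args, **kwargs)

    # Otherwise try each override until one handles the call.
    types = tuple(overloaded_types)
    for arg in overloaded_args:
        result = arg.__array_function__(public_api, types, args, kwargs)
        if result is not NotImplemented:
            return result
    raise TypeError(f"no implementation found for {public_api.__name__}")


def _total_impl(values):
    # Hypothetical default implementation of a "public" function.
    return sum(values)


def total(values):
    # Hypothetical public function: every call goes through the dispatcher.
    return implement_array_function_model(
        _total_impl, total, (values,), (values,), {})


class Logged:
    """Hypothetical container that overrides dispatched functions."""

    def __init__(self, items):
        self.items = list(items)

    def __array_function__(self, func, types, args, kwargs):
        unwrapped = tuple(a.items if isinstance(a, Logged) else a for a in args)
        return ("override", func.__name__, func(*unwrapped, **kwargs))


plain = total([1, 2, 3])            # no hook on list: default implementation
hooked = total(Logged([1, 2, 3]))   # hook found: Logged handles the call
print(plain, hooked)
```

Even in this toy version, every call to total pays for the attribute lookups on its arguments before any real work starts, which is the kind of per-call overhead the answer refers to.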
