Is ndarray access faster than recarray access?

Question

I was able to copy my recarray data to an ndarray, do some calculations, and return the ndarray with updated values.

Then, I discovered the append_fields() capability in numpy.lib.recfunctions, and thought it would be a lot smarter to simply append 2 fields to my original recarray to hold my calculated values.

When I did this, I found the operation was much, much slower. I didn't even need to time it: the ndarray-based process takes a few seconds, compared with more than a minute for the recarray-based one, and my test arrays are small (fewer than 10,000 rows).

Is this typical? Is ndarray access really that much faster than recarray access? I expected some performance degradation from access by field name, but not this much.

Recommended answer

Updated 15-November-2018
I expanded my timing tests to clarify differences in performance for ndarray, structured array, recarray and masked array (type of record array?). There are subtle differences in each. See discussion here:
numpy-discussion:structured-arrays-recarrays-and-record-arrays

Here are the results of my performance tests. I built a very simple example (using one of my HDF5 data sets) to compare performance with the same data stored in four types of arrays: ndarray, structured array, recarray and masked array. After the arrays are constructed, each is passed to a function that simply loops through every row and extracts 12 values from it. The functions are called from timeit with a single pass (number=1). This test measures only array read access and avoids all other calculations.
Results for 9,000 rows (timeit output, in seconds):

for ndarray: 0.034137165047070615
for structured array: 0.1306827116913577
for recarray: 0.446010040784266
for masked array: 31.33269560998199
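
A test of this kind might look roughly like the sketch below. It is not the original benchmark: it uses synthetic random data instead of the HDF5 data set, a much smaller row count so it finishes quickly, and hypothetical field names (f0 to f11). Absolute timings will differ by machine and NumPy version, so only the relative ordering is of interest.

```python
import timeit
import numpy as np

# Hypothetical sizes and field names; the original test used 9,000 rows
# read from an HDF5 data set.
N_ROWS, N_COLS = 500, 12
names = [f"f{i}" for i in range(N_COLS)]

# The same values stored four ways.
plain = np.random.default_rng(0).random((N_ROWS, N_COLS))
structured = np.zeros(N_ROWS, dtype=[(n, "f8") for n in names])
for i, n in enumerate(names):
    structured[n] = plain[:, i]
rec = structured.view(np.recarray)
# Masked structured array with an explicit all-False mask (nothing masked).
mask = np.zeros(N_ROWS, dtype=[(n, bool) for n in names])
masked = np.ma.MaskedArray(structured, mask=mask)

def read_by_index(arr):
    """Loop over rows of a 2-D ndarray and read 12 values by position."""
    total = 0.0
    for row in arr:
        for i in range(N_COLS):
            total += row[i]
    return total

def read_by_name(arr):
    """Loop over rows of a structured-type array and read 12 fields by name."""
    total = 0.0
    for row in arr:
        for n in names:
            total += row[n]
    return total

# Single pass per array type, as in the post (number=1).
timings = {
    "ndarray": timeit.timeit(lambda: read_by_index(plain), number=1),
    "structured array": timeit.timeit(lambda: read_by_name(structured), number=1),
    "recarray": timeit.timeit(lambda: read_by_name(rec), number=1),
    "masked array": timeit.timeit(lambda: read_by_name(masked), number=1),
}
for label, seconds in timings.items():
    print(f"for {label}: {seconds:.6f}")
```

The row-by-row loop is deliberate here: it mirrors the access pattern described in the post, which is where the per-element overhead of each array type shows up.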

Based on this test, access performance decreases with each type. Access to the structured array and the recarray is about 4x and 13x slower, respectively, than ndarray access (though all three take only a fraction of a second). Masked array access, however, is roughly 900x slower than ndarray access, nearly three orders of magnitude. That explains the seconds-versus-minutes difference I see in my complete example. Hopefully this data is useful to others who encounter this issue.
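
One plausible way a masked array ends up in this workflow (an assumption on my part, since the post does not show its code) is that numpy.lib.recfunctions.append_fields() defaults to usemask=True and therefore returns a masked array unless told otherwise. The sketch below uses made-up field names (x, y, z) to show the return type under each setting.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Hypothetical base array and new column, for illustration only.
base = np.zeros(5, dtype=[("x", "f8"), ("y", "f8")])
extra = np.arange(5, dtype="f8")

# Default call: usemask=True, so the result is a masked array.
with_mask = rfn.append_fields(base, "z", extra)

# usemask=False returns a plain structured ndarray instead.
no_mask = rfn.append_fields(base, "z", extra, usemask=False)

# usemask=False together with asrecarray=True returns a recarray.
as_rec = rfn.append_fields(base, "z", extra, usemask=False, asrecarray=True)

for label, arr in [("default", with_mask), ("usemask=False", no_mask),
                   ("asrecarray=True", as_rec)]:
    print(label, type(arr).__name__, arr.dtype.names)
```

If the slow path in a given program does come from this default, passing usemask=False should bring access times back to the structured array or recarray range measured above.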
