Techniques for working with large Numpy arrays?

Problem Description

There are times when you have to perform many intermediate operations on one, or more, large Numpy arrays. This can quickly result in MemoryErrors. In my research so far, I have found that Pickling (Pickle, cPickle, PyTables, etc.) and gc.collect() are ways to mitigate this. I was wondering if there are any other techniques experienced programmers use when dealing with large quantities of data (other than removing redundancies in your strategy/code, of course).
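
For concreteness, here is a minimal sketch of the kind of mitigation mentioned above: spilling an intermediate array to disk and asking the garbage collector to reclaim its memory. The file name and array size are just placeholders.

```python
import gc
import pickle

import numpy as np

# A large intermediate result we only need again later (placeholder data, ~200 MB of float64).
intermediate = np.random.rand(5000, 5000)

# Spill it to disk (np.save would work too and avoids pickle overhead).
with open("intermediate.pkl", "wb") as f:
    pickle.dump(intermediate, f, protocol=pickle.HIGHEST_PROTOCOL)

# Drop the in-memory reference and ask the garbage collector to reclaim it.
del intermediate
gc.collect()

# ... do other memory-hungry work here ...

# Reload it only when it is actually needed again.
with open("intermediate.pkl", "rb") as f:
    intermediate = pickle.load(f)
```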

Also, if there's one thing I'm sure of, it's that nothing is free. With some of these techniques, what are the trade-offs (i.e., speed, robustness, etc.)?

Recommended Answer

I feel your pain... You sometimes end up storing several times the size of your array in values you will later discard. When processing one item in your array at a time, this is irrelevant, but can kill you when vectorizing.

I'll use an example from work for illustration purposes. I recently coded the algorithm described here using numpy. It is a color map algorithm, which takes an RGB image, and converts it into a CMYK image. The process, which is repeated for every pixel, is as follows:

  1. Use the most significant 4 bits of each RGB value as indices into a three-dimensional look-up table. This determines the CMYK values for the 8 vertices of a cube within the LUT.
  2. Use the least significant 4 bits of each RGB value to interpolate within that cube, based on the vertex values from the previous step. The most efficient way of doing this requires computing 16 uint8 arrays the size of the image being processed. For a 24-bit RGB image, that is equivalent to needing 6x the storage of the image in order to process it (see the indexing sketch after this list).
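
As a rough illustration of step 1 and the start of step 2, here is a minimal numpy sketch of splitting each 8-bit channel into its high and low nibbles and gathering one cube corner from a LUT. The array names and the randomly filled `lut` are purely illustrative, not the actual algorithm's data.

```python
import numpy as np

# Hypothetical 8-bit RGB image, shape (H, W, 3).
rgb = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)

# Most significant 4 bits of each channel -> indices into a 16x16x16 LUT.
idx = rgb >> 4                      # values 0..15, still uint8

# Least significant 4 bits -> interpolation weights within the selected cube.
frac = rgb & 0x0F                   # values 0..15, still uint8

# Hypothetical LUT mapping (r, g, b) cube corners to CMYK values.
lut = np.random.randint(0, 256, size=(16, 16, 16, 4), dtype=np.uint8)

# CMYK values at the "lower" corner of each pixel's cube; the full algorithm
# would also gather the other 7 corners and blend them using `frac`.
corner = lut[idx[..., 0], idx[..., 1], idx[..., 2]]   # shape (H, W, 4)
```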

A couple of things you can do to handle this:

Maybe you cannot process a 1,000x1,000 array in a single pass. But if you can do it with a python for loop iterating over 10 arrays of 100x1,000, it is still going to beat a python iterator over 1,000,000 items by a very wide margin! It's going to be slower, yes, but not by that much.
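
A minimal sketch of that chunked approach, with `np.sqrt(chunk) * 2.0` standing in for whatever vectorized work you actually need to do:

```python
import numpy as np

data = np.random.rand(1000, 1000)
out = np.empty_like(data)

chunk_rows = 100                      # process 10 slabs of 100x1,000 instead of all at once
for start in range(0, data.shape[0], chunk_rows):
    stop = start + chunk_rows
    chunk = data[start:stop]
    # Stand-in for the real vectorized work; only this chunk's
    # intermediates live in memory at any one time.
    out[start:stop] = np.sqrt(chunk) * 2.0
```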

This relates directly to my interpolation example above, and is harder to come across, although it is worth keeping an eye open for. Because I am interpolating on a three-dimensional cube with 4 bits in each dimension, there are only 16x16x16 possible outcomes, which can be stored in 16 arrays of 16x16x16 bytes. So I can precompute them and store them using 64KB of memory, and look up the values one by one for the whole image, rather than redoing the same operations for every pixel at huge memory cost. This already pays off for images as small as 64x64 pixels, and basically allows processing images with 6x the number of pixels without having to subdivide the array.
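
A minimal sketch of the same precompute-then-look-up pattern in a generic form; `expensive_op` is a made-up stand-in for the real per-value computation:

```python
import numpy as np

def expensive_op(r, g, b):
    # Stand-in for the costly per-pixel computation; here it just mixes the inputs.
    return (r.astype(np.uint16) * g + b) % 256

# Precompute every possible outcome for 4-bit inputs: 16x16x16 entries, a few KB.
r, g, b = np.indices((16, 16, 16), dtype=np.uint8)
table = expensive_op(r, g, b).astype(np.uint8)

# Apply to a whole image with a single fancy-indexing lookup instead of
# recomputing the operation for every pixel.
rgb = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)
nib = rgb >> 4                                  # 4-bit indices per channel
result = table[nib[..., 0], nib[..., 1], nib[..., 2]]
```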

If your intermediate values can fit in a single uint8, don't use an array of int32s! This can turn into a nightmare of mysterious errors due to silent overflows, but if you are careful, it can save a lot of resources.
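
A small sketch of the trade-off: the same array takes a quarter of the memory as uint8, but arithmetic that exceeds 255 wraps around silently, so risky steps may need a temporary upcast:

```python
import numpy as np

a32 = np.zeros((1000, 1000), dtype=np.int32)
a8 = np.zeros((1000, 1000), dtype=np.uint8)
print(a32.nbytes, a8.nbytes)        # 4000000 vs 1000000 bytes

# The pitfall: uint8 arithmetic wraps around silently instead of raising.
x = np.array([200], dtype=np.uint8)
print(x + x)                        # [144], not [400] -- i.e. 400 % 256

# One safe pattern: upcast just for the risky step, then narrow again.
y = (x.astype(np.uint16) + x).clip(0, 255).astype(np.uint8)
print(y)                            # [255]
```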
