Fastest approach to read thousands of images into one big numpy array


Question


I'm trying to find the fastest approach to read a bunch of images from a directory into a numpy array. My end goal is to compute statistics such as the max, min, and nth percentile of the pixels from all these images. This is straightforward and fast when the pixels from all the images are in one big numpy array, since I can use the inbuilt array methods such as .max and .min, and the np.percentile function.
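For instance, once all pixels live in one array, the whole computation reduces to a few calls. A minimal sketch, with random uint16 data standing in for the images (shapes and values are illustrative only):

```python
import numpy as np

# Random uint16 data standing in for 25 images of 512x512 pixels.
imgs = np.random.randint(0, 65535, (25, 512, 512), dtype=np.uint16)

lo = imgs.min()                  # global minimum over all images
hi = imgs.max()                  # global maximum over all images
p95 = np.percentile(imgs, 95)    # 95th percentile of all pixels
```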


Below are a few example timings with 25 tiff-images (512x512 pixels). These benchmarks are from using %%timeit in a jupyter-notebook. The differences are too small to have any practical implications for just 25 images, but I am intending to read thousands of images in the future.

# Imports
import os
import skimage.io as io
import numpy as np





  1. Appending to a list

%%timeit
imgs = []    
img_path = '/path/to/imgs/'
for img in os.listdir(img_path):    
    imgs.append(io.imread(os.path.join(img_path, img)))    
## 32.2 ms ± 355 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)



  2. Using a dictionary

    %%timeit    
    imgs = {}    
    img_path = '/path/to/imgs/'    
    for num, img in enumerate(os.listdir(img_path)):    
        imgs[num] = io.imread(os.path.join(img_path, img))    
    ## 33.3 ms ± 402 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
    



    For the list and dictionary approaches above, I tried replacing the loop with the respective comprehension, with similar results time-wise. I also tried preallocating the dictionary keys, with no significant difference in the time taken. To get the images from a list into a big array, I would use np.concatenate(imgs), which only takes ~1 ms.
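To make the list-to-array step concrete, here is a small sketch (random arrays stand in for the io.imread results); np.concatenate joins the images along the first axis, while np.stack would instead add a new leading axis:

```python
import numpy as np

# 25 random uint16 arrays standing in for the images read from disk.
imgs = [np.random.randint(0, 255, (512, 512)).astype('uint16')
        for _ in range(25)]

big = np.concatenate(imgs)   # joins along axis 0 -> shape (512*25, 512)
cube = np.stack(imgs)        # new leading axis  -> shape (25, 512, 512)
```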



    3. Preallocating a numpy array along the first dimension

    %%timeit    
    imgs = np.ndarray((512*25,512), dtype='uint16')    
    img_path = '/path/to/imgs/'    
    for num, img in enumerate(os.listdir(img_path)):    
        imgs[num*512:(num+1)*512, :] = io.imread(os.path.join(img_path, img))    
    ## 33.5 ms ± 804 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
    



  4. Preallocating a numpy array along the third dimension

    %%timeit    
    imgs = np.ndarray((512,512,25), dtype='uint16')    
    img_path = '/path/to/imgs/'    
    for num, img in enumerate(os.listdir(img_path)):    
        imgs[:, :, num] = io.imread(os.path.join(img_path, img))    
    ## 71.2 ms ± 2.22 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
    



    I initially thought the numpy preallocation approaches would be faster, since there is no dynamic variable expansion in the loop, but this does not seem to be the case. The approach I find the most intuitive is the last one, where each image occupies a separate dimension along the third axis of the array, but this is also the slowest. The additional time taken is not due to the preallocation itself, which only takes ~1 ms.

    I have three questions:



    1. Why are the numpy preallocation approaches not faster than the dictionary and list solutions?
    2. What is the fastest way to read thousands of images into one big numpy array?
    3. Could I benefit from looking outside numpy and scikit-image, for an even faster module for reading in images? I tried plt.imread(), but the scikit-image.io module is faster.


    Answer

    Part A: Accessing and assigning NumPy arrays


    Going by the way elements are stored in row-major order for NumPy arrays, you are doing the right thing when storing those elements along the last axis per iteration. These would occupy contiguous memory locations and as such would be the most efficient for accessing and assigning values into. Thus initializations like np.ndarray((512*25,512), dtype='uint16') or np.ndarray((25,512,512), dtype='uint16') would work the best as also mentioned in the comments.


    After compiling those as funcs for testing on timings and feeding in random arrays instead of images -

    N = 512
    n = 25
    a = np.random.randint(0,255,(N,N))
    
    def app1():
        imgs = np.empty((N,N,n), dtype='uint16')
        for i in range(n):
            imgs[:,:,i] = a
            # Storing along the first two axes
        return imgs
    
    def app2():
        imgs = np.empty((N*n,N), dtype='uint16')
        for num in range(n):    
            imgs[num*N:(num+1)*N, :] = a
            # Storing along the last axis
        return imgs
    
    def app3():
        imgs = np.empty((n,N,N), dtype='uint16')
        for num in range(n):    
            imgs[num,:,:] = a
            # Storing along the last two axes
        return imgs
    
    def app4():
        imgs = np.empty((N,n,N), dtype='uint16')
        for num in range(n):    
            imgs[:,num,:] = a
            # Storing along the first and last axes
        return imgs
    

    Timings -

    In [45]: %timeit app1()
        ...: %timeit app2()
        ...: %timeit app3()
        ...: %timeit app4()
        ...: 
    10 loops, best of 3: 28.2 ms per loop
    100 loops, best of 3: 2.04 ms per loop
    100 loops, best of 3: 2.02 ms per loop
    100 loops, best of 3: 2.36 ms per loop
    


    Those timings confirm the performance theory proposed at the start, though I expected the timings for the last setup to have timings in between the ones for app3 and app1, but maybe the effect of going from last to the first axis for accessing and assigning isn't linear. More investigations on this one could be interesting (follow up question here).
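One way to probe this further is to look at the strides of each preallocated array, which show how far apart (in bytes) consecutive elements sit along each axis; the larger the strides of the axes being assigned per image, the more scattered the writes. A small sketch:

```python
import numpy as np

N, n = 512, 25

# Strides in bytes (uint16 itemsize is 2) for each preallocation.
print(np.empty((N, N, n), dtype='uint16').strides)   # app1: (25600, 50, 2)
print(np.empty((N * n, N), dtype='uint16').strides)  # app2: (1024, 2)
print(np.empty((n, N, N), dtype='uint16').strides)   # app3: (524288, 1024, 2)
```

In app1 every assigned element is 50 bytes from its neighbour, whereas app2 and app3 write each image as one contiguous block.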


    To clarify schematically, consider that we are storing image arrays, denoted by x (image 1) and o (image 2); we would have:

    App1:

    [[[x o]
      [x o]
      [x o]
      [x o]
      [x o]]
    
     [[x o]
      [x o]
      [x o]
      [x o]
      [x o]]
    
     [[x o]
      [x o]
      [x o]
      [x o]
      [x o]]]
    


    Thus, in memory space, it would be : [x,o,x,o,x,o..] following row-major order.

    App2 :

    [[x x x x x]
     [x x x x x]
     [x x x x x]
     [o o o o o]
     [o o o o o]
     [o o o o o]]
    


    Thus, in memory space, it would be : [x,x,x,x,x,x...o,o,o,o,o..].

    App3:

    [[[x x x x x]
      [x x x x x]
      [x x x x x]]
    
     [[o o o o o]
      [o o o o o]
      [o o o o o]]]
    


    Thus, in memory space, it would be the same as the previous one.
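These layouts are easy to verify in code. Below is a small sketch with two 3x5 "images" (all 1s standing in for x, all 2s for o); since ravel() walks the memory in row-major order, it directly exposes the interleaving:

```python
import numpy as np

x = np.full((3, 5), 1)   # "image 1"
o = np.full((3, 5), 2)   # "image 2"

app1_like = np.stack([x, o], axis=-1)  # (3, 5, 2): x and o interleave
app3_like = np.stack([x, o], axis=0)   # (2, 3, 5): x block, then o block

print(app1_like.ravel()[:6])  # alternating: [1 2 1 2 1 2]
print(app3_like.ravel()[:6])  # blocked:     [1 1 1 1 1 1]
```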

    Part B: Reading images from disk as arrays


    Now, for the part on reading images, I have seen OpenCV's imread to be much faster.


    As a test, I downloaded Mona Lisa's image from the wiki page and tested performance on image reading -

    import cv2 # OpenCV
    
    In [521]: %timeit io.imread('monalisa.jpg')
    100 loops, best of 3: 3.24 ms per loop
    
    In [522]: %timeit cv2.imread('monalisa.jpg')
    100 loops, best of 3: 2.54 ms per loop
    
