Load just part of an image in Python


Question

This might be a silly question, but...

I have several thousand images that I would like to load into Python and then convert into numpy arrays. Obviously this goes a little slowly. But, I am actually only interested in a small portion of each image. (The same portion, just 100x100 pixels in the center of the image.)

Is there any way to load just part of the image to make things go faster?

Here is some sample code where I generate some sample images, save them, and load them back in.

import time

import numpy as np
import matplotlib.pyplot as plt
from PIL import Image  # Pillow; the bare "import Image" only worked with the old PIL

# Generate sample images
num_images = 5

for i in range(num_images):
    Z = np.random.rand(2000, 2000)
    print('saving %i' % i)
    plt.imsave('%03i.png' % i, Z)

# Load the images
for i in range(num_images):
    t = time.time()

    im = Image.open('%03i.png' % i)
    w, h = im.size
    # 100x100 box around the image center
    # (the box was previously anchored at the bottom-right corner)
    imc = im.crop((w // 2 - 50, h // 2 - 50, w // 2 + 50, h // 2 + 50))

    print('Time to open: %.4f seconds' % (time.time() - t))

    # Convert them to numpy arrays
    data = np.array(imc)

Solution

Save your files as uncompressed 24-bit BMPs. These store pixel data in a very regular way. Check out the "Image Data" portion of this diagram from Wikipedia. Note that most of the complexity in the diagram is just from the headers:

For example, let's say you are storing a 2x2 image whose first row is blue, green and whose second row is red, white (the image was originally shown here, zoomed in):

This is what the pixel data section looks like, if it's stored as a 24-bit uncompressed BMP. Note that the data is stored bottom-up, for some reason, and in BGR form instead of RGB, so the first line in the file is the bottom-most line of the image, the second line is the second-bottom-most, etc:

00 00 FF    FF FF FF    00 00
FF 00 00    00 FF 00    00 00

That data is explained as follows:

           |  First column  |  Second Column  |  Padding
-----------+----------------+-----------------+-----------
Second Row |  00 00 FF      |  FF FF FF       |  00 00
-----------+----------------+-----------------+-----------
First Row  |  FF 00 00      |  00 FF 00       |  00 00
-----------+----------------+-----------------+-----------

or:

           |  First column  |  Second Column  |  Padding
-----------+----------------+-----------------+-----------
Second Row |  red           |  white          |  00 00
-----------+----------------+-----------------+-----------
First Row  |  blue          |  green          |  00 00
-----------+----------------+-----------------+-----------

The padding is there to pad the row size to a multiple of 4 bytes.
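
As a quick check of the layout described above, here is a minimal sketch that decodes the 16 example bytes back into RGB pixels. The function name `decode_rows` is made up for illustration, and the code assumes exactly the layout shown: 24 bits per pixel, bottom-up rows, BGR byte order, rows padded to 4 bytes.

```python
# The 16 bytes of pixel data from the 2x2 example, in file order.
pixel_data = bytes.fromhex(
    "0000FF FFFFFF 0000"   # stored first: bottom image row (red, white) + 2 padding bytes
    "FF0000 00FF00 0000"   # stored second: top image row (blue, green) + 2 padding bytes
)

width, height, bytes_per_pixel = 2, 2, 3
row_size = (width * bytes_per_pixel + 3) // 4 * 4   # 6 bytes rounded up to 8


def decode_rows(data, width, height, row_size):
    """Return the image rows top-to-bottom, each a list of (R, G, B) tuples."""
    rows = []
    for y in range(height):
        # Image row y (0 = top) sits (height - 1 - y) rows into the data.
        start = (height - 1 - y) * row_size
        row = []
        for x in range(width):
            b, g, r = data[start + 3 * x:start + 3 * x + 3]   # stored as B, G, R
            row.append((r, g, b))
        rows.append(row)
    return rows


pixels = decode_rows(pixel_data, width, height, row_size)
for row in pixels:
    print(row)
```

The padding bytes are never read, because each row starts at a multiple of `row_size` and only `width * 3` bytes of it are consumed.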


So, all you have to do is implement a reader for this particular file format, and then calculate the byte offset of where you have to start and stop reading each row:

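def calc_bytes_per_row(width, bytes_per_pixel):
    # Each stored row is padded up to a multiple of 4 bytes
    res = width * bytes_per_pixel
    if res % 4 != 0:
        res += 4 - res % 4
    return res

def calc_row_offsets(pixel_array_offset, bmp_width, bmp_height, x, y, row_width):
    # Returns the (start, end) file offsets of row_width pixels beginning at
    # column x of image row y, where y = 0 is the top row of the image
    if x + row_width > bmp_width:
        raise ValueError("This is only for calculating offsets within a row")

    bytes_per_row = calc_bytes_per_row(bmp_width, 3)  # 3 bytes per 24-bit pixel
    # Rows are stored bottom-up, so image row y is (bmp_height - y - 1) rows into the pixel array
    whole_row_offset = pixel_array_offset + bytes_per_row * (bmp_height - y - 1)
    start_row_offset = whole_row_offset + x * 3
    end_row_offset = start_row_offset + row_width * 3
    return (start_row_offset, end_row_offset)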
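
The `pixel_array_offset`, `bmp_width` and `bmp_height` arguments all come from the file's header. The following is a rough sketch of one way to read them with the standard `struct` module. The function name `read_bmp_header` is hypothetical, and the code assumes the common 40-byte (or larger) BITMAPINFOHEADER layout and rejects anything that is not an uncompressed, bottom-up, 24-bit file.

```python
import io
import struct


def read_bmp_header(f):
    """Return (pixel_array_offset, width, height) for an uncompressed 24-bit BMP."""
    f.seek(0)
    header = f.read(34)
    if len(header) < 34 or header[:2] != b"BM":
        raise ValueError("not a BMP file")

    # Bytes 10-13: offset of the pixel array from the start of the file.
    (pixel_array_offset,) = struct.unpack_from("<I", header, 10)
    # Bytes 14-17: size of the DIB header that follows the 14-byte file header.
    (dib_size,) = struct.unpack_from("<I", header, 14)
    if dib_size < 40:
        raise ValueError("unsupported (older) BMP header variant")
    # Bytes 18-25: width and height as signed 32-bit integers.
    width, height = struct.unpack_from("<ii", header, 18)
    # Bytes 28-33: bits per pixel (16-bit) and compression method (32-bit).
    bits_per_pixel, compression = struct.unpack_from("<HI", header, 28)

    if bits_per_pixel != 24 or compression != 0:
        raise ValueError("only uncompressed 24-bit BMPs are handled")
    if height <= 0:
        raise ValueError("top-down BMPs (negative height) are not handled")
    return pixel_array_offset, width, height


# Build the headers of a 2x2 bitmap in memory to exercise the parser.
file_header = struct.pack("<2sIHHI", b"BM", 54 + 16, 0, 0, 54)
dib_header = struct.pack("<IiiHHIIiiII", 40, 2, 2, 1, 24, 0, 16, 2835, 2835, 0, 0)
sample = io.BytesIO(file_header + dib_header + bytes(16))

print(read_bmp_header(sample))
```

With a real file, the same function would be called on a file object opened in `"rb"` mode.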

Then you just have to read and process the bytes at those offsets. For example, say you want to read the 400x400 chunk whose top-left corner is at position (500, 500) in a 10000x10000 bitmap:

def process_row_bytes(row_bytes):
    ...  # some efficient way to process the bytes

bmpf = open(..., "rb")
pixel_array_offset = ...  # extract from bmp header
bmp_width = 10000
bmp_height = 10000
start_x = 500
start_y = 500
end_x = 500 + 400
end_y = 500 + 400

for cur_y in range(start_y, end_y):  # xrange on Python 2
    # Byte range covering just the wanted columns of this row
    start, end = calc_row_offsets(pixel_array_offset,
                                  bmp_width, bmp_height,
                                  start_x, cur_y,
                                  end_x - start_x)
    bmpf.seek(start)
    cur_row_bytes = bmpf.read(end - start)
    process_row_bytes(cur_row_bytes)
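
To see the seek-and-read loop working end to end, here is a small self-contained sketch that runs it against a tiny bitmap built in memory. The helper name `read_crop` is made up for illustration, and the toy buffer holds only the pixel array (no headers), so the pixel array offset is 0 here; with a real file it would come from the header.

```python
import io


def read_crop(f, pixel_array_offset, bmp_width, bmp_height, x, y, crop_w, crop_h):
    """Return the crop's rows top-to-bottom as raw BGR bytes, reading only those bytes."""
    row_size = (bmp_width * 3 + 3) // 4 * 4   # rows are padded to a multiple of 4 bytes
    rows = []
    for cur_y in range(y, y + crop_h):
        # Rows are stored bottom-up, so image row cur_y is counted from the end.
        start = pixel_array_offset + row_size * (bmp_height - cur_y - 1) + x * 3
        f.seek(start)
        rows.append(f.read(crop_w * 3))
    return rows


# Toy 5x4 pixel array in which pixel (x, y) is stored as B=x, G=y, R=0.
width, height = 5, 4
row_size = (width * 3 + 3) // 4 * 4           # 15 bytes rounded up to 16
stored_rows = []
for y in reversed(range(height)):             # the bottom image row is written first
    row = b"".join(bytes((x, y, 0)) for x in range(width))
    stored_rows.append(row.ljust(row_size, b"\x00"))
pixel_array = io.BytesIO(b"".join(stored_rows))

# Read the 2x2 block whose top-left corner is at (1, 1).
crop = read_crop(pixel_array, 0, width, height, x=1, y=1, crop_w=2, crop_h=2)
for row in crop:
    print(list(row))
```

Because every pixel encodes its own coordinates, it is easy to confirm by eye that the bytes returned belong to columns 1-2 of rows 1-2.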

Note that how you process the bytes matters. You can probably do something clever with PIL by dumping the raw pixel data straight into it, but I'm not entirely sure. If you process the bytes inefficiently, the whole approach might not be worth it. If speed is a huge concern, you might consider writing it with Pyrex (or its successor, Cython), or implementing the above in C and just calling it from Python.
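
Since the goal in the question is a numpy array, one possible "efficient way to process the bytes" is to collect the row chunks and hand them to `numpy.frombuffer` in a single call instead of looping over pixels in Python. This is only a sketch under the assumptions used throughout (24-bit BGR data, rows already read top-to-bottom); the function name `rows_to_rgb_array` is made up for illustration.

```python
import numpy as np


def rows_to_rgb_array(row_chunks, crop_width):
    """Turn a list of raw BGR row chunks into a (rows, crop_width, 3) uint8 RGB array."""
    raw = np.frombuffer(b"".join(row_chunks), dtype=np.uint8)
    bgr = raw.reshape(len(row_chunks), crop_width, 3)
    # Reverse the last axis to go from BGR to RGB; copy() yields a writable, contiguous array.
    return bgr[:, :, ::-1].copy()


# The 2x2 example image, one chunk per row, top row first, padding already excluded.
chunks = [
    bytes.fromhex("FF0000 00FF00"),   # blue, green (stored as BGR)
    bytes.fromhex("0000FF FFFFFF"),   # red, white (stored as BGR)
]
arr = rows_to_rgb_array(chunks, 2)
print(arr.shape)
print(arr[0, 0], arr[1, 0])
```

Whether this actually beats decoding a compressed PNG depends on disk speed and image size, so it is worth timing both on your own data.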
