Bounding box of numpy array

Question

Suppose you have a 2D numpy array with some random values surrounded by zeros.

Example with a "tilted rectangle":

import numpy as np
from skimage import transform

img1 = np.zeros((100,100))
img1[25:75,25:75] = 1.
img2 = transform.rotate(img1, 45)

Now I want to find the smallest bounding rectangle for all the nonzero data. For example:

a = np.where(img2 != 0)
bbox = img2[np.min(a[0]):np.max(a[0])+1, np.min(a[1]):np.max(a[1])+1]

What would be the fastest way to achieve this result? I am sure there is a better way, since np.where takes quite a while on, for example, 1000x1000 data sets.

It should also work in 3D.

Recommended answer

You can roughly halve the execution time by using np.any to reduce the rows and columns that contain non-zero values to 1D vectors, rather than finding the indices of all non-zero values with np.where:

def bbox1(img):
    a = np.where(img != 0)
    bbox = np.min(a[0]), np.max(a[0]), np.min(a[1]), np.max(a[1])
    return bbox

def bbox2(img):
    rows = np.any(img, axis=1)
    cols = np.any(img, axis=0)
    rmin, rmax = np.where(rows)[0][[0, -1]]
    cmin, cmax = np.where(cols)[0][[0, -1]]

    return rmin, rmax, cmin, cmax
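
One caveat worth illustrating: bbox2 raises an IndexError when the array contains no non-zero values, because np.where(rows)[0] is then empty. The following is a minimal sketch of a guarded variant together with the crop itself. The name bbox2_safe and the choice to return None for an all-zero input are assumptions made for illustration, not part of the answer.

```python
import numpy as np

def bbox2_safe(img):
    """Inclusive (rmin, rmax, cmin, cmax) of the non-zero region, or None."""
    rows = np.any(img, axis=1)   # True for rows holding a non-zero value
    cols = np.any(img, axis=0)   # True for columns holding a non-zero value
    if not rows.any():           # all-zero input: no bounding box exists
        return None
    rmin, rmax = np.where(rows)[0][[0, -1]]
    cmin, cmax = np.where(cols)[0][[0, -1]]
    return int(rmin), int(rmax), int(cmin), int(cmax)

img = np.zeros((100, 100))
img[25:75, 30:60] = 1.0

rmin, rmax, cmin, cmax = bbox2_safe(img)
# the bounds are inclusive, so add 1 to the upper limits when slicing
cropped = img[rmin:rmax + 1, cmin:cmax + 1]
print(cropped.shape)  # (50, 30)
```

Returning inclusive bounds mirrors bbox2; a half-open convention would work equally well as long as the slicing matches it.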

Some benchmarks:

%timeit bbox1(img2)
10000 loops, best of 3: 63.5 µs per loop

%timeit bbox2(img2)
10000 loops, best of 3: 37.1 µs per loop
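
The %timeit lines above need IPython. As a rough sketch, the same comparison can be run as a plain script with the standard timeit module, here on a 1000x1000 array like the one mentioned in the question. The array contents and repeat counts are arbitrary choices, and the absolute timings will vary by machine and numpy version.

```python
import timeit
import numpy as np

def bbox1(img):
    a = np.where(img != 0)
    return np.min(a[0]), np.max(a[0]), np.min(a[1]), np.max(a[1])

def bbox2(img):
    rows = np.any(img, axis=1)
    cols = np.any(img, axis=0)
    rmin, rmax = np.where(rows)[0][[0, -1]]
    cmin, cmax = np.where(cols)[0][[0, -1]]
    return rmin, rmax, cmin, cmax

big = np.zeros((1000, 1000))
big[250:750, 250:750] = 1.0

# both functions must agree before their speed is worth comparing
assert tuple(map(int, bbox1(big))) == tuple(map(int, bbox2(big)))

for fn in (bbox1, bbox2):
    # best of 3 runs of 20 calls each, reported per call
    best = min(timeit.repeat(lambda: fn(big), number=20, repeat=3)) / 20
    print(f"{fn.__name__}: {best * 1e3:.3f} ms per call")
```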

Extending this approach to the 3D case just involves performing the reduction along each pair of axes:

def bbox2_3D(img):

    r = np.any(img, axis=(1, 2))
    c = np.any(img, axis=(0, 2))
    z = np.any(img, axis=(0, 1))

    rmin, rmax = np.where(r)[0][[0, -1]]
    cmin, cmax = np.where(c)[0][[0, -1]]
    zmin, zmax = np.where(z)[0][[0, -1]]

    return rmin, rmax, cmin, cmax, zmin, zmax
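
A simple way to gain confidence in the 3D version is to compare it with a slow but obvious reference. The sketch below uses np.argwhere for that reference; bbox_reference is a hypothetical helper name, and the test volume is an arbitrary example.

```python
import numpy as np

def bbox2_3D(img):
    r = np.any(img, axis=(1, 2))
    c = np.any(img, axis=(0, 2))
    z = np.any(img, axis=(0, 1))
    rmin, rmax = np.where(r)[0][[0, -1]]
    cmin, cmax = np.where(c)[0][[0, -1]]
    zmin, zmax = np.where(z)[0][[0, -1]]
    return rmin, rmax, cmin, cmax, zmin, zmax

def bbox_reference(img):
    # slow but obvious: min and max of every non-zero index along each axis
    idx = np.argwhere(img)
    lo, hi = idx.min(axis=0), idx.max(axis=0)
    return tuple(int(v) for pair in zip(lo, hi) for v in pair)

rng = np.random.default_rng(0)
vol = np.zeros((40, 50, 60))
# a strictly non-zero block that is only one voxel thick along the last axis
vol[10:20, 5:45, 30:31] = rng.random((10, 40, 1)) + 0.1

fast = tuple(int(v) for v in bbox2_3D(vol))
assert fast == bbox_reference(vol)
print(fast)  # (10, 19, 5, 44, 30, 30)
```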

It's easy to generalize this to N dimensions by using itertools.combinations to iterate over each unique combination of N - 1 axes to reduce over:

import itertools

def bbox2_ND(img):
    N = img.ndim
    out = []
    for ax in itertools.combinations(reversed(range(N)), N - 1):
        nonzero = np.any(img, axis=ax)
        out.extend(np.where(nonzero)[0][[0, -1]])
    return tuple(out)
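
To actually crop an N-dimensional array, the flat tuple of inclusive bounds can be turned into slice objects. This is a small sketch; bbox_to_slices is a hypothetical helper, not part of the answer, and it assumes the (min0, max0, min1, max1, ...) ordering that bbox2_ND returns.

```python
import itertools
import numpy as np

def bbox2_ND(img):
    N = img.ndim
    out = []
    for ax in itertools.combinations(reversed(range(N)), N - 1):
        nonzero = np.any(img, axis=ax)
        out.extend(np.where(nonzero)[0][[0, -1]])
    return tuple(out)

def bbox_to_slices(bbox):
    # pair up (min, max) per axis; max is inclusive, so the slice stops at max + 1
    return tuple(slice(int(lo), int(hi) + 1)
                 for lo, hi in zip(bbox[::2], bbox[1::2]))

vol = np.zeros((20, 30, 40))
vol[5:10, 8:20, 3:33] = 1.0

bbox = bbox2_ND(vol)
cropped = vol[bbox_to_slices(bbox)]
print(cropped.shape)  # (5, 12, 30)
```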

If you know the coordinates of the corners of the original bounding box, the angle of rotation, and the centre of rotation, you could get the coordinates of the transformed bounding box corners directly by computing the corresponding affine transformation matrix and dotting it with the input coordinates:

def bbox_rotate(bbox_in, angle, centre):

    rmin, rmax, cmin, cmax = bbox_in

    # bounding box corners in homogeneous coordinates, ordered (row, col, 1)
    # so that they match the (cr, cc) translations and the r, c unpacking below
    xyz_in = np.array(([[rmin, rmax, rmin, rmax],
                        [cmin, cmin, cmax, cmax],
                        [   1,    1,    1,    1]]))

    # translate centre to origin
    cr, cc = centre
    cent2ori = np.eye(3)
    cent2ori[:2, 2] = -cr, -cc

    # rotate about the origin
    theta = np.deg2rad(angle)
    rmat = np.eye(3)
    rmat[:2, :2] = np.array([[ np.cos(theta),-np.sin(theta)],
                             [ np.sin(theta), np.cos(theta)]])

    # translate from origin back to centre
    ori2cent = np.eye(3)
    ori2cent[:2, 2] = cr, cc

    # combine transformations (rightmost matrix is applied first)
    xyz_out = ori2cent.dot(rmat).dot(cent2ori).dot(xyz_in)

    r, c = xyz_out[:2]

    rmin = int(r.min())
    rmax = int(r.max())
    cmin = int(c.min())
    cmax = int(c.max())

    return rmin, rmax, cmin, cmax

For your small example array, this works out to be very slightly faster than using np.any:

%timeit bbox_rotate([25, 75, 25, 75], 45, (50, 50))
10000 loops, best of 3: 33 µs per loop

However, since the speed of this method is independent of the size of the input array, it can be quite a lot faster for larger arrays.
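
The corner transformation can be checked by hand on the example from the question. The sketch below is a compact restatement that works on centred 2D coordinates rather than homogeneous matrices. The name rotated_bbox is hypothetical, and rounding outwards with floor and ceil (instead of int() as in the function above) is an assumption made here so that the box never clips the rotated corners.

```python
import numpy as np

def rotated_bbox(bbox_in, angle, centre):
    rmin, rmax, cmin, cmax = bbox_in
    offset = np.array(centre, dtype=float).reshape(2, 1)   # (cr, cc) as a column

    # four corners in (row, col) order, shifted so the centre is the origin
    corners = np.array([[rmin, rmax, rmin, rmax],
                        [cmin, cmin, cmax, cmax]], dtype=float) - offset

    theta = np.deg2rad(angle)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])

    # rotate about the origin, then shift back to the centre
    r, c = rot @ corners + offset

    # round outwards so every rotated corner stays inside the box
    return (int(np.floor(r.min())), int(np.ceil(r.max())),
            int(np.floor(c.min())), int(np.ceil(c.max())))

# a 50x50 square rotated by 45 degrees about its centre: the half-width grows
# from 25 to 25 * sqrt(2), roughly 35.36
print(rotated_bbox([25, 75, 25, 75], 45, (50, 50)))  # (14, 86, 14, 86)
```

Only the four corners are touched, so the cost does not depend on the image size.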

Extending the transformation approach to 3D is slightly more complicated, in that the rotation now has three different components (one about the x-axis, one about the y-axis and one about the z-axis), but the basic method is the same:

def bbox_rotate_3d(bbox_in, angle_x, angle_y, angle_z, centre):

    rmin, rmax, cmin, cmax, zmin, zmax = bbox_in

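    # bounding box corners in homogeneous coordinates, ordered (row, col, z, 1)
    # so that they match the (cr, cc, cz) translations and the r, c, z unpacking
    # below; the "x", "y" and "z" rotations therefore act about the row, column
    # and z axes respectively
    xyzu_in = np.array(([[rmin, rmin, rmax, rmax, rmin, rmin, rmax, rmax],
                         [cmin, cmin, cmin, cmin, cmax, cmax, cmax, cmax],
                         [zmin, zmax, zmin, zmax, zmin, zmax, zmin, zmax],
                         [   1,    1,    1,    1,    1,    1,    1,    1]]))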

    # translate centre to origin
    cr, cc, cz = centre
    cent2ori = np.eye(4)
    cent2ori[:3, 3] = -cr, -cc, -cz

    # rotation about the x-axis
    theta = np.deg2rad(angle_x)
    rmat_x = np.eye(4)
    rmat_x[1:3, 1:3] = np.array([[ np.cos(theta),-np.sin(theta)],
                                 [ np.sin(theta), np.cos(theta)]])

    # rotation about the y-axis
    theta = np.deg2rad(angle_y)
    rmat_y = np.eye(4)
    rmat_y[[0, 0, 2, 2], [0, 2, 0, 2]] = (
        np.cos(theta), np.sin(theta), -np.sin(theta), np.cos(theta))

    # rotation about the z-axis
    theta = np.deg2rad(angle_z)
    rmat_z = np.eye(4)
    rmat_z[:2, :2] = np.array([[ np.cos(theta),-np.sin(theta)],
                               [ np.sin(theta), np.cos(theta)]])

    # translate from origin back to centre
    ori2cent = np.eye(4)
    ori2cent[:3, 3] = cr, cc, cz

    # combine transformations (rightmost matrix is applied first)
    tform = ori2cent.dot(rmat_z).dot(rmat_y).dot(rmat_x).dot(cent2ori)
    xyzu_out = tform.dot(xyzu_in)

    r, c, z = xyzu_out[:3]

    rmin = int(r.min())
    rmax = int(r.max())
    cmin = int(c.min())
    cmax = int(c.max())
    zmin = int(z.min())
    zmax = int(z.max())

    return rmin, rmax, cmin, cmax, zmin, zmax

I've essentially just modified the function above using the rotation matrix expressions from here. I haven't had time to write a test case yet, so use with caution.
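
In the absence of a proper test case, a few cheap invariants can at least catch sign and indexing mistakes in the composed matrix. The sketch below rebuilds the same translate, rotate, translate-back product with two hypothetical helpers (rot4 and transform_about_centre are illustrative names), taking the centre in the same axis order as the coordinates. It then checks that the rotation part is orthonormal, that the centre is a fixed point, and that zero angles give the identity.

```python
import numpy as np

def rot4(axis, angle_deg):
    """4x4 homogeneous rotation about coordinate axis 0, 1 or 2."""
    t = np.deg2rad(angle_deg)
    c, s = np.cos(t), np.sin(t)
    m = np.eye(4)
    # indices of the plane that rotates when the given axis is held fixed
    i, j = [(1, 2), (2, 0), (0, 1)][axis]
    m[i, i], m[i, j], m[j, i], m[j, j] = c, -s, s, c
    return m

def transform_about_centre(angles, centre):
    """Translate centre to origin, rotate about axes 0, 1, 2, translate back."""
    to_origin = np.eye(4)
    to_origin[:3, 3] = -np.asarray(centre, dtype=float)
    back = np.eye(4)
    back[:3, 3] = centre
    a0, a1, a2 = angles
    # rightmost matrix is applied first
    return back @ rot4(2, a2) @ rot4(1, a1) @ rot4(0, a0) @ to_origin

centre = (50.0, 40.0, 20.0)
tform = transform_about_centre((30, 45, 60), centre)

R = tform[:3, :3]
assert np.allclose(R @ R.T, np.eye(3))            # rotation part is orthonormal
assert np.isclose(np.linalg.det(R), 1.0)          # and has no reflection
fixed = tform @ np.array([*centre, 1.0])
assert np.allclose(fixed, [*centre, 1.0])         # the centre does not move
assert np.allclose(transform_about_centre((0, 0, 0), centre), np.eye(4))
```

These checks do not prove that the rotation sense matches any particular image library, only that the matrix is a rigid rotation about the stated centre.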
