Overlapping sliding window over an image using blockproc or im2col?

Question

I have to apply dct2 to small windows of my image, preferably by using an overlapping window.

I have found two functions in MATLAB that can make this possible: blockproc and im2col. I am having trouble understanding either and would appreciate some clarification.

blockproc can be used to implement my function on a sliding window using the BorderSize and TrimBorder arguments.

B = blockproc(A,[64,64],fun,'BorderSize',[5,5], 'TrimBorder', 'false');

I realize that this creates a block of [64 + 2*5, 64 + 2*5] and applies the function @fun on each block. But since I cannot step into my function @fun in the debugger to verify proper operation, I cannot be sure this is what I need. Is my code above correct for what I need? I know that I get a concatenated result in B, but it should be on an overlapping sliding block. Will this achieve what I need?

The second is im2col. im2col(A,[m n],block_type) will divide the block into m by n blocks and arrange them in columns, so each column is a block? If so, how is the overlapping controlled? And if each block is a column can I successfully apply the dct2 function on each column? Because I doubt it will take vectors as input?

Some clarification would be greatly appreciated.

Recommended answer

OK, this is quite a complex question. I'll try and break this up into separate parts and will answer each question separately.


blockproc can be used to implement my function on a sliding window using the BorderSize and TrimBorder arguments.

B = blockproc(A,[64,64],fun,'BorderSize',[5,5], 'TrimBorder', 'false');

I realize that this creates a block of [64 + 2*5, 64 + 2*5] and applies the function @fun on each block. But since I cannot step into my function @fun in the debugger to verify proper operation, I cannot be sure this is what I need. Is my code above correct for what I need? I know that I get a concatenated result in B, but it should be on an overlapping sliding block. Will this achieve what I need?

After experimenting with blockproc, I can confirm this is indeed correct, in that you can use it to get sliding neighbourhood processing working. However, you're going to need an additional flag, PadPartialBlocks. Its purpose is that when a block at the outer edge of the image cannot be extracted at the specified size, the partial block is zero-padded so that it conforms to the same size as the others. Here's a small example that gets this working with sliding windows. Suppose we have the matrix:

>> A = reshape(1:25,5,5)

A =

     1     6    11    16    21
     2     7    12    17    22
     3     8    13    18    23
     4     9    14    19    24
     5    10    15    20    25

Let's say we wanted to take the average of each 3 x 3 overlapping neighbourhood in the matrix above, and zero-padding those elements that go beyond the borders of the matrix. You would do this with blockproc:

B = blockproc(A, [1 1], @(x) mean(x.data(:)), 'BorderSize', [1 1], 'TrimBorder', false, 'PadPartialBlocks', true);

What's important to note is that the block size (1 x 1 in this case) and BorderSize (also 1 x 1) are set differently from what you'd expect for a 3 x 3 block. To see why, we need some further insight into how BorderSize works. For a given block, BorderSize lets you capture values / pixels beyond the dimensions of the originally sized block. Locations that fall beyond the borders of the matrix are padded with zeros by default. With BorderSize set to [M N], where M and N are the vertical and horizontal border sizes, each block grows by 2M rows and 2N columns: M more rows of pixels both above and below the original block, and N more columns to its left and right.

Therefore, for the value 1 in A, a block size of 1 x 1 means the block consists of only that element, and a BorderSize of 1 x 1 extends it by one pixel on every side. Our final block would be:

0  0  0
0  1  6
0  2  7
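Since stepping into fun in the debugger was the sticking point in the question, one way to see exactly what blockproc hands to fun is to have fun return the block data unchanged. This is only a sketch and not part of the original answer: it assumes blockproc tiles the 3 x 3 outputs side by side, so that a 5 x 5 input yields a 15 x 15 result. The variable name tiles is made up for illustration.

```matlab
A = reshape(1:25, 5, 5);

% Return each expanded block as-is instead of reducing it to a scalar.
% Assumption: blockproc concatenates the 3-by-3 outputs into a 15-by-15 matrix.
tiles = blockproc(A, [1 1], @(x) x.data, ...
    'BorderSize', [1 1], 'TrimBorder', false, 'PadPartialBlocks', true);

firstBlock = tiles(1:3, 1:3);   % window around A(1,1), the value 1
nextBlock  = tiles(1:3, 4:6);   % window around A(1,2), the value 6

disp(firstBlock);
disp(nextBlock);
```

If fun is written as a named function in its own file rather than as an anonymous function, a breakpoint can also be set inside it in the usual way.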

Because our block size is 1, the next block would be centred at 6, and we would get a 3 x 3 grid of pixels and so on. It is also important that TrimBorder is set to false so that we can keep those pixels that were originally captured upon expansion of the block. The default is set to true. Finally, PadPartialBlocks is true to ensure that all blocks are the same size. When you run the above code, the result we get is:

B =

    1.7778    4.3333    7.6667   11.0000    8.4444
    3.0000    7.0000   12.0000   17.0000   13.0000
    3.6667    8.0000   13.0000   18.0000   13.6667
    4.3333    9.0000   14.0000   19.0000   14.3333
    3.1111    6.3333    9.6667   13.0000    9.7778

You can verify that we get the same result using nlfilter where we can apply the mean to 3 x 3 sliding neighbourhoods:

C = nlfilter(A, [3 3], @(x) mean(x(:)))

C =

    1.7778    4.3333    7.6667   11.0000    8.4444
    3.0000    7.0000   12.0000   17.0000   13.0000
    3.6667    8.0000   13.0000   18.0000   13.6667
    4.3333    9.0000   14.0000   19.0000   14.3333
    3.1111    6.3333    9.6667   13.0000    9.7778
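As a further cross-check that is not part of the original answer: a zero-padded 3 x 3 mean is the same as convolving with a uniform 3 x 3 kernel, so conv2 with the 'same' option should reproduce the corner and centre values above. The name C3 is made up for this sketch.

```matlab
A = reshape(1:25, 5, 5);

% Uniform 3-by-3 averaging kernel; 'same' keeps the 5-by-5 size and
% treats everything outside A as zero.
C3 = conv2(A, ones(3) / 9, 'same');

disp(C3);
```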

As such, if you want to properly use blockproc for sliding operations, you need to be careful about how you set the block size and the border size. The general rule is to always set the block size to 1 x 1 and let BorderSize determine the size of each window. Specifically, for a window of size K x K, you would set BorderSize to floor(K/2) x floor(K/2). This is simplest when K is odd, because the resulting window, 2*floor(K/2) + 1 pixels on a side, is then exactly K x K; for even K it comes out as (K+1) x (K+1).

For example, if you wanted a 5 x 5 mean filtering operation on a sliding window basis, you would set BorderSize to [2 2], as K = 5 and floor(K/2) = 2. Therefore, you would do this:

B = blockproc(A, [1 1], @(x) mean(x.data(:)), 'BorderSize', [2 2], 'TrimBorder', false, 'PadPartialBlocks', true)

B =

    2.5200    4.5600    7.2000    6.9600    6.1200
    3.6000    6.4000   10.0000    9.6000    8.4000
    4.8000    8.4000   13.0000   12.4000   10.8000
    4.0800    7.0400   10.8000   10.2400    8.8800
    3.2400    5.5200    8.4000    7.9200    6.8400

Replicating this with nlfilter with a size of 5 x 5 also gives:

C = nlfilter(A, [5 5], @(x) mean(x(:)))

C =

    2.5200    4.5600    7.2000    6.9600    6.1200
    3.6000    6.4000   10.0000    9.6000    8.4000
    4.8000    8.4000   13.0000   12.4000   10.8000
    4.0800    7.0400   10.8000   10.2400    8.8800
    3.2400    5.5200    8.4000    7.9200    6.8400

I was doing some timing tests, and it seems that blockproc used in this context is faster than nlfilter.
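The timing tests themselves are not shown, so the following is only a sketch of how such a comparison could be run with timeit. The 64 x 64 random image and the variable names are assumptions chosen for illustration, and the outcome will depend on the MATLAB version, the image size and the machine.

```matlab
A = rand(64);   % arbitrary test image, kept small so the run stays short

meanBlock = @(x) mean(x.data(:));   % fun for blockproc (receives a block struct)
meanNbhd  = @(x) mean(x(:));        % fun for nlfilter (receives a plain matrix)

tBlockproc = timeit(@() blockproc(A, [1 1], meanBlock, ...
    'BorderSize', [1 1], 'TrimBorder', false, 'PadPartialBlocks', true));
tNlfilter  = timeit(@() nlfilter(A, [3 3], meanNbhd));

fprintf('blockproc: %.4f s, nlfilter: %.4f s\n', tBlockproc, tNlfilter);
```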


The second is im2col. im2col(A,[m n],block_type) will divide the block into m by n blocks and arrange them in columns, so each column is a block? If so, how is the overlapping controlled? And if each block is a column can I successfully apply the dct2 function on each column? Because I doubt it will take vectors as input?

You are correct that im2col transforms each pixel neighbourhood or block into a single column, and that the concatenation of these columns forms the output matrix. You control whether the blocks overlap or are distinct with the block_type parameter: specify 'distinct' or 'sliding' (the default). You also control the size of each neighbourhood with m and n.
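To make the column layout concrete, here is a small sketch on the same 5 x 5 matrix used earlier; the variable names are made up for illustration. Note that 'sliding' only visits positions where the whole window fits inside the matrix, so unlike the blockproc example there is no zero padding at the edges, whereas 'distinct' pads with zeros to fill the last blocks.

```matlab
A = reshape(1:25, 5, 5);

% 'sliding': one column per 3-by-3 window that fits entirely inside A.
% A 5-by-5 matrix has 3*3 = 9 such positions, each with 9 pixels.
colsSliding = im2col(A, [3 3], 'sliding');

% 'distinct': non-overlapping 3-by-3 tiles; A is zero-padded to 6-by-6,
% which gives 4 tiles.
colsDistinct = im2col(A, [3 3], 'distinct');

% Each column is one block stacked column-wise, so reshape recovers it.
firstWindow = reshape(colsSliding(:, 1), 3, 3);

disp(size(colsSliding));
disp(size(colsDistinct));
disp(firstWindow);
```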

However, if it is your goal to apply dct2 with the output of im2col, then you will not get what you desire. Specifically, dct2 takes the spatial location of each data point within your 2D data into account and is used as part of the transform. By transforming each pixel neighbourhood into a single column, the 2D spatial relationships that were originally there for each block are now gone. dct2 expects 2D spatial data, but you would be specifying 1D data instead. As such, im2col is probably not what you're looking for. If I understand what you want correctly, you'll want to use blockproc instead.
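Putting the blockproc recommendation together with dct2, a minimal sketch could look like the following. It is not taken from the original answer: the 8 x 8 magic square stands in for a real image, K = 3 is an arbitrary odd window size, and it assumes blockproc tiles the K x K DCT outputs so that tile (i, j) of the result belongs to the window centred on A(i, j).

```matlab
A = magic(8);        % stand-in image; any 2-D numeric matrix should do
K = 3;               % odd window size, chosen for illustration
h = floor(K / 2);    % border needed around a 1-by-1 block

% Apply dct2 to the K-by-K window around every pixel.
% Assumption: the result is (8*K)-by-(8*K), one K-by-K tile per pixel.
D = blockproc(A, [1 1], @(x) dct2(x.data), ...
    'BorderSize', [h h], 'TrimBorder', false, 'PadPartialBlocks', true);

% DCT coefficients of the window centred on A(2,2), i.e. A(1:3,1:3).
tile22 = D(K+1:2*K, K+1:2*K);

disp(size(D));
disp(tile22);
```

Because each pixel produces K*K coefficients, the output grows quickly with K; for larger windows it may be more practical to have fun return only the coefficients that are actually needed.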

Hope this helps!
