Overlapping sliding window over an image using blockproc or im2col?
Question
I have to apply dct2 to small windows of my image, preferably by using an overlapping window.
I have found that two functions in MATLAB can make this possible: blockproc and im2col. I am having trouble understanding either and would appreciate some clarification.
blockproc can be used to implement my function on a sliding window using the BorderSize and TrimBorder arguments.
B = blockproc(A,[64,64],fun,'BorderSize',[5,5], 'TrimBorder', 'false');
I realize that this creates a block of [64 + 2*5, 64 + 2*5] and applies the function @fun on each block. But since I cannot step into my function @fun in the debugger to verify proper operation, I cannot be sure this is what I need.
Is my above code correct for what I need? I know that I get a concatenated result in B, but it should be on an overlapping sliding block. Will this achieve what I need?
The second is im2col. im2col(A,[m n],block_type) will divide the block into m by n blocks and arrange them in columns, so each column is a block? If so, how is the overlapping controlled? And if each block is a column, can I successfully apply the dct2 function on each column? Because I doubt it will take vectors as input?
Some clarification would be greatly appreciated.
Recommended answer
OK, this is quite a complex question. I'll break it up into separate parts and answer each one separately.
blockproc can be used to implement my function on a sliding window using the BorderSize and TrimBorder arguments.
B = blockproc(A,[64,64],fun,'BorderSize',[5,5], 'TrimBorder', 'false');
I realize that this creates a block of [64 + 2*5, 64 + 2*5] and applies the function @fun on each block. But since I cannot step into my function @fun in the debugger to verify proper operation, I cannot be sure this is what I need. Is my above code correct for what I need? I know that I get a concatenated result in B, but it should be on an overlapping sliding block. Will this achieve what I need?
After experimenting with blockproc, this is indeed correct: you can use it to get sliding neighbourhood processing working. However, you're going to need an additional flag, PadPartialBlocks. The purpose of this flag is that if you are extracting a block at the outer edges of the image and a block of the specified size cannot be formed, the partial block is zero-padded so that it conforms to the same size. Here's a small example to get this working with sliding windows. Suppose we have the following matrix:
>> A = reshape(1:25,5,5)
A =
1 6 11 16 21
2 7 12 17 22
3 8 13 18 23
4 9 14 19 24
5 10 15 20 25
Let's say we want to take the average of each 3 x 3 overlapping neighbourhood in the matrix above, zero-padding those elements that go beyond the borders of the matrix. You would do this with blockproc:
B = blockproc(A, [1 1], @(x) mean(x.data(:)), 'BorderSize', [1 1], 'TrimBorder', false, 'PadPartialBlocks', true);
What's important to note is that the block size, which is 1 x 1 in this case, and BorderSize, which is 1 x 1 as well, are set differently from what you'd expect for a 3 x 3 block. To see why, we need some further insight into how BorderSize works. For a given block, BorderSize lets you capture values / pixels beyond the dimensions of the originally sized block. Locations that fall beyond the borders of the matrix are padded with zeros by default. With BorderSize set to [M N], each block grows by M rows both above and below and by N columns both to the left and to the right, so a block of R x C pixels becomes (R + 2M) x (C + 2N).
Therefore, for the value 1 in A, a block size of 1 x 1 means the block consists of only that element, and a BorderSize of 1 x 1 extends it by one pixel on every side. Our final block would be:
0 0 0
0 1 6
0 2 7
Because our block size is 1, the next block would be centred at 6, and we would again get a 3 x 3 grid of pixels, and so on. It is also important that TrimBorder is set to false so that we keep the pixels that were captured when the block was expanded. The default is true. Finally, PadPartialBlocks is set to true to ensure that all blocks are the same size. When you run the above code, the result is:
B =
1.7778 4.3333 7.6667 11.0000 8.4444
3.0000 7.0000 12.0000 17.0000 13.0000
3.6667 8.0000 13.0000 18.0000 13.6667
4.3333 9.0000 14.0000 19.0000 14.3333
3.1111 6.3333 9.6667 13.0000 9.7778
You can verify that we get the same result using nlfilter, where we apply the mean to 3 x 3 sliding neighbourhoods:
C = nlfilter(A, [3 3], @(x) mean(x(:)))
C =
1.7778 4.3333 7.6667 11.0000 8.4444
3.0000 7.0000 12.0000 17.0000 13.0000
3.6667 8.0000 13.0000 18.0000 13.6667
4.3333 9.0000 14.0000 19.0000 14.3333
3.1111 6.3333 9.6667 13.0000 9.7778
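Since the question mentions not being able to step into fun, one way to confirm what each call receives is to have the function report the size of its input. The following is a minimal sketch, assuming the Image Processing Toolbox is available; the variable name seen and the non-square border are illustrative choices, picked only so that the row and column directions can be told apart.

```matlab
A = reshape(1:25, 5, 5);

% Each 1 x 1 block is extended by 1 row above/below and 2 columns left/right,
% so every call to the function should receive a 3 x 5 matrix (15 elements).
seen = blockproc(A, [1 1], @(blk) numel(blk.data), ...
                 'BorderSize', [1 2], 'TrimBorder', false);

% A single distinct value means every block had the same size.
disp(unique(seen(:))');
```

The same trick works for the 64 x 64 case in the question: replacing fun with a function that returns numel(blk.data) shows how large each block really is before trusting the real computation.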
As such, if you want to use blockproc properly for sliding operations, you need to be careful about how you set the block size and the border size. The general rule is to always set the block size to 1 x 1 and let BorderSize determine the size of each window. Specifically, for a window of size K x K, set BorderSize to floor(K/2) x floor(K/2). This works cleanly when K is odd, because the resulting window is 2*floor(K/2) + 1 pixels wide, which equals K only for odd K.
For example, if you wanted a 5 x 5 mean filtering operation on a sliding window basis, you would set BorderSize to [2 2], as K = 5 and floor(K/2) = 2. Therefore, you would do this:
B = blockproc(A, [1 1], @(x) mean(x.data(:)), 'BorderSize', [2 2], 'TrimBorder', false, 'PadPartialBlocks', true)
B =
2.5200 4.5600 7.2000 6.9600 6.1200
3.6000 6.4000 10.0000 9.6000 8.4000
4.8000 8.4000 13.0000 12.4000 10.8000
4.0800 7.0400 10.8000 10.2400 8.8800
3.2400 5.5200 8.4000 7.9200 6.8400
Replicating this with nlfilter with a size of 5 x 5 also gives:
C = nlfilter(A, [5 5], @(x) mean(x(:)))
C =
2.5200 4.5600 7.2000 6.9600 6.1200
3.6000 6.4000 10.0000 9.6000 8.4000
4.8000 8.4000 13.0000 12.4000 10.8000
4.0800 7.0400 10.8000 10.2400 8.8800
3.2400 5.5200 8.4000 7.9200 6.8400
In some informal timing tests, blockproc used in this way appeared to be faster than nlfilter.
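The 1 x 1 block rule can be wrapped up so that the window size is the only thing to choose. This is a sketch under stated assumptions rather than a definitive implementation: slideBlocks is a hypothetical helper name (not a toolbox function), K is assumed to be odd, and the Image Processing Toolbox is assumed to be installed.

```matlab
A = reshape(1:25, 5, 5);
K = 5;                      % window size; assumed odd
half = floor(K / 2);        % border needed on each side of the centre pixel

% Hypothetical wrapper: applies fun to every K x K window of img,
% passing the raw window matrix instead of the block struct.
slideBlocks = @(img, fun) blockproc(img, [1 1], @(blk) fun(blk.data), ...
    'BorderSize', [half half], 'TrimBorder', false, 'PadPartialBlocks', true);

B = slideBlocks(A, @(w) mean(w(:)));
C = nlfilter(A, [K K], @(w) mean(w(:)));

% The two approaches should agree up to floating-point round-off.
fprintf('max difference: %g\n', max(abs(B(:) - C(:))));
```

Passing blk.data through to fun keeps the user-supplied function independent of blockproc's block struct, so the same function can be handed to nlfilter for a cross-check.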
The second is im2col. im2col(A,[m n],block_type) will divide the block into m by n blocks and arrange them in columns, so each column is a block? If so, how is the overlapping controlled? And if each block is a column, can I successfully apply the dct2 function on each column? Because I doubt it will take vectors as input?
You are correct in that im2col transforms each pixel neighbourhood or block into a single column, and the concatenation of these columns forms the output matrix. You can control whether the blocks overlap or are distinct with the block_type parameter: specify 'distinct' or 'sliding' (the default). You can also control the size of each neighbourhood with m and n.
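To make the difference between the two block types concrete, here is a small sketch on the same 5 x 5 matrix used earlier, assuming the Image Processing Toolbox is available. The variable names are illustrative. Note that in 'sliding' mode im2col does not pad, so only windows that fit entirely inside the matrix produce a column.

```matlab
A = reshape(1:25, 5, 5);

% One column per 3 x 3 window that fits fully inside A (no padding).
slidingCols = im2col(A, [3 3], 'sliding');

% Non-overlapping 3 x 3 tiles; A is zero-padded so the tiles cover it.
distinctCols = im2col(A, [3 3], 'distinct');

disp(size(slidingCols));
disp(size(distinctCols));

% Column-wise operations still work: the mean of each column is the mean
% of the corresponding window, here reshaped back onto the 3 x 3 interior.
interiorMeans = reshape(mean(slidingCols, 1), 3, 3);
disp(interiorMeans);
```

The interior means line up with rows and columns 2 to 4 of the blockproc and nlfilter results above, since those are the positions where a 3 x 3 window needs no padding.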
However, if your goal is to apply dct2 to the output of im2col, then you will not get what you want. Specifically, dct2 takes the spatial location of each data point within your 2D data into account and uses it as part of the transform. By transforming each pixel neighbourhood into a single column, the 2D spatial relationships that originally existed within each block are lost. dct2 expects 2D spatial data, but you would be supplying 1D data instead. As such, im2col is probably not what you're looking for. If I understand what you want correctly, you'll want to use blockproc instead.
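As a rough illustration of that last point, the sketch below runs dct2 on every overlapping window through blockproc, so each call sees a genuine 2D matrix. Several things here are assumptions made for the example: magic(8) stands in for a real image, K = 3 is an arbitrary odd window size, and each window is reduced to a single number (its largest coefficient magnitude) purely so the output stays the same size as the input.

```matlab
I = magic(8);               % stand-in for a small grayscale image (assumption)
K = 3;                      % window size; assumed odd
half = floor(K / 2);

% blk.data is a K x K matrix, so dct2 sees the 2D layout of each window.
% Reducing to the largest coefficient magnitude is for illustration only.
peakCoeff = blockproc(I, [1 1], @(blk) max(max(abs(dct2(blk.data)))), ...
    'BorderSize', [half half], 'TrimBorder', false, 'PadPartialBlocks', true);

disp(size(peakCoeff));      % one value per pixel of I
```

Whatever is actually needed from the coefficients (a feature, a thresholded reconstruction, and so on) would replace the max(max(abs(...))) reduction.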
Hope this helps!