How does Pytorch's "Fold" and "Unfold" work?
Question
I've gone through the official doc. I'm having a hard time understanding what this function is used for and how it works. Can someone explain this in layman's terms?
I get an error for the example they provide, although the PyTorch version I'm using matches the documentation. Perhaps fixing the error, which I did, is supposed to teach me something? The snippet given in the documentation is:
fold = nn.Fold(output_size=(4, 5), kernel_size=(2, 2))
input = torch.randn(1, 3 * 2 * 2, 1)
output = fold(input)
output.size()
The fixed snippet is:
fold = nn.Fold(output_size=(4, 5), kernel_size=(2, 2))
input = torch.randn(1, 3 * 2 * 2, 3 * 2 * 2)
output = fold(input)
output.size()
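For reference, the shape mismatch in the original snippet is in the last dimension: fold expects the number of sliding blocks L there, which for output_size=(4, 5) and kernel_size=(2, 2) with default stride, padding, and dilation is (4 - 2 + 1) * (5 - 2 + 1) = 12, not 1. A minimal pure-Python sketch of that formula (num_blocks is a hypothetical helper name, following the shape rule in the nn.Fold documentation):

```python
import math

def num_blocks(output_size, kernel_size, stride=1, padding=0, dilation=1):
    """Number of sliding blocks L that nn.Fold expects in the input's last dim."""
    L = 1
    for size, k in zip(output_size, kernel_size):
        L *= math.floor((size + 2 * padding - dilation * (k - 1) - 1) / stride + 1)
    return L

# For output_size=(4, 5), kernel_size=(2, 2): (4-2+1) * (5-2+1) = 3 * 4 = 12
print(num_blocks((4, 5), (2, 2)))  # → 12
```

So any input of shape (1, 3 * 2 * 2, 12) would satisfy fold here; the fix above happens to use 3 * 2 * 2 = 12 for that dimension.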
Thanks!
Answer
unfold and fold are used to facilitate "sliding window" operations (like convolutions).
Suppose you want to apply a function foo to every 5x5 window in a feature map/image:
from torch.nn import functional as f
windows = f.unfold(x, kernel_size=5)
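To make the sliding-window idea concrete, here is a rough pure-Python sketch of what unfold computes for a single-channel 2D input (stride 1, no padding); unfold_2d is a hypothetical name, and the real torch implementation is vectorized and batched:

```python
def unfold_2d(x, k):
    """Extract every k x k patch of a 2D list `x`, flatten each patch,
    and return a matrix with k*k rows and num_windows columns."""
    h, w = len(x), len(x[0])
    patches = []
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            patches.append([x[i + di][j + dj] for di in range(k) for dj in range(k)])
    # transpose: rows = the k*k elements of a patch, columns = windows
    return [list(row) for row in zip(*patches)]

x = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
cols = unfold_2d(x, 2)  # 4 rows (2*2 patch elements), 4 columns (2x2 windows)
```

Each column of the result is one window, which is exactly the layout that lets you apply foo to all windows at once.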
Now windows has size batch-(5*5*x.size(1))-num_windows, and you can apply foo on windows:
processed = foo(windows)
Now you need to "fold" processed back to the original size of x:
out = f.fold(processed, x.shape[-2:], kernel_size=5)
You need to take care of padding and kernel_size, which may affect your ability to "fold" processed back to the size of x.
Moreover, fold sums over overlapping elements, so you might want to divide the output of fold by the patch size.
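A rough pure-Python sketch of the fold side (single channel, stride 1, no padding; fold_2d is a hypothetical name) makes the overlap summation explicit:

```python
def fold_2d(cols, out_h, out_w, k):
    """Scatter each flattened k*k column back to its window position in an
    out_h x out_w grid, summing wherever windows overlap."""
    out = [[0.0] * out_w for _ in range(out_h)]
    patches = list(zip(*cols))  # one flattened k*k patch per window
    idx = 0
    for i in range(out_h - k + 1):
        for j in range(out_w - k + 1):
            patch = patches[idx]
            idx += 1
            for di in range(k):
                for dj in range(k):
                    out[i + di][j + dj] += patch[di * k + dj]
    return out

# Folding an all-ones input: each output pixel ends up holding the number of
# windows that cover it, which is why dividing by that count undoes unfold.
ones = [[1.0] * 4 for _ in range(4)]  # k*k = 4 rows, 4 windows of a 3x3 output
print(fold_2d(ones, 3, 3, 2))  # → [[1.0, 2.0, 1.0], [2.0, 4.0, 2.0], [1.0, 2.0, 1.0]]
```

The torch docs describe the same trick: fold(unfold(x)) divided by fold(unfold(ones_like(x))) recovers x exactly, even when the per-pixel overlap count is not uniform.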