PyTorch - shape of nn.Linear weights
Question
Yesterday I came across this question and for the first time noticed that the weights of the linear layer nn.Linear need to be transposed before applying matmul.
Code for applying the weights:
output = input.matmul(weight.t())
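As a quick sanity check, here is a minimal sketch (the layer sizes, in_features=3 and out_features=5, are arbitrary choices for illustration) showing that nn.Linear stores its weight as (out_features, in_features), and that the layer's forward pass matches the expression above plus the bias:

```python
import torch
import torch.nn as nn

layer = nn.Linear(3, 5)

# The weight is stored as (out_features, in_features), not transposed
print(layer.weight.shape)  # torch.Size([5, 3])

x = torch.randn(2, 3)

# Reproduce the forward pass manually: x @ weight.t() + bias
manual = x.matmul(layer.weight.t()) + layer.bias
assert torch.allclose(layer(x), manual)
```

Because the weight is (out_features, in_features), multiplying a batch of shape (batch, in_features) requires the transpose to make the inner dimensions line up.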
What is the reason for this? Why are the weights not stored in the transposed shape from the beginning, so they don't need to be transposed every time the layer is applied?
Answer
I found an answer here: Efficient forward pass in nn.Linear #2159
It seems there is no real reasoning behind this. However, the transpose operation does not appear to slow down the computation.
According to the issue mentioned above, the transpose operation is (almost) free in terms of computation during the forward pass, while leaving it out during the backward pass would actually make computation less efficient with the current implementation.
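The "almost free" claim makes sense because in PyTorch, .t() returns a view of the same underlying storage with the strides swapped; no data is copied. A small sketch demonstrating this:

```python
import torch

w = torch.randn(5, 3)
wt = w.t()

# The transpose shares storage with the original tensor: no copy was made
assert wt.data_ptr() == w.data_ptr()

# Only the stride metadata differs; a contiguous (5, 3) tensor has
# strides (3, 1), and its transpose simply swaps them to (1, 3)
print(w.stride())   # (3, 1)
print(wt.stride())  # (1, 3)
```

The actual cost is absorbed by the matmul kernel, which handles the transposed memory layout directly.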
The last post in that issue sums it up quite nicely:
It's historical weight layout, changing it is backward-incompatible. Unless there is some BIG benefit in terms of speed or convenience, we won't break userland.
https://github.com/pytorch/pytorch/issues/2159#issuecomment-390068272