PyTorch - shape of nn.Linear weights


Question

Yesterday I came across this question and for the first time noticed that the weights of the linear layer nn.Linear need to be transposed before applying matmul.

Code for applying the weights:

output = input.matmul(weight.t())
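As a minimal sketch (layer sizes chosen arbitrarily for illustration), the weight of nn.Linear is stored with shape (out_features, in_features), which is why the transpose is needed to multiply an input of shape (batch, in_features):

import torch
import torch.nn as nn

layer = nn.Linear(in_features=4, out_features=3)
print(layer.weight.shape)  # torch.Size([3, 4]), i.e. (out_features, in_features)

x = torch.randn(2, 4)  # (batch, in_features)
manual = x.matmul(layer.weight.t()) + layer.bias  # transpose needed for matmul
print(torch.allclose(manual, layer(x)))  # True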


What is the reason for this? Why are the weights not in the transposed shape just from the beginning, so they don't need to be transposed every time before applying the layer?

Answer

I found an answer here: Efficient forward pass in nn.Linear #2159

It seems like there is no real reasoning behind this. However, the transpose operation doesn't seem to slow down the computation.

According to the issue mentioned above, the transpose operation is (almost) free in terms of computation during the forward pass, while leaving it out would actually make the backward pass less efficient with the current implementation.
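This matches how transposes work in PyTorch: Tensor.t() returns a view with swapped strides rather than copying data. A small sketch illustrating this (tensor size chosen arbitrarily):

import torch

w = torch.randn(3, 4)
wt = w.t()

# .t() is a view: same underlying storage, only the strides differ,
# which is why the transpose costs (almost) nothing in the forward pass.
print(wt.data_ptr() == w.data_ptr())  # True
print(w.stride(), wt.stride())  # (4, 1) (1, 4)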

The last post in that issue sums it up quite nicely:

It's historical weight layout, changing it is backward-incompatible. Unless there is some BIG benefit in terms of speed or convenience, we won't break userland.

https://github.com/pytorch/pytorch/issues/2159#issuecomment-390068272

