How to apply layer-wise learning rate in Pytorch?

Question

I know that it is possible to freeze single layers in a network, for example to train only the last layers of a pre-trained model. What I'm looking for is a way to apply certain learning rates to different layers.

For example, a very low learning rate of 0.000001 for the first layer, then a gradually increasing rate for each of the following layers, so that the last layer ends up with a learning rate of around 0.01.

Is this possible in PyTorch? Any idea how I can achieve this?

Answer

The solution is as follows:

from torch.optim import Adam

model = Net()  # your model; `fc`, `agroupoflayer`, `lastlayer` are that model's module names

optim = Adam(
    [
        {"params": model.fc.parameters(), "lr": 1e-3},         # per-group lr overrides the default
        {"params": model.agroupoflayer.parameters()},          # no "lr" key: uses the default below
        {"params": model.lastlayer.parameters(), "lr": 4e-2},  # another per-group lr
    ],
    lr=5e-4,  # default learning rate for groups that don't set their own "lr"
)

Parameters that you don't pass to the optimizer will not be optimized, so you have to list every layer or group (or at least the layers you want to train). If a group doesn't specify its own learning rate, it falls back to the global one (5e-4 here). The trick is to give the layers names when you build the model, or to group them, so you can reference them in the parameter groups.
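To get the gradually increasing schedule from the question without listing every layer by hand, you can build one parameter group per layer programmatically. Below is a minimal sketch, assuming a simple nn.Sequential model (the model and layer sizes are hypothetical); it spaces the learning rates geometrically from 1e-6 at the first layer to 1e-2 at the last:

import torch
from torch import nn
from torch.optim import Adam

# Hypothetical model; any nn.Module whose direct children are the "layers" works the same way.
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 64),
    nn.ReLU(),
    nn.Linear(64, 10),
)

# Keep only children that actually have trainable parameters (ReLU has none).
layers = [m for m in model.children() if any(p.requires_grad for p in m.parameters())]

# Geometrically spaced learning rates: 1e-6 for the first layer up to 1e-2 for the last.
lrs = torch.logspace(-6, -2, steps=len(layers))

optim = Adam(
    [{"params": layer.parameters(), "lr": lr.item()} for layer, lr in zip(layers, lrs)]
)

You can check what was assigned with [g["lr"] for g in optim.param_groups].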
