How to apply layer-wise learning rate in Pytorch?

Problem Description

I know that it is possible to freeze single layers in a network, for example to train only the last layers of a pre-trained model. What I'm looking for is a way to apply specific learning rates to different layers.

So, for example, a very low learning rate of 0.000001 for the first layer, then a gradually increasing learning rate for each of the following layers, so that the last layer ends up with a learning rate of around 0.01.

Is this possible in pytorch? Any idea how I can achieve this?

Recommended Answer

Here is the solution:

from torch.optim import Adam

model = Net()  # your model class, with layers exposed as named attributes

optim = Adam(
    [
        {"params": model.fc.parameters(), "lr": 1e-3},     # group-specific learning rate
        {"params": model.agroupoflayer.parameters()},      # no "lr" here: uses the global default
        {"params": model.lastlayer.parameters(), "lr": 4e-2},
    ],
    lr=5e-4,  # global default learning rate
)
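You can check which learning rate each group ended up with:

for group in optim.param_groups:
    print(group["lr"])  # 1e-3, 5e-4, 4e-2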

Parameters that are not passed to the optimizer will not be optimized, so you should list all the layers or groups you want to optimize. If you don't specify a learning rate for a group, it will use the global learning rate (5e-4). The trick is that when you create the model, you should give names to the layers, or group them.
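For the gradual increase described in the question, the same per-group mechanism works with one group per layer. Below is a minimal sketch; the nn.Sequential model and the 1e-6 to 1e-2 endpoints are assumptions chosen for illustration:

from torch import nn
from torch.optim import Adam

# hypothetical model: an ordered stack of layers
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 64),
    nn.ReLU(),
    nn.Linear(64, 10),
)

# keep only the children that actually have trainable parameters
layers = [m for m in model.children()
          if any(p.requires_grad for p in m.parameters())]

# one group per layer, with the learning rate growing geometrically
# from lr_first (first layer) to lr_last (last layer)
lr_first, lr_last = 1e-6, 1e-2
n = len(layers)
groups = [
    {"params": layer.parameters(),
     "lr": lr_first * (lr_last / lr_first) ** (i / max(n - 1, 1))}
    for i, layer in enumerate(layers)
]

optim = Adam(groups)

Here the first Linear layer trains at 1e-6, the middle one at 1e-4, and the last one at 1e-2, interpolated on a log scale between the two endpoints.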
