RMSprop, Adam, AdaDelta test accuracy does not improve using Caffe
Question
I am finetuning using Caffe on an image dataset on a Tesla K40. Using batch size=47, solver_type=SGD, base_lr=0.001, lr_policy="step", momentum=0.9, gamma=0.1, the training loss decreases and the test accuracy goes from 2% to 50% in 100 iterations, which is quite good.
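For reference, the SGD setup above corresponds roughly to a solver.prototxt like the sketch below. The net path, test_iter, test_interval, stepsize and max_iter are placeholders I have filled in, not values stated in the question; the batch size of 47 is set in the data layer of the net definition, not in the solver.

net: "train_val.prototxt"      # placeholder path to the finetuning net
test_iter: 100                 # placeholder
test_interval: 500             # placeholder
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
stepsize: 5000                 # placeholder; not stated in the question
momentum: 0.9
solver_type: SGD
max_iter: 10000                # placeholder
solver_mode: GPU               # Tesla K40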
When using other optimisers such as RMSPROP, ADAM and ADADELTA, the training loss remains almost the same, and there is no improvement in test accuracy even after 1000 iterations.
For RMSPROP, I have changed the respective parameters as mentioned here.
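As a rough sketch, switching the same solver to RMSProp in Caffe mainly means changing the solver type and adding the RMSProp-specific decay. The concrete numbers below are assumed typical values, not the exact settings from the link above.

solver_type: RMSPROP
rms_decay: 0.98        # RMSProp-specific moving-average decay (assumed value)
momentum: 0.0          # momentum is often disabled for RMSProp (assumption)
base_lr: 0.001         # kept the same as the SGD run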
For ADAM, I have changed the respective parameters as mentioned here.
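Similarly, an Adam variant of the solver would look roughly like this. In Caffe, momentum and momentum2 map to Adam's beta1 and beta2; the numbers shown are assumed typical values rather than the ones from the link.

solver_type: ADAM
momentum: 0.9          # beta1 (assumed typical value)
momentum2: 0.999       # beta2 (assumed typical value)
delta: 1e-8            # epsilon for numerical stability (assumed)
base_lr: 0.001         # same learning rate that worked for SGD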
For ADADELTA, I have changed the respective parameters as mentioned here.
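And an AdaDelta variant would look roughly like the following; again the numbers are assumed typical values, not the exact ones from the link. Note that AdaDelta computes its own per-parameter step size, so base_lr acts only as a global scale.

solver_type: ADADELTA
momentum: 0.95         # assumed typical value
delta: 1e-6            # assumed typical value
base_lr: 1.0           # often set to 1.0 for AdaDelta; an assumption here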
Can someone please tell me what I am doing wrong?
Answer
I saw similar results to pir: Adam would diverge when given the same base_lr that SGD used. When I reduced base_lr to 1/100 of its original value, Adam suddenly converged, and gave good results.
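Applied to the numbers in this question (base_lr=0.001 for SGD), that change would look like the following, purely as an illustration:

solver_type: ADAM
base_lr: 0.00001       # 0.001 / 100; the rest of the solver stays unchanged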