Caffe's test accuracy during validation phase being constant when training a network

Problem description

I wonder why my test accuracy stays at a constant value of 0.5. I am using the CaffeNet network with only one change: I set num_output: 2 in the final fully connected layer.

My training set contains 1000 positive and 1000 negative examples, and my validation set likewise has 1000 positive and 1000 negative examples. The dataset contains images of people (whole body, RGB color). I've defined a mean file and a scale value in the data layer. The network is trained to decide whether an image shows a person or not (a binary classifier).
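
For reference, a minimal sketch of what such a data layer could look like in the train_val prototxt; the paths, crop size, and scale value are illustrative assumptions, not taken from the actual setup:

layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TRAIN }
  transform_param {
    mirror: true
    crop_size: 227                      # CaffeNet's usual crop size
    mean_file: "data/mean.binaryproto"  # assumed path
    scale: 0.00390625                   # 1/255, an assumed value
  }
  data_param {
    source: "data/train_lmdb"           # assumed path
    batch_size: 256
    backend: LMDB
  }
}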

A snippet of my solver configuration is shown below:

test_iter: 80
test_interval: 10
base_lr: 0.01
lr_policy: "step"
gamma: 0.1
stepsize: 20
display: 10
max_iter: 80
momentum: 0.9
weight_decay: 0.0005
snapshot: 10000
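
Note that test_iter multiplied by the TEST-phase batch_size determines how many validation images each test pass covers: for test_iter: 80 to sweep the 2000 validation images exactly once, the TEST data layer needs batch_size: 25. A minimal sketch under that assumption (paths are illustrative):

layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TEST }
  transform_param {
    crop_size: 227
    mean_file: "data/mean.binaryproto"  # assumed path
  }
  data_param {
    source: "data/val_lmdb"  # assumed path
    batch_size: 25           # 80 iterations x 25 images = 2000 validation images
    backend: LMDB
  }
}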

The training results are as follows:

I0228 11:49:27.411556  3422 solver.cpp:274] Learning Rate Policy: step
I0228 11:49:27.590368  3422 solver.cpp:331] Iteration 0, Testing net (#0)
I0228 11:53:29.203058  3429 data_layer.cpp:73] Restarting data prefetching from start.
I0228 11:57:59.969632  3429 data_layer.cpp:73] Restarting data prefetching from start.
I0228 11:58:26.602972  3422 solver.cpp:398]     Test net output #0: accuracy = 0.5
I0228 11:58:26.602999  3422 solver.cpp:398]     Test net output #1: loss = 0.726503 (* 1 = 0.726503 loss)
I0228 12:00:03.892771  3422 solver.cpp:219] Iteration 0 (-6.49109e-41 iter/s, 636.481s/10 iters), loss = 0.961699
I0228 12:00:03.892915  3422 solver.cpp:238]     Train net output #0: loss = 0.961699 (* 1 = 0.961699 loss)
I0228 12:00:03.892925  3422 sgd_solver.cpp:105] Iteration 0, lr = 0.01
I0228 12:04:28.831887  3426 data_layer.cpp:73] Restarting data prefetching from start.
I0228 12:13:36.909935  3422 solver.cpp:331] Iteration 10, Testing net (#0)
I0228 12:17:36.894516  3429 data_layer.cpp:73] Restarting data prefetching from start.
I0228 12:22:00.724030  3429 data_layer.cpp:73] Restarting data prefetching from start.
I0228 12:22:27.375306  3422 solver.cpp:398]     Test net output #0: accuracy = 0.5
I0228 12:22:27.375334  3422 solver.cpp:398]     Test net output #1: loss = 0.698973 (* 1 = 0.698973 loss)
I0228 12:23:56.072116  3422 solver.cpp:219] Iteration 10 (0.00698237 iter/s, 1432.18s/10 iters), loss = 0.696559
I0228 12:23:56.072247  3422 solver.cpp:238]     Train net output #0: loss = 0.696558 (* 1 = 0.696558 loss)
I0228 12:23:56.072252  3422 sgd_solver.cpp:105] Iteration 10, lr = 0.01
I0228 12:25:23.664594  3426 data_layer.cpp:73] Restarting data prefetching from start.
I0228 12:37:08.202978  3422 solver.cpp:331] Iteration 20, Testing net (#0)
I0228 12:41:05.859966  3429 data_layer.cpp:73] Restarting data prefetching from start.
I0228 12:45:28.599306  3429 data_layer.cpp:73] Restarting data prefetching from start.
I0228 12:45:55.524168  3422 solver.cpp:398]     Test net output #0: accuracy = 0.5
I0228 12:45:55.524190  3422 solver.cpp:398]     Test net output #1: loss = 0.693187 (* 1 = 0.693187 loss)
I0228 12:45:55.553427  3426 data_layer.cpp:73] Restarting data prefetching from start.
I0228 12:47:24.159780  3422 solver.cpp:219] Iteration 20 (0.00710183 iter/s, 1408.09s/10 iters), loss = 0.690313
I0228 12:47:24.159914  3422 solver.cpp:238]     Train net output #0: loss = 0.690313 (* 1 = 0.690313 loss)
I0228 12:47:24.159920  3422 sgd_solver.cpp:105] Iteration 20, lr = 0.001
I0228 12:57:31.167225  3426 data_layer.cpp:73] Restarting data prefetching from start.
I0228 13:00:23.671567  3422 solver.cpp:331] Iteration 30, Testing net (#0)
I0228 13:04:14.114737  3429 data_layer.cpp:73] Restarting data prefetching from start.
I0228 13:08:30.406244  3429 data_layer.cpp:73] Restarting data prefetching from start.
I0228 13:08:56.273648  3422 solver.cpp:398]     Test net output #0: accuracy = 0.5
I0228 13:08:56.273674  3422 solver.cpp:398]     Test net output #1: loss = 0.696971 (* 1 = 0.696971 loss)
I0228 13:10:28.487870  3422 solver.cpp:219] Iteration 30 (0.00722373 iter/s, 1384.33s/10 iters), loss = 0.700565
I0228 13:10:28.488041  3422 solver.cpp:238]     Train net output #0: loss = 0.700565 (* 1 = 0.700565 loss)
I0228 13:10:28.488049  3422 sgd_solver.cpp:105] Iteration 30, lr = 0.001
I0228 13:17:38.463490  3426 data_layer.cpp:73] Restarting data prefetching from start.
I0228 13:23:29.700287  3422 solver.cpp:331] Iteration 40, Testing net (#0)
I0228 13:27:27.217670  3429 data_layer.cpp:73] Restarting data prefetching from start.
I0228 13:31:48.651156  3429 data_layer.cpp:73] Restarting data prefetching from start.
I0228 13:32:15.021637  3422 solver.cpp:398]     Test net output #0: accuracy = 0.5
I0228 13:32:15.021661  3422 solver.cpp:398]     Test net output #1: loss = 0.694784 (* 1 = 0.694784 loss)
I0228 13:33:43.542735  3422 solver.cpp:219] Iteration 40 (0.00716818 iter/s, 1395.05s/10 iters), loss = 0.700307
I0228 13:33:43.542875  3422 solver.cpp:238]     Train net output #0: loss = 0.700307 (* 1 = 0.700307 loss)
I0228 13:33:43.542897  3422 sgd_solver.cpp:105] Iteration 40, lr = 0.0001
I0228 13:36:37.602869  3426 data_layer.cpp:73] Restarting data prefetching from start.
I0228 13:46:57.980952  3422 solver.cpp:331] Iteration 50, Testing net (#0)
I0228 13:50:55.125911  3429 data_layer.cpp:73] Restarting data prefetching from start.
I0228 13:55:22.078013  3429 data_layer.cpp:73] Restarting data prefetching from start.
I0228 13:55:49.644492  3422 solver.cpp:398]     Test net output #0: accuracy = 0.5
I0228 13:55:49.644516  3422 solver.cpp:398]     Test net output #1: loss = 0.693804 (* 1 = 0.693804 loss)
I0228 13:57:19.439967  3422 solver.cpp:219] Iteration 50 (0.00706266 iter/s, 1415.9s/10 iters), loss = 0.685755
I0228 13:57:19.440101  3422 solver.cpp:238]     Train net output #0: loss = 0.685755 (* 1 = 0.685755 loss)
I0228 13:57:19.440107  3422 sgd_solver.cpp:105] Iteration 50, lr = 0.0001
I0228 13:57:19.843221  3426 data_layer.cpp:73] Restarting data prefetching from start.
I0228 14:09:13.012436  3426 data_layer.cpp:73] Restarting data prefetching from start.
I0228 14:10:40.182121  3422 solver.cpp:331] Iteration 60, Testing net (#0)
I0228 14:14:37.148968  3429 data_layer.cpp:73] Restarting data prefetching from start.
I0228 14:18:57.929569  3429 data_layer.cpp:73] Restarting data prefetching from start.
I0228 14:19:24.183915  3422 solver.cpp:398]     Test net output #0: accuracy = 0.5
I0228 14:19:24.183939  3422 solver.cpp:398]     Test net output #1: loss = 0.693612 (* 1 = 0.693612 loss)
I0228 14:20:51.017705  3422 solver.cpp:219] Iteration 60 (0.00708428 iter/s, 1411.58s/10 iters), loss = 0.693453
I0228 14:20:51.017838  3422 solver.cpp:238]     Train net output #0: loss = 0.693453 (* 1 = 0.693453 loss)
I0228 14:20:51.017845  3422 sgd_solver.cpp:105] Iteration 60, lr = 1e-05
I0228 14:29:34.635071  3426 data_layer.cpp:73] Restarting data prefetching from start.
I0228 14:34:02.693697  3422 solver.cpp:331] Iteration 70, Testing net (#0)
I0228 14:37:59.742414  3429 data_layer.cpp:73] Restarting data prefetching from start.

I also tried changing the value of test_iter to 40 (instead of the previous 80) after following this link and this one, in case that parameter was related, but it still didn't resolve the issue. I also tried reshuffling the data by regenerating the dataset with a modified create_imagenet.sh script, but the problem remains.

Every time I changed a value in the solver, I also changed the fully connected layer's name. Is this the correct approach?
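
For context, renaming a layer is the standard way to make Caffe reinitialize it when fine-tuning from pretrained weights: weights are copied into layers by name, so a layer whose name has no match in the .caffemodel starts from fresh random initialization. A sketch consistent with the fc8_new15 name that appears in the debug log further below; the lr_mult values are illustrative assumptions:

layer {
  name: "fc8_new15"       # renamed so the pretrained fc8 weights are not copied in
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8_new15"
  param { lr_mult: 10 }   # assumed: let the new layer learn faster than pretrained ones
  param { lr_mult: 20 }   # assumed: bias learning-rate multiplier
  inner_product_param {
    num_output: 2
  }
}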

The number of epochs here is ~10. Could that be the culprit? Does this kind of problem fall under over-fitting?
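
(For the epoch estimate: assuming the default CaffeNet training batch_size of 256, max_iter: 80 processes 80 × 256 = 20,480 images, i.e. 20,480 / 2,000 ≈ 10.2 passes over the 2,000 training images.)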

Any hints or suggestions are welcome.

I turned on the debug info in the solver and found that the per-layer diff values are infinitesimal. Can I deduce that it's not learning much, or at all? The log with the debug info is below:

I0228 19:58:37.235631  6771 net.cpp:593]     [Forward] Layer pool2, top blob pool2 data: 1.00214
I0228 19:58:37.810919  6771 net.cpp:593]     [Forward] Layer norm2, top blob norm2 data: 1.00212
I0228 19:58:42.022397  6771 net.cpp:593]     [Forward] Layer conv3, top blob conv3 data: 0.432846
I0228 19:58:42.022722  6771 net.cpp:605]     [Forward] Layer conv3, param blob 0 data: 0.00796926
I0228 19:58:42.022725  6771 net.cpp:605]     [Forward] Layer conv3, param blob 1 data: 0.000184241
I0228 19:58:42.041185  6771 net.cpp:593]     [Forward] Layer relu3, top blob conv3 data: 0.2017
I0228 19:58:45.277812  6771 net.cpp:593]     [Forward] Layer conv4, top blob conv4 data: 0.989365
I0228 19:58:45.278079  6771 net.cpp:605]     [Forward] Layer conv4, param blob 0 data: 0.00797053
I0228 19:58:45.278082  6771 net.cpp:605]     [Forward] Layer conv4, param blob 1 data: 0.99991
I0228 19:58:45.296561  6771 net.cpp:593]     [Forward] Layer relu4, top blob conv4 data: 0.989365
I0228 19:58:47.495208  6771 net.cpp:593]     [Forward] Layer conv5, top blob conv5 data: 1.52664
I0228 19:58:47.495394  6771 net.cpp:605]     [Forward] Layer conv5, param blob 0 data: 0.00804997
I0228 19:58:47.495399  6771 net.cpp:605]     [Forward] Layer conv5, param blob 1 data: 0.996736
I0228 19:58:47.507951  6771 net.cpp:593]     [Forward] Layer relu5, top blob conv5 data: 0.128866
I0228 19:58:47.562223  6771 net.cpp:593]     [Forward] Layer pool5, top blob pool5 data: 0.151769
I0228 19:58:48.269973  6771 net.cpp:593]     [Forward] Layer fc6, top blob fc6 data: 0.95253
I0228 19:58:48.280905  6771 net.cpp:605]     [Forward] Layer fc6, param blob 0 data: 0.00397552
I0228 19:58:48.280917  6771 net.cpp:605]     [Forward] Layer fc6, param blob 1 data: 0.999847
I0228 19:58:48.282137  6771 net.cpp:593]     [Forward] Layer relu6, top blob fc6 data: 0.935909
I0228 19:58:48.286769  6771 net.cpp:593]     [Forward] Layer drop6, top blob fc6 data: 0.938786
I0228 19:58:48.602710  6771 net.cpp:593]     [Forward] Layer fc7, top blob fc7 data: 3.76741
I0228 19:58:48.607655  6771 net.cpp:605]     [Forward] Layer fc7, param blob 0 data: 0.00411323
I0228 19:58:48.607664  6771 net.cpp:605]     [Forward] Layer fc7, param blob 1 data: 0.997461
I0228 19:58:48.608860  6771 net.cpp:593]     [Forward] Layer relu7, top blob fc7 data: 3.41694e-06
I0228 19:58:48.613621  6771 net.cpp:593]     [Forward] Layer drop7, top blob fc7 data: 3.15335e-06
I0228 19:58:48.615514  6771 net.cpp:593]     [Forward] Layer fc8_new15, top blob fc8_new15 data: 0.0446082
I0228 19:58:48.615520  6771 net.cpp:605]     [Forward] Layer fc8_new15, param blob 0 data: 0.0229027
I0228 19:58:48.615522  6771 net.cpp:605]     [Forward] Layer fc8_new15, param blob 1 data: 0.0444381
I0228 19:58:48.615579  6771 net.cpp:593]     [Forward] Layer loss, top blob loss data: 0.693174
I0228 19:58:48.615586  6771 net.cpp:621]     [Backward] Layer loss, bottom blob fc8_new15 diff: 0.00195124
I0228 19:58:48.617902  6771 net.cpp:621]     [Backward] Layer fc8_new15, bottom blob fc7 diff: 8.65365e-05
I0228 19:58:48.617914  6771 net.cpp:632]     [Backward] Layer fc8_new15, param blob 0 diff: 8.20022e-07
I0228 19:58:48.617916  6771 net.cpp:632]     [Backward] Layer fc8_new15, param blob 1 diff: 0.0105705
I0228 19:58:48.619067  6771 net.cpp:621]     [Backward] Layer drop7, bottom blob fc7 diff: 8.65526e-05
I0228 19:58:48.620265  6771 net.cpp:621]     [Backward] Layer relu7, bottom blob fc7 diff: 1.21017e-09
I0228 19:58:49.261282  6771 net.cpp:621]     [Backward] Layer fc7, bottom blob fc6 diff: 2.00745e-08
I0228 19:58:49.266103  6771 net.cpp:632]     [Backward] Layer fc7, param blob 0 diff: 1.43563e-07
I0228 19:58:49.266114  6771 net.cpp:632]     [Backward] Layer fc7, param blob 1 diff: 9.29627e-08
I0228 19:58:49.267330  6771 net.cpp:621]     [Backward] Layer drop6, bottom blob fc6 diff: 1.99176e-08
I0228 19:58:49.268508  6771 net.cpp:621]     [Backward] Layer relu6, bottom blob fc6 diff: 1.85305e-08
I0228 19:58:50.779518  6771 net.cpp:621]     [Backward] Layer fc6, bottom blob pool5 diff: 8.8138e-09
I0228 19:58:50.790220  6771 net.cpp:632]     [Backward] Layer fc6, param blob 0 diff: 3.01911e-07
I0228 19:58:50.790235  6771 net.cpp:632]     [Backward] Layer fc6, param blob 1 diff: 1.99256e-06
I0228 19:58:50.813318  6771 net.cpp:621]     [Backward] Layer pool5, bottom blob conv5 diff: 1.84585e-09
I0228 19:58:50.826406  6771 net.cpp:621]     [Backward] Layer relu5, bottom blob conv5 diff: 3.86034e-10
I0228 19:58:55.093768  6771 net.cpp:621]     [Backward] Layer conv5, bottom blob conv4 diff: 5.76684e-10
I0228 19:58:55.093967  6771 net.cpp:632]     [Backward] Layer conv5, param blob 0 diff: 1.47824e-06
I0228 19:58:55.093973  6771 net.cpp:632]     [Backward] Layer conv5, param blob 1 diff: 1.92951e-06
I0228 19:58:55.114212  6771 net.cpp:621]     [Backward] Layer relu4, bottom blob conv4 diff: 5.76684e-10
I0228 19:59:01.392058  6771 net.cpp:621]     [Backward] Layer conv4, bottom blob conv3 diff: 2.31243e-10
I0228 19:59:01.392359  6771 net.cpp:632]     [Backward] Layer conv4, param blob 0 diff: 1.76617e-07
I0228 19:59:01.392364  6771 net.cpp:632]     [Backward] Layer conv4, param blob 1 diff: 8.78101e-07
I0228 19:59:01.412240  6771 net.cpp:621]     [Backward] Layer relu3, bottom blob conv3 diff: 8.56331e-11
I0228 19:59:09.734658  6771 net.cpp:621]     [Backward] Layer conv3, bottom blob norm2 diff: 7.87699e-11
I0228 19:59:09.735258  6771 net.cpp:632]     [Backward] Layer conv3, param blob 0 diff: 1.33159e-07
I0228 19:59:09.735270  6771 net.cpp:632]     [Backward] Layer conv3, param blob 1 diff: 1.47704e-07
I0228 19:59:10.390552  6771 net.cpp:621]     [Backward] Layer norm2, bottom blob pool2 diff: 7.87615e-11
I0228 19:59:10.452433  6771 net.cpp:621]     [Backward] Layer pool2, bottom blob conv2 diff: 1.50474e-11
I0228 19:59:10.516407  6771 net.cpp:621]     [Backward] Layer relu2, bottom blob conv2 diff: 1.50474e-11
I0228 19:59:20.241587  6771 net.cpp:621]     [Backward] Layer conv2, bottom blob norm1 diff: 2.07819e-11
I0228 19:59:20.241801  6771 net.cpp:632]     [Backward] Layer conv2, param blob 0 diff: 3.61894e-09
I0228 19:59:20.241807  6771 net.cpp:632]     [Backward] Layer conv2, param blob 1 diff: 1.05108e-07
I0228 19:59:35.405725  6771 net.cpp:621]     [Backward] Layer norm1, bottom blob pool1 diff: 2.07819e-11
I0228 19:59:35.494249  6771 net.cpp:621]     [Backward] Layer pool1, bottom blob conv1 diff: 4.26e-12
I0228 19:59:35.585350  6771 net.cpp:621]     [Backward] Layer relu1, bottom blob conv1 diff: 3.25633e-12
I0228 19:59:38.335880  6771 net.cpp:632]     [Backward] Layer conv1, param blob 0 diff: 9.37551e-09
I0228 19:59:38.335896  6771 net.cpp:632]     [Backward] Layer conv1, param blob 1 diff: 5.86281e-08
E0228 19:59:38.411557  6771 net.cpp:721]     [Backward] All net params (data, diff): L1 norm = (246967, 14.733); L2 norm = (103.38, 0.0470958)
I0228 19:59:38.411592  6771 solver.cpp:219] Iteration 70 (0.00886075 iter/s, 1128.57s/10 iters), loss = 0.693174
I0228 19:59:38.411600  6771 solver.cpp:238]     Train net output #0: loss = 0.693174 (* 1 = 0.693174 loss)
I0228 19:59:38.411605  6771 sgd_solver.cpp:105] Iteration 70, lr = 1e-05
I0228 20:05:17.468423  6775 data_layer.cpp:73] Restarting data prefetching from start.
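
One detail worth noting: the training and test losses all hover around 0.693 ≈ ln 2, which is exactly the softmax cross-entropy loss of a two-class classifier that always predicts a uniform 50/50 distribution. Combined with diffs on the order of 1e-10 and smaller, this suggests the network is stuck at chance level and is learning essentially nothing.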

Answer

data_layer.cpp:73] Restarting data prefetching from start.

The above message occurs when the .txt file given as input to the data layer reaches the end of file.

This message can occur frequently when:

  1. you supplied the wrong .txt file to the data layer,
  2. the .txt file is not in the format Caffe expects (see the example below), or
  3. the file contains very little data.
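
For point 2, the listing format expected by Caffe's convert_imageset tool (and by the ImageData layer) is one image path and one integer class label per line, separated by whitespace. A hypothetical example for this person/non-person setup:

images/person_0001.jpg 1
images/person_0002.jpg 1
images/bg_0001.jpg 0
images/bg_0002.jpg 0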
