Why does my CNN always return the same result?

Problem description

I'm trying to build a CNN that classifies objects into 3 main classes: a Lamborghini, a cylinder head, and a piece of a plane. My data set consists of 6580 images, almost 2200 per class (it is shared on Google Drive). The architecture of my CNN is AlexNet, but I've modified the output of fully connected layer 8 from 1000 to 3. I used these settings for training:

test_iter:1000
test_interval:1000
base_lr:0.001
lr_policy:"step"
gamma:0.1
stepsize:2500
max_iter:40000
momentum:0.9
weight_decay:0.0005

But the problem is that when I deploy my model after training, the result is always the following: {'prob': array([[ 0.33333334, 0.33333334, 0.33333334]], dtype=float32)}, i.e. a uniform distribution over the three classes.

The code below is the script I use to load the model and output the vector of probabilities.

import numpy as np
import matplotlib.pyplot as plt
import sys
import caffe
import cv2

MODEL_FILE ='deploy_ex0.prototxt'
PRETRAINED='snapshot_ex0_1_model_iter_40000.caffemodel'

caffe.set_mode_cpu()
net = caffe.Net(MODEL_FILE, PRETRAINED, caffe.TEST)

#preprocessing 

transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})

# mean subtraction

mean_file = np.array([104,117,123]) 
transformer.set_mean('data', mean_file)

transformer.set_transpose('data', (2,0,1))
transformer.set_channel_swap('data', (2,1,0))
transformer.set_raw_scale('data', 255.0)

# set the input blob shape: one 3-channel 227x227 image
net.blobs['data'].reshape(1,3,227,227)

#load image in data layer 

im=cv2.imread('test.jpg', cv2.IMREAD_COLOR)
img =cv2.resize(im, (227,227))

net.blobs['data'].data[...] = transformer.preprocess('data', img)

#compute 

out=net.forward()

print out

I am wondering why I get a result like this. Could you help me debug my CNN?

Also, after training I got these results:

I0421 06:56:12.285953  2224 solver.cpp:317] Iteration 40000, loss = 5.06557e-05
I0421 06:56:12.286027  2224 solver.cpp:337] Iteration 40000, Testing net (#0)
I0421 06:58:32.159469  2224 solver.cpp:404]     Test net output #0: accuracy = 0.99898
I0421 06:58:32.159626  2224 solver.cpp:404]     Test net output #1: loss = 0.00183688 (* 1 = 0.00183688 loss)
I0421 06:58:32.159643  2224 solver.cpp:322] Optimization Done.
I0421 06:58:32.159654  2224 caffe.cpp:222] Optimization Done.

Thanks

EDIT (after the answer of 11 May)

I used a simple model with 1 conv, 1 ReLU, 1 pool and 2 fully connected layers. The code below is the architecture specification:

name:"CNN"
layer {
  name: "convnet"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: true
    crop_size: 227
    mean_value: 87.6231
    mean_value: 87.6757
    mean_value: 87.1677
    #mean_file: "/home/jaba/caffe/data/diota_model/mean.binaryproto"
  }
  data_param {
    source: "/home/jaba/caffe/data/diota_model/train_lmdb"
    batch_size: 32
    backend: LMDB
  }
}

layer {
  name: "convnet"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    mirror: true
    crop_size: 227
    mean_value: 87.6231
    mean_value: 87.6757
    mean_value: 87.1677
    #mean_file: "/home/jaba/caffe/data/diota_model/mean.binaryproto"
  }
  data_param {
    source: "/home/jaba/caffe/data/diota_model/val_lmdb"
    batch_size: 20
    backend: LMDB
  }
}

layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}

layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}

layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}

layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool1"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 300
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}

layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip1"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}

layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip1"
  bottom: "label"
  top: "loss"
}

I trained this CNN for 22 epochs and got 86% accuracy. For the solver parameters I used:

net: "/home/jaba/caffe/data/diota_model/simple_model/train_val.prototxt"
test_iter: 50
test_interval: 100
base_lr: 0.00001
momentum: 0.9
weight_decay: 0.0005
lr_policy: "inv"
gamma: 0.0001
power: 0.75
display: 100
max_iter: 3500
snapshot: 100
snapshot_prefix: "/home/jaba/caffe/data/diota_model/simple_model/snap_shot_model"
solver_mode: GPU

Now, when I deploy the model, it no longer returns the same vector of probabilities every time. But there is one issue: when I loaded the model and tested it on the validation_lmdb folder, I did not get the same accuracy value; I got almost 56%.

I used the script below to calculate the accuracy:

import os
import glob
#import cv2
import caffe
import lmdb
import numpy as np
from caffe.proto import caffe_pb2

MODEL_FILE ='deploy.prototxt'
PRETRAINED='snap_shot_model_iter_3500.caffemodel'

caffe.set_mode_cpu()
#load_model

net = caffe.Net(MODEL_FILE, PRETRAINED, caffe.TEST)

#load input and configure preprocessing



#mean_file = np.array([104,117,123])

transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
#transformer.set_mean('data', mean_file)
transformer.set_transpose('data', (2,0,1))
transformer.set_channel_swap('data', (2,1,0))
transformer.set_raw_scale('data', 255.0)


# set the input blob shape: one 3-channel 227x227 image

net.blobs['data'].reshape(1,3,227,227)

lmdb_env=lmdb.open('/home/jaba/caffe/data/diota_model/val1_lmdb')

lmdb_txn=lmdb_env.begin()

lmdb_cursor=lmdb_txn.cursor()

datum=caffe_pb2.Datum()


correct_predictions=0

for key,value in lmdb_cursor:

    datum.ParseFromString(value)

    label=datum.label
    data=caffe.io.datum_to_array(datum)

    image=np.transpose(data,(1,2,0))


    net.blobs['data'].data[...]=transformer.preprocess('data',image)

    out=net.forward()
    out_put=out['prob'].argmax()
    if label==out_put:
        correct_predictions=correct_predictions+1



print 'accuracy :'
print correct_predictions/1002.0

I changed the split of the data set: 1002 images for testing and 4998 for training. Would you give me some suggestions to solve the issue?

Thanks!

Answer

I think I see two distinct problems, different forms of over-fitting. With 85% of your 6580 images used for training, you have 5593 images in training and 987 in testing.

One

40000 iterations * (256 images/iteration) * (1 epoch / 5593 images) ≈ 1831 epochs. On the ILSVRC data set (1.28M images), AlexNet trains for only 40-50 epochs (depending on scale-out). Your model finished with a loss of effectively 0 and got only 1 photo wrong in the entire testing set.

Two

AlexNet's widths (filters per layer) are tuned for the 1000 classes and myriad features of the ILSVRC data set; you haven't scaled it down for your data. The fully connected layers broaden to 4096 outputs each, which is nearly one for every training image. Where ILSVRC trains AlexNet to recognize features such as a feline face or one side of a wheeled vehicle, your model will train to recognize a dark blue Lamborghini from an angle of 30 degrees off front, 8 degrees above horizontal, with grass in the background and a poplar tree behind it on the driver's side.

In other words, your trained AlexNet fits the training data set like a pour-on plastic shell. It's not going to fit anything except the initial data set.
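
To see where that capacity sits, a quick pycaffe sketch prints each layer's weight shapes from the trained model; it reuses the file names from the question's deploy script, and the layer names printed are whatever deploy_ex0.prototxt defines:

# Sketch: print each layer's weight-blob shape and parameter count to see
# where the network's capacity is concentrated. File names are taken from
# the question's deploy script.
import caffe

caffe.set_mode_cpu()
net = caffe.Net('deploy_ex0.prototxt',
                'snapshot_ex0_1_model_iter_40000.caffemodel',
                caffe.TEST)

for name, params in net.params.items():
    weights = params[0].data        # params[1] holds the biases
    print name, weights.shape, weights.size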

I'm mildly surprised that it doesn't do a little better on other autos, other cylinder heads, and plane pieces. However, I've seen enough over-fitted models that had effectively random output.

First, reduce the length of training. Second, try reducing the num_output size of each layer.
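
As a sketch of the second point, you could generate a much narrower training net with pycaffe's NetSpec; the num_output values below are illustrative guesses for a 3-class problem, not tuned settings, while the LMDB path and mean values are the ones from the question:

# Hedged sketch: a narrower training net written with pycaffe's NetSpec.
# Layer widths are illustrative guesses, not tuned settings.
from caffe import layers as L, params as P
import caffe

def small_net(lmdb, batch_size):
    n = caffe.NetSpec()
    n.data, n.label = L.Data(batch_size=batch_size, backend=P.Data.LMDB, source=lmdb,
                             transform_param=dict(mirror=True, crop_size=227,
                                                  mean_value=[87.6231, 87.6757, 87.1677]),
                             ntop=2)
    n.conv1 = L.Convolution(n.data, num_output=16, kernel_size=5, stride=2,
                            weight_filler=dict(type='xavier'))
    n.relu1 = L.ReLU(n.conv1, in_place=True)
    n.pool1 = L.Pooling(n.relu1, pool=P.Pooling.MAX, kernel_size=3, stride=2)
    n.ip1 = L.InnerProduct(n.pool1, num_output=64, weight_filler=dict(type='xavier'))
    n.relu2 = L.ReLU(n.ip1, in_place=True)
    n.ip2 = L.InnerProduct(n.relu2, num_output=3, weight_filler=dict(type='xavier'))
    n.loss = L.SoftmaxWithLoss(n.ip2, n.label)
    return n.to_proto()

with open('small_train.prototxt', 'w') as f:
    f.write(str(small_net('/home/jaba/caffe/data/diota_model/train_lmdb', 32)))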

EDIT 11 May, in response to OP's comment

Yes, you have to reduce the number of kernels/filters/outputs in each layer. The widest layers in particular have 4096 outputs, which means the network can allocate almost one filter per photo in your data set. That does not make for effective learning: instead of a handful of filters that learn general features of cylinder heads, you have thousands of filters, each learning one very specific feature of one particular cylinder-head photo.

AlexNet, GoogleNet, ResNet, VGG, et alia were all built and tuned for a problem of general discrimination of still images over a wide variety of objects. You can certainly use the general concepts, but they are not good topologies to use for a problem that is so much smaller and better defined.
