What are the causes and possible solutions to always same binary class prediction in Convolutional Neural Network using Keras?
Problem description
I'm trying to solve a signature recognition problem. I'm using the GPDS database, and I merged all combinations of genuine and forgery signatures, which resulted in 4 million inputs of 200x200 pixel images.
I created a basic CNN using Keras and, due to the limitations of my hardware, I'm using just around 5000 inputs and a maximum of 10 epochs for training. My problem is that when I start training the model (the model.fit command), the accuracy varies around 50%, which is the balance of my dataset, and when the epoch finishes the accuracy is exactly 50%. When I try to predict some results after training, the predictions are all the same (for example, all 1s, which means genuine signature).
Not sure if it is a problem of:
- local minima
- a small dataset relative to the complexity of the problem
- bad initialization values for the weights, learning rate, momentum...
- insufficient training
- a network that is too simple for the problem
I'm new to working with Neural Networks, so maybe it is just a basic problem. Anyway, could anyone help me?
The code is as follows:
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

model = Sequential()
model.add(Conv2D(100, (5, 5), input_shape=(1, 200, 200), activation='relu', data_format='channels_first'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x=x, y=y, batch_size=100, shuffle=True, epochs=10)
Recommended answer
You are using a relu activation (max(0, x)) before the sigmoid; my guess is that (depending on how your layers were initialized) you are saturating the sigmoid.
Saturating a sigmoid results in null gradients and thus no learning.
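A quick numerical sketch (not part of the original answer) illustrates why saturation kills learning: the sigmoid's gradient is s(z)·(1 − s(z)), which collapses toward zero as soon as the pre-activation z moves far from zero.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    # Derivative of the sigmoid: s(z) * (1 - s(z))
    s = sigmoid(z)
    return s * (1.0 - s)

# Near z = 0 the gradient is at its maximum (0.25);
# for large z the output pins to 1 and the gradient vanishes.
for z in [0.0, 2.0, 10.0, 50.0]:
    print(f"z={z:5.1f}  sigmoid={sigmoid(z):.6f}  grad={sigmoid_grad(z):.2e}")
```

With raw 0-255 pixel values feeding 100 relu filters and a 128-unit dense layer, pre-activations at the final sigmoid can easily land in this saturated range, which is why rescaling the inputs (e.g. dividing by 255) or shrinking the initial weight scale is a common remedy for constant all-0 or all-1 predictions.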