CNN架构:将“良好"分类为和“不良"图片 [英] CNN architecture: classifying "good" and "bad" images
问题描述
我正在研究实现CNN以便将图像分类为好"或坏"的可能性,但是我目前的体系结构没有运气.
I'm researching the possibility of implementing a CNN in order to classify images as "good" or "bad" but am having no luck with my current architecture.
表示不良"图像的特征:
Characteristics that denote a "bad" image:
- 曝光过度
- 过饱和
- 白平衡不正确
- 模糊
实现神经网络基于这些特征对图像进行分类是否可行,还是最好让传统算法简单地查看整个图像的亮度/对比度变化并以此方式分类?
Would it be feasible to implement a neural network to classify images based on these characteristics or is it best left to a traditional algorithm that simply looks at the variance in brightness/contrast throughout an image and classifies it that way?
我曾尝试使用VGGNet架构来训练CNN,但是无论时长或步数如何,我似乎总是有偏见且不可靠.
I have attempted training a CNN using the VGGNet architecture but I always seem to get a biased and unreliable model, regardless of the number of epochs or number of steps.
示例:
我当前模型的体系结构非常简单(因为我是整个机器学习领域的新手),但似乎可以与其他分类问题配合使用,并且我对其进行了少许修改以更好地解决此二进制分类问题:
My current model's architecture is very simple (as I am new to the whole machine learning world) but seemed to work fine with other classification problems, and I have modified it slightly to work better with this binary classification problem:
# CONV => RELU => POOL layer set
# define convolutional layers, use "ReLU" activation function
# and reduce the spatial size (width and height) with pool layers
model.add(Conv2D(32, (3, 3), padding="same", input_shape=input_shape)) # 32 3x3 filters (height, width, depth)
model.add(Activation("relu"))
model.add(BatchNormalization(axis=channel_dimension))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.25)) # helps prevent overfitting (25% of neurons disconnected randomly)
# (CONV => RELU) * 2 => POOL layer set (increasing number of layers as you go deeper into CNN)
model.add(Conv2D(64, (3, 3), padding="same", input_shape=input_shape)) # 64 3x3 filters
model.add(Activation("relu"))
model.add(BatchNormalization(axis=channel_dimension))
model.add(Conv2D(64, (3, 3), padding="same", input_shape=input_shape)) # 64 3x3 filters
model.add(Activation("relu"))
model.add(BatchNormalization(axis=channel_dimension))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.25)) # helps prevent overfitting (25% of neurons disconnected randomly)
# (CONV => RELU) * 3 => POOL layer set (input volume size becoming smaller and smaller)
model.add(Conv2D(128, (3, 3), padding="same", input_shape=input_shape)) # 128 3x3 filters
model.add(Activation("relu"))
model.add(BatchNormalization(axis=channel_dimension))
model.add(Conv2D(128, (3, 3), padding="same", input_shape=input_shape)) # 128 3x3 filters
model.add(Activation("relu"))
model.add(BatchNormalization(axis=channel_dimension))
model.add(Conv2D(128, (3, 3), padding="same", input_shape=input_shape)) # 128 3x3 filters
model.add(Activation("relu"))
model.add(BatchNormalization(axis=channel_dimension))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.25)) # helps prevent overfitting (25% of neurons disconnected randomly)
# only set of FC => RELU layers
model.add(Flatten())
model.add(Dense(512))
model.add(Activation("relu"))
model.add(BatchNormalization())
model.add(Dropout(0.5))
# sigmoid classifier (output layer)
model.add(Dense(classes))
model.add(Activation("sigmoid"))
该模型是否存在明显的遗漏或错误,或者我是否不能使用深度学习(使用我当前的GPU,GTX 970)解决此问题?
Is there any glaring omissions or mistakes with this model or can I simply not solve this problem using deep learning (with my current GPU, a GTX 970)?
感谢您的时间和经验,
乔什
这是我用于编译/训练模型的代码:
Here is my code for compiling/training the model:
# initialise the model and optimiser
print("[INFO] Training network...")
opt = SGD(lr=initial_lr, decay=initial_lr / epochs)
model.compile(loss="sparse_categorical_crossentropy", optimizer=opt, metrics=["accuracy"])
# set up checkpoints
model_name = "output/50_epochs_{epoch:02d}_{val_acc:.2f}.model"
checkpoint = ModelCheckpoint(model_name, monitor='val_acc', verbose=1,
save_best_only=True, mode='max')
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.2,
patience=5, min_lr=0.001)
tensorboard = TensorBoard(log_dir="logs/{}".format(time()))
callbacks_list = [checkpoint, reduce_lr, tensorboard]
# train the network
H = model.fit_generator(training_set, steps_per_epoch=500, epochs=50, validation_data=test_set, validation_steps=150, callbacks=callbacks_list)
推荐答案
我建议您进行迁移学习,而不要训练整个网络.使用在 ImageNet
I suggest you go for transfer learning instead of training the whole network. use the weights trained on a huge Dataset like ImageNet
您可以使用Keras轻松地做到这一点,只需要导入权重如 xception 的模型,然后将代表1000个类别的imagenet数据集的最后一层删除为2个节点密集层,因为您只有2个类别,为基础层设置 trainable = False
,为自定义添加的层设置 trainable = True
,例如具有node = 2的密集层.
you can easily do this using Keras you just need to import model with weights like xception and remove last layer which represents 1000 classes of imagenet dataset to 2 node dense layer cause you have only 2 classes and set trainable=False
for the base layer and trainable=True
for custom added layers like dense layer having node = 2.
您可以像往常一样训练模型.
and you can train the model as usual way.
演示代码-
from keras.applications import *
from keras.models import Model
base_model = Xception(input_shape=(img_width, img_height, 3), weights='imagenet', include_top=False
x = base_model.output
x = GlobalAveragePooling2D()(x)
predictions = Dense(2, activation='softmax')(x)
model = Model(base_model.input, predictions)
# freezing the base layer weights
for layer in base_model.layers:
layer.trainable = False
这篇关于CNN架构:将“良好"分类为和“不良"图片的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!