Fully Convolutional Network, Training Error

Problem Description

I apologize that I'm not good at English.

I'm trying to build my own Fully Convolutional Network (FCN) using TensorFlow. The model trains properly on MNIST data, but I'm having difficulty training it on my own image data.

Here is my FCN model code (not using a pre-trained or pre-built model):

import tensorflow as tf
import numpy as np

Loading MNIST Data

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

images_flatten = tf.placeholder(tf.float32, shape=[None, 784])

images = tf.reshape(images_flatten, [-1,28,28,1]) # conv2d expects a 4-D tensor: [batch, height, width, channels]
labels = tf.placeholder(tf.float32, shape=[None, 10])
keep_prob = tf.placeholder(tf.float32) # Dropout Ratio

Convolutional Layers

# Conv. Layer #1
W1 = tf.Variable(tf.truncated_normal([3, 3, 1, 4], stddev = 0.1))
b1 = tf.Variable(tf.truncated_normal([4], stddev = 0.1))    
FMA = tf.nn.conv2d(images, W1, strides=[1,1,1,1], padding='SAME')
# FMA stands for Fused Multiply Add, which means convolution
RELU = tf.nn.relu(tf.add(FMA, b1))
POOL = tf.nn.max_pool(RELU, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME')

# Conv. Layer #2
W2 = tf.Variable(tf.truncated_normal([3, 3, 4, 8], stddev = 0.1))
b2 = tf.Variable(tf.truncated_normal([8], stddev = 0.1))    
FMA = tf.nn.conv2d(POOL, W2, strides=[1,1,1,1], padding='SAME')
RELU = tf.nn.relu(tf.add(FMA, b2))
POOL = tf.nn.max_pool(RELU, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME')

# Conv. Layer #3
W3 = tf.Variable(tf.truncated_normal([7, 7, 8, 16], stddev = 0.1))
b3 = tf.Variable(tf.truncated_normal([16], stddev = 0.1))   
FMA = tf.nn.conv2d(POOL, W3, strides=[1,1,1,1], padding='VALID')
RELU = tf.nn.relu(tf.add(FMA, b3))

# Dropout
Dropout = tf.nn.dropout(RELU, keep_prob)

# Conv. Layer #4
W4 = tf.Variable(tf.truncated_normal([1, 1, 16, 10], stddev = 0.1))
b4 = tf.Variable(tf.truncated_normal([10], stddev = 0.1))   
FMA = tf.nn.conv2d(Dropout, W4, strides=[1,1,1,1], padding='SAME')
LAST_RELU = tf.nn.relu(tf.add(FMA, b4))

Summary: [Conv-ReLU-Pool] - [Conv-ReLU-Pool] - [Conv-ReLU] - [Dropout] - [Conv-ReLU]

Define Loss, Accuracy

prediction = tf.squeeze(LAST_RELU) 
# Because FCN returns (1 x 1 x class_num) in training

loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction, labels=labels))
# Keyword arguments make the logits/labels order explicit

optimizer = tf.train.AdamOptimizer(0.001)    
train = optimizer.minimize(loss)

label_max = tf.argmax(labels, 1)
pred_max = tf.argmax(prediction, 1)
correct_pred = tf.equal(pred_max, label_max)
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

Training Model

sess = tf.Session()
sess.run(tf.global_variables_initializer())

for i in range(10000):
    image_batch, label_batch = mnist.train.next_batch(100)
    # Feed the flat [batch, 784] placeholder; `images` is only its reshaped view
    sess.run(train, feed_dict={images_flatten: image_batch, labels: label_batch, keep_prob: 0.8})
    if i % 10 == 0:
        tr = sess.run([loss, accuracy], feed_dict={images_flatten: image_batch, labels: label_batch, keep_prob: 1.0})
        print("Step %d, Loss %g, Accuracy %g" % (i, tr[0], tr[1]))

Loss: 0.784 (Approximately)

Accuracy: 94.8% (Approximately)

The problem is that training this model on MNIST data works very well, but on my own data the loss never changes (it stays at 0.6319) and the output layer is always 0.
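As a sanity check on the constant loss, here is a minimal sketch in plain Python (not TensorFlow) of what softmax cross-entropy evaluates to when every logit is 0. The helper name `softmax_cross_entropy` is made up for this illustration. For two classes the value is ln 2 ≈ 0.693, which is close to, but not the same as, the 0.6319 reported above, so treat it as a reference point rather than a diagnosis.

```python
import math

def softmax_cross_entropy(logits, one_hot_label):
    """Cross-entropy between softmax(logits) and a one-hot label (single example)."""
    # Numerically stable log-softmax: subtract the max before exponentiating.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    log_probs = [z - log_sum_exp for z in logits]
    return -sum(y * lp for y, lp in zip(one_hot_label, log_probs))

# All-zero logits give a uniform softmax, so the loss is ln(number of classes).
loss_two_classes = softmax_cross_entropy([0.0, 0.0], [1.0, 0.0])
loss_ten_classes = softmax_cross_entropy([0.0] * 10, [1.0] + [0.0] * 9)

print("2 classes: %.4f" % loss_two_classes)
print("10 classes: %.4f" % loss_ten_classes)
```

A loss that sits near ln(number of classes) and never moves is what one would expect from a network whose final layer emits the same value for every class.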

The code is identical except for the filter size of the third convolutional layer. That filter must have the same width and height as its input, which has already been downsampled by the two preceding pooling layers. That is why the filter size in this layer is [7,7] for MNIST.
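The filter-size reasoning can be checked with a little arithmetic. The sketch below is plain Python, not TensorFlow; the helper names `same_out` and `feature_map_size` are invented for illustration. It assumes the layout from the model above: two blocks of a stride-1 `SAME` convolution followed by a 2x2, stride-2 `SAME` max-pool, where each axis of a `SAME` op comes out as ceil(size / stride).

```python
import math

def same_out(size, stride):
    """Output length along one axis for padding='SAME': ceil(size / stride)."""
    return math.ceil(size / stride)

def feature_map_size(height, width, num_blocks=2, pool_stride=2):
    """Spatial size after `num_blocks` of [stride-1 SAME conv, stride-2 SAME pool]."""
    for _ in range(num_blocks):
        # The stride-1 SAME convolution leaves height and width unchanged.
        height, width = same_out(height, 1), same_out(width, 1)
        # The stride-2 SAME pooling halves each axis, rounding up.
        height, width = same_out(height, pool_stride), same_out(width, pool_stride)
    return height, width

mnist_size = feature_map_size(28, 28)     # input to the third conv layer for MNIST
custom_size = feature_map_size(128, 64)   # input to the third conv layer for 128 x 64 images
print(mnist_size, custom_size)
```

With a `VALID` convolution whose filter matches these sizes ([7, 7] for MNIST, [32, 16] for the 128 x 64 images), the third layer collapses each feature map to 1 x 1.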

What is wrong with my model?

The only code that differs between the two cases (MNIST and my own data) is:

Placeholder

My own images have shape (128 x 64 x 1), and the labels are 'eyes' and 'not_eyes'.

images = tf.placeholder(tf.float32, [None, 128, 64, 1])
labels = tf.placeholder(tf.int32, [None, 2])

3rd Convolutional Layer

W3 = tf.Variable(tf.truncated_normal([32, 16, 8, 16], stddev = 0.1))

Feeding (Batch)

image_data, label_data = input_data.get_batch(TRAINING_FILE, 10)

sess = tf.Session()
sess.run(tf.global_variables_initializer())
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)

for i in range(10000):
    image_batch, label_batch = sess.run([image_data, label_data])
    sess.run(train, feed_dict={images: image_batch, labels: label_batch, keep_prob: 0.8})
    if i % 10 == 0: ... # Validation part is almost same, too...

coord.request_stop()
coord.join(threads)

Here "input_data" is another Python file in the same directory, and "get_batch(TRAINING_FILE, 10)" is the function that returns a batch of data. The code is:

import re
import tensorflow as tf

def get_input_queue(txtfile_name):
    images = []
    labels = []

    for line in open(txtfile_name, 'r'):  # Each line of the txt file holds: image path, label, label number
        cols = re.split(',|\n', line)
        labels.append(int(cols[2]))
        images.append(tf.image.decode_jpeg(tf.read_file(cols[0]), channels = 1)) 

    input_queue = tf.train.slice_input_producer([images, labels], shuffle = True)
    return input_queue

def get_batch(txtfile_name, batch_size):
    input_queue = get_input_queue(txtfile_name)
    image = input_queue[0]
    label = input_queue[1]

    image = tf.reshape(image, [128, 64, 1])

    batch_image, batch_label = tf.train.batch([image, label], batch_size)
    batch_label_one_hot = tf.one_hot(tf.to_int64(batch_label), 2, on_value=1.0, off_value=0.0)
    return batch_image, batch_label_one_hot
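To make the parsing and one-hot steps concrete, here is a small plain-Python sketch of what `re.split(',|\n', line)` returns for one line and what a depth-2 one-hot encoding looks like. The sample line, its path, and the mapping of label number 1 to 'eyes' are assumptions made up for illustration; only the "path, label, label number" layout comes from the comment in the code above. The `one_hot` helper is a stand-in written for this example, not the TensorFlow op.

```python
import re

def one_hot(index, depth):
    """Plain-Python stand-in for a one-hot encoding with on=1.0 and off=0.0."""
    return [1.0 if i == index else 0.0 for i in range(depth)]

# Hypothetical line in the training txt file: image path, label, label number.
sample_line = "data/eye_0001.jpg,eyes,1\n"

# Same split pattern as get_input_queue: break on commas and on the newline.
cols = re.split(',|\n', sample_line)
print(cols)            # the trailing newline leaves an empty string at the end

label_number = int(cols[2])
print(one_hot(label_number, 2))
```

Note that the split leaves a trailing empty element, which is harmless here because only `cols[0]` and `cols[2]` are read.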

I can't see anything wrong with it. :( Please help me!

Solution

Are your inputs scaled appropriately? JPEG pixel values are in the [0, 255] range and need to be scaled to [-1, 1]. You can try:

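image = tf.reshape(image, [128, 64, 1])
image = tf.cast(image, tf.float32)       # decode_jpeg yields uint8; cast before float arithmetic
image = tf.scalar_mul((1.0/255), image)  # [0, 255] -> [0, 1]
image = tf.subtract(image, 0.5)          # [0, 1] -> [-0.5, 0.5]
image = tf.multiply(image, 2.0)          # [-0.5, 0.5] -> [-1, 1]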
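The effect of this scaling can be checked on individual pixel values without TensorFlow. The sketch below mirrors the same three arithmetic steps in plain Python; `scale_pixel` is a hypothetical helper named for this example only.

```python
def scale_pixel(value):
    """Map one pixel value from [0, 255] to [-1, 1] using the same steps as above."""
    x = float(value)       # work in floating point, not 8-bit integers
    x = x * (1.0 / 255)    # [0, 255]    -> [0, 1]
    x = x - 0.5            # [0, 1]      -> [-0.5, 0.5]
    x = x * 2.0            # [-0.5, 0.5] -> [-1, 1]
    return x

# Darkest, mid-grey, and brightest 8-bit values.
for v in (0, 128, 255):
    print(v, "->", round(scale_pixel(v), 4))
```

Centering the inputs around zero this way keeps them on a scale comparable to the `stddev = 0.1` weight initialization used in the model.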
