Tensorflow: Simple 3D Convnet not learning


Problem description


I am trying to create a simple 3D U-net for image segmentation, just to learn how to use the layers. Therefore I do a 3D convolution with stride 2 and then a transpose deconvolution to get back the same image size. I am also overfitting to a small set (test set) just to see if my network is learning.

I created the same net in Keras and it works just fine. Now I want to create it in tensorflow, but I have been having trouble with it.
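
For reference, a minimal sketch of what such a one-level conv / transposed-conv pair might look like in Keras (purely illustrative, with an assumed 64x64x64 single-channel input; this is not the asker's actual Keras code):

# Hypothetical sketch: one stride-2 Conv3D followed by a stride-2 Conv3DTranspose
# that restores the input size, trained with a per-voxel binary cross-entropy.
from keras.models import Sequential
from keras.layers import Conv3D, Conv3DTranspose

model = Sequential([
    Conv3D(16, (3, 3, 3), strides=(2, 2, 2), padding='same',
           activation='relu', input_shape=(64, 64, 64, 1)),   # assumed input volume
    Conv3DTranspose(1, (3, 3, 3), strides=(2, 2, 2), padding='same',
                    activation='sigmoid'),                     # back to the input size
])
model.compile(optimizer='adam', loss='binary_crossentropy')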

The cost changes slightly, but no matter what I do (reduce the learning rate, add more epochs, add more layers, change the batch size...), the output is always the same. I believe the net is not updating the weights. I am sure I am doing something wrong, but I can't find what it is. Any help would be greatly appreciated.

Here is my code:

def forward_propagation(X):

    if (mode == 'train'): print(" --------- Net --------- ")  # 'mode' is assumed to be a global flag

    # Convolutional Layer 1
    with tf.variable_scope('CONV1'):
        # tf.layers.conv3d expects 'kernel_size', not 'kernel'
        Z1 = tf.layers.conv3d(X, filters=16, kernel_size=[3, 3, 3], strides=[2, 2, 2], padding='SAME', name='S2/conv3d')
        A1 = tf.nn.relu(Z1, name='S2/ReLU')
        if (mode == 'train'): print("Convolutional Layer 1 S2 " + str(A1.get_shape()))

    # DEConvolutional Layer 1
    with tf.variable_scope('DeCONV1'):
        output_deconv1 = tf.stack([X.get_shape()[0], X.get_shape()[1], X.get_shape()[2], X.get_shape()[3], 1])  # currently unused
        # tf.layers.conv3d_transpose takes filters/kernel_size; tf.nn.conv3d_transpose has a different signature
        dZ1 = tf.layers.conv3d_transpose(A1, filters=1, kernel_size=[3, 3, 3], strides=[2, 2, 2], padding='SAME', name='S2/conv3d_transpose')
        dA1 = tf.nn.relu(dZ1, name='S2/ReLU')

        if (mode == 'train'): print("Deconvolutional Layer 1 S1 " + str(dA1.get_shape()))

    return dA1


def compute_cost(output, target, method = 'dice_hard_coe'):

    with tf.variable_scope('COST'):       

        if (method == 'sigmoid_cross_entropy') :
            # Make them vectors
            output = tf.reshape( output, [-1, output.get_shape().as_list()[0]] )
            target = tf.reshape( target, [-1, target.get_shape().as_list()[0]] )
            loss = tf.nn.sigmoid_cross_entropy_with_logits(logits = output, labels = target)
            cost = tf.reduce_mean(loss)

    return cost

and the main function for the model:

def model(X_h5, Y_h5, learning_rate = 0.009,
          num_epochs = 100, minibatch_size = 64, print_cost = True):


    ops.reset_default_graph()                         # to be able to rerun the model without overwriting tf variables
    #tf.set_random_seed(1)                             # to keep results consistent (tensorflow seed)
    #seed = 3                                          # to keep results consistent (numpy seed)
    (m, n_D, n_H, n_W, num_channels) = X_h5["test_data"].shape   #TTT          
    num_labels = Y_h5["test_mask"].shape[4] #TTT
    img_size = Y_h5["test_mask"].shape[1]  #TTT
    costs = []                                        # To keep track of the cost
    accuracies = []                                   # To keep track of the accuracy



    # Create Placeholders of the correct shape
    X, Y = create_placeholders(n_H, n_W, n_D, minibatch_size)

    # Forward propagation: Build the forward propagation in the tensorflow graph
    nn_output = forward_propagation(X)
    prediction = tf.nn.sigmoid(nn_output)

    # Cost function: Add cost function to tensorflow graph
    cost_method = 'sigmoid_cross_entropy' 
    cost = compute_cost(nn_output, Y, cost_method)

    # Backpropagation: Define the tensorflow optimizer. Use an AdamOptimizer that minimizes the cost.
    optimizer = tf.train.AdamOptimizer(learning_rate = learning_rate).minimize(cost)

    # Initialize all the variables globally
    init = tf.global_variables_initializer()


    # Start the session to compute the tensorflow graph
    with tf.Session() as sess:

        print('------ Training ------')

        # Run the initialization
        tf.local_variables_initializer().run(session=sess)
        sess.run(init)

        # Do the training loop
        for i in range(num_epochs*m):
            # ----- TRAIN -------
            current_epoch = i//m            

            patient_start = i-(current_epoch * m)
            patient_end = patient_start + minibatch_size

            current_X_train = np.zeros((minibatch_size, n_D,  n_H, n_W,num_channels))
            current_X_train[:,:,:,:,:] = np.array(X_h5["test_data"][patient_start:patient_end,:,:,:,:]) #TTT
            current_X_train = np.nan_to_num(current_X_train) # make nan zero

            current_Y_train = np.zeros((minibatch_size, n_D, n_H, n_W, num_labels))
            current_Y_train[:,:,:,:,:] = np.array(Y_h5["test_mask"][patient_start:patient_end,:,:,:,:]) #TTT
            current_Y_train = np.nan_to_num(current_Y_train) # make nan zero

            feed_dict = {X: current_X_train, Y: current_Y_train}
            _ , temp_cost = sess.run([optimizer, cost], feed_dict=feed_dict)

            # ----- TEST -------
            # Evaluate on the test set and print the cost 5 times over the whole training run
            if ((i % (num_epochs*m/5) )== 0):              

                # Calculate the predictions
                test_predictions = np.zeros(Y_h5["test_mask"].shape)

                for j in range(0, X_h5["test_data"].shape[0], minibatch_size):

                    patient_start = j
                    patient_end = patient_start + minibatch_size

                    current_X_test = np.zeros((minibatch_size, n_D,  n_H, n_W, num_channels))
                    current_X_test[:,:,:,:,:] = np.array(X_h5["test_data"][patient_start:patient_end,:,:,:,:])
                    current_X_test = np.nan_to_num(current_X_test) # make nan zero

                    current_Y_test = np.zeros((minibatch_size, n_D, n_H, n_W, num_labels))
                    current_Y_test[:,:,:,:,:] = np.array(Y_h5["test_mask"][patient_start:patient_end,:,:,:,:]) 
                    current_Y_test = np.nan_to_num(current_Y_test) # make nan zero

                    feed_dict = {X: current_X_test, Y: current_Y_test}
                    _, current_prediction = sess.run([cost, prediction], feed_dict=feed_dict)
                    test_predictions[j:j + minibatch_size,:,:,:,:] = current_prediction

                costs.append(temp_cost)
                print ("[" + str(current_epoch) + "|" + str(num_epochs) + "] " + "Cost : " + str(costs[-1]))
                display_progress(X_h5["test_data"], Y_h5["test_mask"], test_predictions, 5, n_H, n_W)

        # plot the cost
        plt.plot(np.squeeze(costs))
        plt.ylabel('cost')
        plt.xlabel('epochs')
        plt.show()

        return  

I call the model with:

model(hdf5_data_file, hdf5_mask_file, num_epochs = 500, minibatch_size = 1, learning_rate = 1e-3)

These are the results that I am currently getting:

Edit: I have tried reducing the learning rate and it doesn't help. I also tried debugging with TensorBoard, and the weights are not being updated:

I am not sure why this is happening. I created the same simple model in keras and it works fine. I am not sure what I am doing wrong in tensorflow.
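
One quick sanity check, independent of TensorBoard, is to read a trainable variable before and after a single optimizer step and compare. This is a minimal sketch that assumes it is placed inside the training loop of model(), where sess, optimizer, and feed_dict are in scope:

# Hypothetical sketch: verify that trainable variables actually change after one update step.
import numpy as np
import tensorflow as tf

var = tf.trainable_variables()[0]            # e.g. the CONV1 kernel
before = sess.run(var)
sess.run(optimizer, feed_dict=feed_dict)     # one training step
after = sess.run(var)
print("max weight change:", np.max(np.abs(after - before)))   # ~0.0 means nothing is learning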

Solution

Not sure if you are still looking for help, as I am answering this question half a year after your posted date. :) I've listed my observations and also some suggestions for you to try below. If my primary observation is right... then you probably just need a coffee break / a night of good sleep.

primary observation:

  • tf.reshape(output, [-1, output.get_shape().as_list()[0]]) seems wrong. If you prefer to flatten the vector, it should be something like tf.reshape(output, [-1, np.prod(image_shape_list)]) (see the sketch below).
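
To make that suggestion concrete, below is a minimal sketch of a cost function that flattens each example to a vector before the sigmoid cross-entropy; compute_cost_flat and image_shape_list are illustrative names (e.g. [n_D, n_H, n_W, num_labels]), not part of the original code:

# Minimal sketch (assumed names): flatten each example before the per-voxel loss.
import numpy as np
import tensorflow as tf

def compute_cost_flat(output, target, image_shape_list):
    with tf.variable_scope('COST'):
        flat_dim = int(np.prod(image_shape_list))       # voxels * labels per example
        output = tf.reshape(output, [-1, flat_dim])     # shape: (batch, flat_dim)
        target = tf.reshape(target, [-1, flat_dim])
        loss = tf.nn.sigmoid_cross_entropy_with_logits(logits=output, labels=target)
        return tf.reduce_mean(loss)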

other observations:

  • With such a shallow network, I doubt the network has enough spatial resolution to differentiate tumor voxels from non-tumor voxels. Can you show the keras implementation and its performance compared to the pure tf implementation? I would probably go with 2+ layers; say, with 3 layers, each with a stride of 2 and an input image width of 256, you will end up with a width of 32 at your deepest encoder layer (see the sketch after this list). (If you have limited GPU memory, downsample the input image.)
  • If changing the loss computation does not work, as @bremen_matt mentioned, reduce the LR to, say, 1e-5.
  • After the basic architecture tweaks, once you "feel" that the network is sort of learning and not stuck, try augmenting the training data, adding dropout and batch norm during training, and then maybe fancy up your loss by adding a discriminator.
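
As a rough illustration of the deeper-architecture suggestion above, here is a sketch of a 3-level stride-2 encoder and a matching decoder (layer names, filter counts, and the 256-wide input are assumptions, not the asker's model); adding skip connections from the encoder levels to the decoder would turn this into a proper U-Net:

# Illustrative sketch: 3 stride-2 levels shrink a 256-wide volume to 32 (256 -> 128 -> 64 -> 32),
# and the decoder mirrors them back up to the input resolution.
import tensorflow as tf

def encoder(X):
    e1 = tf.layers.conv3d(X, filters=16, kernel_size=3, strides=2, padding='SAME',
                          activation=tf.nn.relu, name='enc1')   # width 256 -> 128
    e2 = tf.layers.conv3d(e1, filters=32, kernel_size=3, strides=2, padding='SAME',
                          activation=tf.nn.relu, name='enc2')   # 128 -> 64
    e3 = tf.layers.conv3d(e2, filters=64, kernel_size=3, strides=2, padding='SAME',
                          activation=tf.nn.relu, name='enc3')   # 64 -> 32
    return e1, e2, e3

def decoder(e3):
    d2 = tf.layers.conv3d_transpose(e3, filters=32, kernel_size=3, strides=2, padding='SAME',
                                    activation=tf.nn.relu, name='dec2')   # 32 -> 64
    d1 = tf.layers.conv3d_transpose(d2, filters=16, kernel_size=3, strides=2, padding='SAME',
                                    activation=tf.nn.relu, name='dec1')   # 64 -> 128
    logits = tf.layers.conv3d_transpose(d1, filters=1, kernel_size=3, strides=2, padding='SAME',
                                        name='logits')                    # 128 -> 256
    return logits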
