Facing "No gradients for any variable" error while training a Siamese network


Problem description

I'm currently building a model on TensorFlow (version 1.8, OS: Ubuntu MATE 16.04). The model's purpose is to detect and match keypoints of the human body. During training, the error "No gradients for any variable" occurred, and I'm having difficulty fixing it.

Background of the model: Its basic ideas came from these two papers:

  1. Deep Learning of Binary Hash Codes for Fast Image Retrieval
  2. Learning Compact Binary Descriptors with Unsupervised Deep Neural Networks

They showed that images can be matched using hash codes generated by a convolutional network. The similarity of two pictures is determined by the Hamming distance between their corresponding hash codes.
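As a minimal illustration of the matching criterion, the sketch below computes the Hamming distance between two binary codes in plain Python. The function name `hamming_distance` and the 8-bit example codes are hypothetical and are not taken from the papers or from the model in this post.

```python
def hamming_distance(code_a, code_b):
    """Count the positions at which two equal-length bit lists differ."""
    if len(code_a) != len(code_b):
        raise ValueError("hash codes must have the same length")
    return sum(bit_a != bit_b for bit_a, bit_b in zip(code_a, code_b))


# Two hypothetical 8-bit codes (the model in this post uses 32 bits per code).
hash_a = [1, 0, 1, 1, 0, 0, 1, 0]
hash_b = [1, 1, 1, 0, 0, 0, 1, 1]
distance = hamming_distance(hash_a, hash_b)
print(distance)  # 3 positions differ
```

A smaller distance means the two codes, and therefore the two image patches, are considered more similar.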

I think it's possible to develop an extremely lightweight model to perform real-time human pose estimation on a video with a "constant human subject" and a "fixed background".


Model Structure

01.Data source:

3 images from one video with the same human subject and a similar background. Every human keypoint in each image is labeled. 2 of the images will be used as the "hint sources", and the last image will be the target for keypoint detection/matching.

02.Hints:

23x23-pixel ROIs will be cropped from the "hint source" images according to the locations of the human keypoints. The center of each ROI is a keypoint.

03.Convolutional network "for hints":

A simple 3-layer structure. The first two layers are convolutions with a 3x3 filter and a [2,2] stride. The last layer is a 5x5 convolution on a 5x5 input with no padding (equivalent to a fully connected layer).

This will turn a 23x23-pixel hint ROI into one 32-bit hash code. One hint source image will generate a set of 16 hash codes.

04.Convolutional network "for target image": The network shares the same weights with the hint network. But in this case, each convolution layer uses padding. The 301x301-pixel image will be turned into a 76x76 "hash map".
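The sizes quoted in 03 and 04 can be checked with a short calculation. The sketch below assumes TensorFlow's usual output-size rules for 'valid' and 'same' padding; the helper names `conv_output_size` and `trace_sizes` are hypothetical and exist only for this illustration.

```python
import math


def conv_output_size(size, kernel, stride, padding):
    """Spatial output size of a convolution under 'valid' or 'same' padding."""
    if padding == "valid":
        return (size - kernel) // stride + 1
    if padding == "same":
        return math.ceil(size / stride)
    raise ValueError("padding must be 'valid' or 'same'")


# (kernel, stride) for the three layers described above.
LAYERS = [(3, 2), (3, 2), (5, 1)]


def trace_sizes(size, padding):
    """Return the spatial size before the first layer and after each layer."""
    sizes = [size]
    for kernel, stride in LAYERS:
        size = conv_output_size(size, kernel, stride, padding)
        sizes.append(size)
    return sizes


hint_sizes = trace_sizes(23, "valid")    # hint ROI, no padding
target_sizes = trace_sizes(301, "same")  # target image, with padding
print(hint_sizes)    # [23, 11, 5, 1]
print(target_sizes)  # [301, 151, 76, 76]
```

The hint branch collapses each 23x23 ROI to a single spatial position, while the target branch keeps a 76x76 grid, one code per grid cell.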

05.Hash matching:

I made a function called "locateMin_and_get_loss" to calculate the Hamming distance between the "hint hash" and the hash code at each point of the hash map. This function will create a "distance map". The location of the point with the lowest distance value will be treated as the location of the keypoint.
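The matching step can be sketched in plain Python as follows. This is not the author's "locateMin_and_get_loss" function; the names `distance_map` and `locate_min` and the tiny 2x2 hash map are assumptions made for illustration.

```python
def distance_map(hint_code, hash_map):
    """Hamming distance between one hint code and every cell of a 2-D hash map."""
    return [
        [sum(h != c for h, c in zip(hint_code, cell)) for cell in row]
        for row in hash_map
    ]


def locate_min(dist_map):
    """Return (row, col) of the smallest distance; ties go to the first one found."""
    best = None
    for r, row in enumerate(dist_map):
        for c, value in enumerate(row):
            if best is None or value < best[0]:
                best = (value, r, c)
    return best[1], best[2]


# Hypothetical 2x2 hash map of 4-bit codes (the post uses 76x76 cells of 32 bits).
hash_map = [
    [[0, 0, 0, 0], [1, 1, 0, 0]],
    [[1, 0, 1, 1], [0, 1, 1, 0]],
]
hint = [1, 0, 1, 0]
dmap = distance_map(hint, hash_map)
location = locate_min(dmap)
print(dmap)      # [[2, 2], [1, 2]]
print(location)  # (1, 0)
```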

06.Loss calculation:

I made a function "get_total_loss_and_result" to calculate the total loss of the 16 keypoints. The loss is the normalized Euclidean distance between the ground truth label points and the points located by the model.
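One way such a loss could look is sketched below. The post does not say how the distance is normalized, so dividing by the diagonal of the 76x76 map is an assumption, and `normalized_distance` and `total_loss` are hypothetical names rather than the author's "get_total_loss_and_result".

```python
import math


def normalized_distance(predicted, truth, map_size):
    """Euclidean distance between two (row, col) points, divided by the
    map diagonal (an assumed normalization) so the result lies in [0, 1]."""
    dr = predicted[0] - truth[0]
    dc = predicted[1] - truth[1]
    diagonal = math.hypot(map_size - 1, map_size - 1)
    return math.hypot(dr, dc) / diagonal


def total_loss(predicted_points, truth_points, map_size=76):
    """Sum of the normalized distances over all keypoints."""
    return sum(
        normalized_distance(p, t, map_size)
        for p, t in zip(predicted_points, truth_points)
    )


# Hypothetical predictions and labels for three keypoints (the post uses 16).
predicted = [(10, 10), (30, 40), (0, 0)]
truth = [(10, 10), (33, 44), (75, 75)]
loss = total_loss(predicted, truth)
print(round(loss, 4))
```

A perfect prediction contributes 0 and a prediction in the opposite corner contributes 1 under this assumed normalization.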

07.Proposed workflow:

Before initializing this model, the user will take two pictures of the target human subject from different angles. The pictures will be labeled by state-of-the-art models such as OpenPose or DeepPose, and hint hashes will be generated from them with the convolutional network described in 03.

Finally, the video stream will be started and processed by the model.

08.Why "Two" sets of hints?

A human joint/keypoint observed from different angles can look very different. Instead of increasing the dimensionality of the neural network, I want to "cheat the game" by gathering two hints instead of one. I want to know whether this can increase the precision and generalization capacity of the model.


The problems I faced:

01.The "No gradients for any variable" error (the main question of this post):

As mentioned above, I'm facing this error while training the model. I tried to learn from several related posts, but I still have no clue, even after checking the computational graph.

02.The "Batch" problem:

Due to the model's unique structure, it's hard to use a conventional placeholder to hold the input data for multiple batches. I worked around this by setting the batch size to 3 and manually combining the values of the loss functions.

2018.10.28 Edit:

The simplified version with only one hint set:

import tensorflow as tf
import numpy as np
import time
from imageLoader import getPaddedROI,training_data_feeder
import math
'''
created by Cid Zhang 
a sub-model for human pose estimation
'''
def truncated_normal_var(name,shape,dtype):
    return(tf.get_variable(name=name, shape=shape, dtype=dtype, initializer=tf.truncated_normal_initializer(stddev=0.01)))
def zero_var(name,shape,dtype):
    return(tf.get_variable(name=name, shape=shape, dtype=dtype, initializer=tf.constant_initializer(0.0)))

roi_size = 23
image_input_size = 301

#input placeholders
#batch1 hints
inputs_b1h1 = tf.placeholder(tf.float32, ( 16, roi_size, roi_size, 3), name='inputs_b1h1')

inputs_s = tf.placeholder(tf.float32, (None, image_input_size, image_input_size, 3), name='inputs_s')
labels = tf.placeholder(tf.float32,(16,76,76), name='labels')

#define the model
def paraNet(input):
    out_l1 = tf.layers.conv2d(input, 8, [3, 3],strides=(2, 2), padding ='valid' ,name='para_conv_1')
    out_l1 = tf.nn.relu6(out_l1)
    out_l2 = tf.layers.conv2d(out_l1, 16, [3, 3],strides=(2, 2), padding ='valid' ,name='para_conv_2')
    out_l2 = tf.nn.relu6(out_l2)
    out_l3 = tf.layers.conv2d(out_l2, 32, [5, 5],strides=(1, 1), padding ='valid' ,name='para_conv_3')
    return out_l3

#network pipeline to create the first Hint Hash Sets (Three batches)
with tf.variable_scope('conv'):
    out_b1h1_l3 = paraNet(inputs_b1h1)
    #flatten and binarize the hashes
    out_b1h1_l3 =tf.squeeze(  tf.round(tf.nn.sigmoid(out_b1h1_l3)) )


with tf.variable_scope('conv', reuse=True):
    out_2_l1 = tf.layers.conv2d(inputs_s,  8, [3, 3],strides=(2, 2),     padding ='same' ,name='para_conv_1')
    out_2_l1 = tf.nn.relu6(out_2_l1)
    out_2_l2 = tf.layers.conv2d(out_2_l1, 16, [3, 3],strides=(2, 2), padding ='same' ,name='para_conv_2')
    out_2_l2 = tf.nn.relu6(out_2_l2)
    out_2_l3 = tf.layers.conv2d(out_2_l2, 32, [5, 5],strides=(1, 1), padding ='same' ,name='para_conv_3')
    #binarize the value into hash code

    out_2_l3 = tf.round(tf.nn.sigmoid(out_2_l3))

    orig_feature_map_size = tf.shape(out_2_l3)[1]

    #calculate Hamming distance maps
    map0 = tf.reduce_sum ( tf.abs (tf.subtract( out_b1h1_l3[0] , out_2_l3 ) ) , axis=3 )  
    map1 = tf.reduce_sum ( tf.abs (tf.subtract( out_b1h1_l3[1] , out_2_l3 ) ) , axis=3 )  
    map2 = tf.reduce_sum ( tf.abs (tf.subtract( out_b1h1_l3[2] , out_2_l3 ) ) , axis=3 )  
    map3 = tf.reduce_sum ( tf.abs (tf.subtract( out_b1h1_l3[3] , out_2_l3 ) ) , axis=3 )  
    map4 = tf.reduce_sum ( tf.abs (tf.subtract( out_b1h1_l3[4] , out_2_l3 ) ) , axis=3 )  
    map5 = tf.reduce_sum ( tf.abs (tf.subtract( out_b1h1_l3[5] , out_2_l3 ) ) , axis=3 )  
    map6 = tf.reduce_sum ( tf.abs (tf.subtract( out_b1h1_l3[6] , out_2_l3 ) ) , axis=3 )  
    map7 = tf.reduce_sum ( tf.abs (tf.subtract( out_b1h1_l3[7] , out_2_l3 ) ) , axis=3 )  
    map8 = tf.reduce_sum ( tf.abs (tf.subtract( out_b1h1_l3[8] , out_2_l3 ) ) , axis=3 )  
    map9 = tf.reduce_sum ( tf.abs (tf.subtract( out_b1h1_l3[9] , out_2_l3 ) ) , axis=3 )  
    map10 = tf.reduce_sum ( tf.abs (tf.subtract( out_b1h1_l3[10] , out_2_l3 ) ) , axis=3 )  
    map11 = tf.reduce_sum ( tf.abs (tf.subtract( out_b1h1_l3[11] , out_2_l3 ) ) , axis=3 )  
    map12 = tf.reduce_sum ( tf.abs (tf.subtract( out_b1h1_l3[12] , out_2_l3 ) ) , axis=3 )  
    map13 = tf.reduce_sum ( tf.abs (tf.subtract( out_b1h1_l3[13] , out_2_l3 ) ) , axis=3 )  
    map14 = tf.reduce_sum ( tf.abs (tf.subtract( out_b1h1_l3[14] , out_2_l3 ) ) , axis=3 )  
    map15 = tf.reduce_sum ( tf.abs (tf.subtract( out_b1h1_l3[15] , out_2_l3 ) ) , axis=3 )  

    totoal_map =tf.div( tf.concat([map0, map1, map2, map3, map4, map5, map6, map7,
                               map8, map9, map10,map11,map12, map13, map14, map15], 0) , 32)
    loss = tf.nn.l2_loss(totoal_map - labels  , name = 'loss'  )

#ValueError: No gradients provided for any variable, check your graph     for ops that do not support gradients, between variables 
    train_step = tf.train.GradientDescentOptimizer(0.01).minimize(loss )


init =  tf.global_variables_initializer()
batchsize = 3

with tf.Session() as sess:
    #writer = tf.summary.FileWriter("./variable_graph",graph = sess.graph)
    sess.run(init)

    #load image from dataset(train set)
    joint_data_path = "./custom_data.json"
    train_val_path = "./train_val_indices.json"
    imgpath = "./000/"
    input_size = 301
    hint_roi_size = 23

    hintSet01_norm_batch = []
    hintSet02_norm_batch = []
    t_img_batch = []
    t_label_norm_batch = []
    #load data
    hintSet01,hintSet02,t_img,t_label_norm = training_data_feeder(joint_data_path, train_val_path, imgpath, input_size, hint_roi_size )
    #Normalize the image pixel values to 0~1
    hintSet01_norm = []
    hintSet02_norm = []

    t_img = np.float32(t_img /255.0)

    for rois in hintSet01:
        tmp = np.float32(rois / 255.0)
        hintSet01_norm.append(tmp.tolist())
    for rois in hintSet02:
        tmp = np.float32(rois / 255.0)
        hintSet02_norm.append(tmp.tolist())

    print(tf.trainable_variables())

    temp = sess.run(totoal_map , feed_dict={inputs_s:  [t_img]  ,
                                            inputs_b1h1: hintSet01_norm,
                                            labels: t_label_norm
                                            })
    print(temp)
    print(np.shape(temp))

The code: https://github.com/gitpharm01/Parapose/blob/master/paraposeNetworkV3.py

The Tensorflow graph: https://github.com/gitpharm01/Parapose/blob/master/variable_graph/events.out.tfevents.1540296979.pharmboy-K30AD-M31AD-M51AD

The Dataset:

It's a custom dataset generated from the MPII dataset. It has 223 clusters of images. Each cluster has one constant human subject in various poses, and the background remains the same. Each cluster has at least 3 pictures. It's about 627MB, and I'll try to pack it and upload it later.

2018.10.26 Edit:

You can download it from Google Drive; the whole dataset is divided into 9 parts. (I can't post more than 8 links in this article.) The links are in this file: https://github.com/gitpharm01/Parapose/blob/master/000/readme.md

Solution

I used "eager execution", described in https://www.tensorflow.org/guide/eager, to check the gradients.

In the end, I found that "tf.round" and "tf.nn.relu6" erase the gradient or set it to zero.
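The effect can be illustrated without TensorFlow. The sketch below uses finite differences in plain Python, which is only an analogy for what automatic differentiation sees; the helper names are hypothetical. A hard rounding step is flat almost everywhere, so its slope is zero, and a relu6-style clamp is flat outside the range 0 to 6.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def binarize(x):
    """Hard binarization, analogous to rounding a sigmoid output to 0 or 1."""
    return float(round(sigmoid(x)))


def relu6(x):
    """Clamp to the range [0, 6], analogous to a relu6 activation."""
    return min(max(x, 0.0), 6.0)


def finite_difference(f, x, eps=1e-4):
    """Central-difference estimate of df/dx."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


points = [-2.0, -0.5, 0.7, 3.0]
smooth_slopes = [finite_difference(sigmoid, x) for x in points]
hard_slopes = [finite_difference(binarize, x) for x in points]
relu6_slopes = [finite_difference(relu6, x) for x in (-1.0, 3.0, 7.0)]

print(smooth_slopes)  # all strictly positive
print(hard_slopes)    # all 0.0: the step function is flat away from x = 0
print(relu6_slopes)   # zero below 0 and above 6, about 1 in between
```

With a zero slope at the binarization step, no learning signal can reach the convolution weights behind it, whatever the loss looks like.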

I made some modifications to the code, and now I can enter the training phase:

import tensorflow as tf
import numpy as np
import time
from imageLoader import getPaddedROI,training_data_feeder
import math
import cv2
'''
created by Cid Zhang 
a sub-model for human pose estimation
'''
tf.reset_default_graph()

def truncated_normal_var(name,shape,dtype):
    return(tf.get_variable(name=name, shape=shape, dtype=dtype, initializer=tf.truncated_normal_initializer(stddev=0.01)))
def zero_var(name,shape,dtype):
    return(tf.get_variable(name=name, shape=shape, dtype=dtype, initializer=tf.constant_initializer(0.0)))

roi_size = 23
image_input_size = 301

#input placeholders
#batch1 hints
inputs_b1h1 = tf.placeholder(tf.float32, ( 16, roi_size, roi_size, 3), name='inputs_b1h1')
#inputs_b1h2 = tf.placeholder(tf.float32, ( 16, roi_size, roi_size, 3), name='inputs_b1h2')


inputs_s = tf.placeholder(tf.float32, (None, image_input_size, image_input_size, 3), name='inputs_s')
labels = tf.placeholder(tf.float32,(16,76,76), name='labels')

#define the model

def paraNet(inputs, inputs_s):
    with tf.variable_scope('conv'):
        out_l1 = tf.layers.conv2d(inputs, 16, [3, 3],strides=(2, 2), padding ='valid' ,name='para_conv_1')
        out_l1r = tf.nn.relu(out_l1)
        out_l2 = tf.layers.conv2d(out_l1r, 48, [3, 3],strides=(2, 2), padding ='valid' ,name='para_conv_2')
        out_l2r = tf.nn.relu(out_l2)
        out_l3 = tf.layers.conv2d(out_l2r, 96, [5, 5],strides=(1, 1), padding ='valid' ,name='para_conv_3')
        out_l3r = tf.nn.relu(out_l3)
        out_l4 = tf.layers.conv2d(out_l3r, 32, [1, 1],strides=(1, 1), padding ='valid' ,name='para_conv_4')
        out_l4r = tf.squeeze(  tf.sign( tf.sigmoid(out_l4) ) )

    with tf.variable_scope('conv', reuse=True):
        out_2_l1 = tf.layers.conv2d(inputs_s,  16, [3, 3],strides=(2, 2), padding ='same' ,name='para_conv_1')
        out_2_l1r = tf.nn.relu(out_2_l1)
        out_2_l2 = tf.layers.conv2d(out_2_l1r, 48, [3, 3],strides=(2, 2), padding ='same' ,name='para_conv_2')
        out_2_l2r = tf.nn.relu(out_2_l2)
        out_2_l3 = tf.layers.conv2d(out_2_l2r, 96, [5, 5],strides=(1, 1), padding ='same' ,name='para_conv_3')
        out_2_l3r = tf.nn.relu(out_2_l3)
        out_2_l4 = tf.layers.conv2d(out_2_l3r, 32, [1, 1],strides=(1, 1), padding ='same' ,name='para_conv_4')
        out_2_l4r =tf.sign( tf.sigmoid(out_2_l4))
    return out_l4r , out_2_l4r  

def lossFunc(inputs_hint, inputs_sample, labels):    
    hint, sample = paraNet(inputs_hint, inputs_sample)

    map0 = tf.reduce_sum ( tf.abs (tf.subtract( hint[0] , sample ) ) , axis=3 )  
    map1 = tf.reduce_sum ( tf.abs (tf.subtract( hint[1] , sample ) ) , axis=3 )  
    map2 = tf.reduce_sum ( tf.abs (tf.subtract( hint[2] , sample ) ) , axis=3 )  
    map3 = tf.reduce_sum ( tf.abs (tf.subtract( hint[3] , sample ) ) , axis=3 )  
    map4 = tf.reduce_sum ( tf.abs (tf.subtract( hint[4] , sample ) ) , axis=3 )  
    map5 = tf.reduce_sum ( tf.abs (tf.subtract( hint[5] , sample ) ) , axis=3 )  
    map6 = tf.reduce_sum ( tf.abs (tf.subtract( hint[6] , sample ) ) , axis=3 )  
    map7 = tf.reduce_sum ( tf.abs (tf.subtract( hint[7] , sample ) ) , axis=3 )  
    map8 = tf.reduce_sum ( tf.abs (tf.subtract( hint[8] , sample ) ) , axis=3 )  
    map9 = tf.reduce_sum ( tf.abs (tf.subtract( hint[9] , sample ) ) , axis=3 )  
    map10 = tf.reduce_sum ( tf.abs (tf.subtract( hint[10] , sample ) ) , axis=3 )  
    map11 = tf.reduce_sum ( tf.abs (tf.subtract( hint[11] , sample ) ) , axis=3 )  
    map12 = tf.reduce_sum ( tf.abs (tf.subtract( hint[12] , sample ) ) , axis=3 )  
    map13 = tf.reduce_sum ( tf.abs (tf.subtract( hint[13] , sample ) ) , axis=3 )  
    map14 = tf.reduce_sum ( tf.abs (tf.subtract( hint[14] , sample ) ) , axis=3 )  
    map15 = tf.reduce_sum ( tf.abs (tf.subtract( hint[15] , sample ) ) , axis=3 )  

    totoal_map =tf.div( tf.concat([map0, map1, map2, map3, map4, map5, map6, map7,
                               map8, map9, map10,map11,map12, map13, map14, map15], 0) , 64)
    loss = tf.nn.l2_loss( totoal_map -  labels , name = 'loss'  )
    return loss, totoal_map

loss, totoal_map = lossFunc(inputs_b1h1, inputs_s, labels)
train_step = tf.train.GradientDescentOptimizer(2.0).minimize(loss)

#init =  tf.global_variables_initializer()

saver = tf.train.Saver()

with tf.Session() as sess:
    #writer = tf.summary.FileWriter("./variable_graph",graph = sess.graph)
    #sess.run(init)

    #load image from dataset(train set)
    joint_data_path = "./custom_data.json"
    train_val_path = "./train_val_indices.json"
    imgpath = "./000/"
    input_size = 301
    hint_roi_size = 23
    '''
    #load data
    hintSet01,hintSet02,t_img,t_label_norm = training_data_feeder(joint_data_path,     train_val_path, imgpath, input_size, hint_roi_size )
    #Normalize the image pixel values to 0~1
    hintSet01_norm = []
    hintSet02_norm = []

    t_img =[ np.float32(t_img /255.0) ]
    #print(type(t_img))
    #print(np.shape(t_img))
    #print(type(t_label_norm))
    for rois in hintSet01:
        tmp = np.float32(rois / 255.0)
        hintSet01_norm.append(tmp.tolist())
    for rois in hintSet02:
        tmp = np.float32(rois / 255.0)
        hintSet02_norm.append(tmp.tolist())

    loss_value , total_map_value = sess.run ([loss, totoal_map], feed_dict = {inputs_s:  t_img, 
                                                                                                  inputs_b1h1: hintSet01_norm, 
                                                                          labels:     t_label_norm
                                                                          })
    print("-----loss value:",loss_value)
    print("-----total_map_value:", total_map_value[0,0] )
    print("-----label_value", t_label_norm[0,0] )
    #cv2.imshow("t_img",t_img[0])
    #for img in t_label_norm:
    #    print(img)
    #    cv2.imshow("hint", img)
    #    cv2.waitKey(0)

    #print(tf.trainable_variables())
    #print(hash_set01)
    #print(out_2_l3)
    '''
    saver.restore(sess, "./temp_model/model4.ckpt")


    for i in range(1000):

        #load data
        hintSet01,hintSet02,t_img,t_label_norm = training_data_feeder(joint_data_path, train_val_path, imgpath, input_size, hint_roi_size )
        #Normalize the image pixel values to 0~1
        hintSet01_norm = []
        hintSet02_norm = []

        t_img =[ np.float32(t_img /255.0) ]
        #print(type(t_img))
        #print(np.shape(t_img))
        #print(type(t_label_norm))
        for rois in hintSet01:
            tmp = np.float32(rois / 255.0)
            hintSet01_norm.append(tmp.tolist())
        for rois in hintSet02:
            tmp = np.float32(rois / 255.0)
            hintSet02_norm.append(tmp.tolist())
        loss_val, _ = sess.run([loss, train_step] , 
                      feed_dict = {inputs_s:  t_img, 
                                   inputs_b1h1: hintSet01_norm, 
                                   labels: t_label_norm })
        if i % 50 == 0:
            print(loss_val)

    save_path = saver.save(sess, "./temp_model/model" + '5' + ".ckpt")
    #print(temp)
    #print(np.shape(temp))

Unfortunately, the loss value did not decrease during training.
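One possible contributor, offered as an observation rather than a verified diagnosis of this run: a sigmoid output is mathematically always greater than zero, so taking its sign yields 1 for every input. The plain-Python sketch below mirrors the composition used in the modified code; the "centered" variant is a hypothetical alternative that at least produces different codes, although it is still piecewise constant and therefore still has a zero slope.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def sign(x):
    """Return -1, 0, or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


inputs = [-10.0, -1.0, 0.0, 1.0, 10.0]
codes = [sign(sigmoid(x)) for x in inputs]
centered_codes = [sign(sigmoid(x) - 0.5) for x in inputs]

print(codes)           # [1, 1, 1, 1, 1]
print(centered_codes)  # [-1, -1, 0, 1, 1]
```

If every code is the same constant, every distance map is zero and the loss cannot respond to changes in the weights.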

I think there are still some bugs in the code. The saved checkpoint files are always named "XXXX.ckpt.data-00000-of-00001" no matter how many iterations I set.
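On the file name: the "data-00000-of-00001" suffix appears to describe the shard index and shard count of the checkpoint data, not the number of training iterations, so it would be expected to stay the same. `tf.train.Saver.save` accepts a `global_step` argument that appends the step number to the checkpoint prefix. The helper below is hypothetical and only mimics the naming pattern; it does not call TensorFlow.

```python
def checkpoint_data_filename(prefix, shard=0, num_shards=1, global_step=None):
    """Build a name following the '<prefix>.data-SSSSS-of-NNNNN' pattern.

    Illustrative only: this imitates the naming scheme and writes no files.
    """
    if global_step is not None:
        prefix = "{}-{}".format(prefix, global_step)
    return "{}.data-{:05d}-of-{:05d}".format(prefix, shard, num_shards)


without_step = checkpoint_data_filename("./temp_model/model5.ckpt")
with_step = checkpoint_data_filename("./temp_model/model5.ckpt", global_step=1000)

print(without_step)  # ./temp_model/model5.ckpt.data-00000-of-00001
print(with_step)     # ./temp_model/model5.ckpt-1000.data-00000-of-00001
```

In the training script, this would correspond to passing the iteration counter as `global_step` when calling `saver.save`.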

I'll make another post about it since the main problem of this post is solved.
