Trained TensorFlow model always outputs zero
Question
I am training an autonomous driving convolutional neural network in TensorFlow. It is a simple regression network that takes an image and outputs a single value (a steering angle).
Here is the function that defines the network:
import tensorflow as tf
# Imports assumed from the tf.contrib.learn API (TensorFlow 1.x); they are not shown in the original snippet.
from tensorflow.contrib import learn
from tensorflow.contrib.learn.python.learn.estimators import model_fn as model_fn_lib


def cnn_model_fn(features, labels, mode):
    # Convolutional layer: 32 5x5 filters with ReLU activation.
    conv1 = tf.layers.conv2d(
        inputs=features,
        filters=32,
        kernel_size=5,
        padding="same",
        activation=tf.nn.relu
    )
    # 2x2 max pooling halves the spatial dimensions.
    pool1 = tf.layers.max_pooling2d(
        inputs=conv1,
        pool_size=2,
        strides=2
    )
    # Flatten the pooled feature map (2764800 values per example).
    pool1_flat = tf.reshape(pool1, [-1, 2764800])
    dense1 = tf.layers.dense(
        inputs=pool1_flat,
        units=128,
        activation=tf.nn.relu
    )
    # Dropout is only active during training.
    dropout = tf.layers.dropout(
        inputs=dense1,
        rate=0.4,
        training=mode == learn.ModeKeys.TRAIN
    )
    # Output layer: a single value (the steering angle); the ReLU activation keeps it non-negative.
    dense2 = tf.layers.dense(
        inputs=dropout,
        units=1,
        activation=tf.nn.relu
    )
    predictions = tf.reshape(dense2, [-1])

    loss = None
    train_op = None
    if mode != learn.ModeKeys.INFER:
        loss = tf.losses.mean_squared_error(
            labels=labels,
            predictions=predictions
        )
    if mode == learn.ModeKeys.TRAIN:
        train_op = tf.contrib.layers.optimize_loss(
            loss=loss,
            global_step=tf.contrib.framework.get_global_step(),
            learning_rate=0.001,
            optimizer="SGD"
        )
    return model_fn_lib.ModelFnOps(
        mode=mode,
        predictions=predictions,
        loss=loss,
        train_op=train_op
    )
Elsewhere in the program, I initiate the classifier's training like so:
def main(_):
    # Gather data
    images, labels = get_data("./data/labels.csv")

    # Create the estimator
    classifier = learn.Estimator(
        model_fn=cnn_model_fn,
        model_dir="/tmp/network2"
    )

    # Train the model
    classifier.fit(
        x=images,
        y=labels,
        batch_size=10,
        steps=20
    )

    for v in tf.trainable_variables():
        print(v)
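A quick way to confirm the behaviour in the title is to run the trained estimator back over the training images and look at the spread of its predictions. This check is not in the original code; a minimal sketch, assuming the classifier and images objects from main() are still in scope:

import numpy as np

# If every prediction is 0.0, the output has collapsed rather than merely being inaccurate.
preds = np.array(list(classifier.predict(x=images, as_iterable=True)))
print("predictions min/mean/max:", preds.min(), preds.mean(), preds.max())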
labels is a simple one-dimensional NumPy array containing all of the steering angles for the training examples. They are read from a CSV file. The values in the file are quite close to 0, with an average of around zero.
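Since the behaviour described below depends on how the labels are shifted and scaled, it helps to print their basic statistics right after loading. A minimal sketch, assuming labels is the NumPy array returned by get_data:

import numpy as np

# Basic statistics of the steering angles read from the CSV file.
print("labels min/mean/max:", labels.min(), labels.mean(), labels.max())
print("fraction of negative labels:", np.mean(labels < 0))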
When they are read directly from the file, or multiplied by a scalar, the network converges reasonably well and achieves a low loss. When I add a constant to them, it fails to converge, or diverges. I suspect that all of the weights of the network are converging to zero.
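One way to test that suspicion is to read the learned values back from the estimator's checkpoint; the loop over tf.trainable_variables() above only prints the Variable objects, not their values. A minimal sketch using the get_variable_names/get_variable_value methods of tf.contrib.learn.Estimator:

import numpy as np

# Print the average magnitude of every kernel and bias stored in /tmp/network2.
# Values near zero across the board would confirm that the weights have collapsed.
for name in classifier.get_variable_names():
    if "kernel" in name or "bias" in name:
        value = classifier.get_variable_value(name)
        print(name, "mean |w| =", np.mean(np.abs(value)))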
Does anybody see any problems with my methodology?
Answer
That dropout regularization might be the culprit:
dropout = tf.layers.dropout(
    inputs=dense1,
    rate=0.4,
    training=mode == learn.ModeKeys.TRAIN
)
What you are describing, with the weights failing to converge adequately or collapsing toward zero, is characteristic of a high-bias (underfitting) problem. Removing or reducing the regularization, adding more parameters to the network, or otherwise increasing its variance are common ways to fix this kind of problem.
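Following that suggestion, the simplest experiment is to lower the dropout rate (or remove the layer entirely) and, if the model still underfits, widen the dense layer. A minimal sketch of the relevant part of cnn_model_fn; the 256 units and 0.1 rate are illustrative values, not tuned recommendations:

dense1 = tf.layers.dense(
    inputs=pool1_flat,
    units=256,  # more parameters than the original 128
    activation=tf.nn.relu
)
dropout = tf.layers.dropout(
    inputs=dense1,
    rate=0.1,  # reduced from 0.4; pass dense1 straight to the next layer to drop dropout altogether
    training=mode == learn.ModeKeys.TRAIN
)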