Model not learning in tensorflow


Problem description


I am new to tensorflow and neural networks, and I am trying to create a model that just multiplies two float values together.

I wasn't sure how many neurons I would want, but I picked 10 and tried to see where I could go from there. I figured that would probably introduce enough complexity to semi-accurately learn that operation.

Anyways, here is my code:

import tensorflow as tf
import numpy as np

# Teach how to multiply
def generate_data(how_many):
    data = np.random.rand(how_many, 2)
    answers = data[:, 0] * data[:, 1]
    return data, answers


sess = tf.InteractiveSession()

# Input data
input_data = tf.placeholder(tf.float32, shape=[None, 2])
correct_answers = tf.placeholder(tf.float32, shape=[None])

# Use 10 neurons--just one layer for now, but it'll be fully connected
weights_1 = tf.Variable(tf.truncated_normal([2, 10], stddev=.1))
bias_1 = tf.Variable(.1)


# Output of this will be a [None, 10]
hidden_output = tf.nn.relu(tf.matmul(input_data, weights_1) + bias_1)

# Weights
weights_2 = tf.Variable(tf.truncated_normal([10, 1], stddev=.1))

bias_2 = tf.Variable(.1)
# Softmax them together--this will be [None, 1]
calculated_output = tf.nn.softmax(tf.matmul(hidden_output, weights_2) + bias_2)

cross_entropy = tf.reduce_mean(correct_answers * tf.log(calculated_output))

optimizer = tf.train.GradientDescentOptimizer(.5).minimize(cross_entropy)

sess.run(tf.initialize_all_variables())

for i in range(1000):
    x, y = generate_data(100)
    sess.run(optimizer, feed_dict={input_data: x, correct_answers: y})

error = tf.reduce_sum(tf.abs(calculated_output - correct_answers))

x, y = generate_data(100)
print("Total Error: ", error.eval(feed_dict={input_data: x, correct_answers: y}))

It seems that the error is always around 7522.1, which is very, very bad for just 100 data points, so I assume it is not learning.

My questions: Is my machine learning? If so, what can I do to make it more accurate? If not, how can I make it learn?

Solution

There are a few major issues with the code. Aaron has already identified some of them, but there's another important one: calculated_output and correct_answers are not the same shape, so you're creating a 2D matrix when you subtract them. (The shape of calculated_output is (100, 1) and the shape of correct_answers is (100).) So you need to adjust the shape (for example, by using tf.squeeze on calculated_output).
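To make the shape issue concrete (a minimal NumPy sketch added here, not part of the original answer): subtracting tensors of shape (100, 1) and (100,) broadcasts to a (100, 100) matrix, so the summed error covers 10,000 entries instead of 100, which is why the reported total is in the thousands rather than the tens.

import numpy as np

calculated = np.random.rand(100, 1)   # same shape as calculated_output
correct = np.random.rand(100)         # same shape as correct_answers

# Broadcasting turns the intended element-wise difference into a full matrix.
print((calculated - correct).shape)              # (100, 100) -- unintended
print((np.squeeze(calculated) - correct).shape)  # (100,)     -- what we want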

This problem also doesn't really require any non-linearities, so you could get by with no activations and only one layer. The following code gets a total error of about 6 (~0.06 error on average for each test point). Hope that helps!

import tensorflow as tf
import numpy as np


# Teach how to multiply
def generate_data(how_many):
    data = np.random.rand(how_many, 2)
    answers = data[:, 0] * data[:, 1]
    return data, answers


sess = tf.InteractiveSession()

input_data = tf.placeholder(tf.float32, shape=[None, 2])
correct_answers = tf.placeholder(tf.float32, shape=[None])

weights_1 = tf.Variable(tf.truncated_normal([2, 1], stddev=.1))
bias_1 = tf.Variable(.0)

output_layer = tf.matmul(input_data, weights_1) + bias_1

mean_squared = tf.reduce_mean(tf.square(correct_answers - tf.squeeze(output_layer)))
optimizer = tf.train.GradientDescentOptimizer(.1).minimize(mean_squared)

sess.run(tf.initialize_all_variables())

for i in range(1000):
    x, y = generate_data(100)
    sess.run(optimizer, feed_dict={input_data: x, correct_answers: y})

error = tf.reduce_sum(tf.abs(tf.squeeze(output_layer) - correct_answers))

x, y = generate_data(100)
print("Total Error: ", error.eval(feed_dict={input_data: x, correct_answers: y}))
