TensorFlow: Why is my code running slower and slower?


Question

I am new to TensorFlow. The following code runs successfully, without any errors. For the first 10 lines of output the computation is fast, and the output (printed by the last line) flies by line after line. However, as the iterations go on, the computation becomes slower and slower, and eventually intolerably slow. So I wonder whether there are any modifications that can speed this up.

Here is a brief description of the code: it applies a neural network with a single hidden layer to the dataset. It aims to find the best values for rate[0] and rate[1], which are parameters that affect the loss function. During each step of training, one tuple is fed to the model and the accuracy on that tuple is evaluated immediately (in the real world this kind of data arrives as a stream).

import tensorflow as tf
import numpy as np

n_hidden=50
n_input=37
n_output=2
data_raw=np.genfromtxt(r'data.csv',delimiter=",",dtype=None)
data_info=np.genfromtxt(r'data2.csv',delimiter=",",dtype=None)

def pre_process( tuple):
    ans = []
    temp = [0 for i in range(24)]
    temp[int(tuple[0])] = 1
    # np.append(ans,np.array(temp))
    ans.extend(temp)
    temp = [0 for i in range(7)]
    temp[int(tuple[1]) - 1] = 1
    ans.extend(temp)
    # np.append(ans,np.array(temp))
    temp = [0 for i in range(3)]
    temp[int(tuple[3])] = 1
    ans.extend(temp)
    temp = [0 for i in range(2)]
    temp[int(tuple[4])] = 1
    ans.extend(temp)
    ans.extend([int(tuple[5])])
    return np.array(ans)

x=tf.placeholder(tf.float32, shape=[1,n_input])
y_=tf.placeholder(tf.float32,shape=[n_output])
y_r=tf.placeholder(tf.float32,shape=[n_output])
W1=tf.Variable(tf.random_uniform([n_input, n_hidden]))
b1=tf.Variable(tf.zeros([n_hidden]))
W2=tf.Variable(tf.zeros([n_hidden,n_output]))
b2=tf.Variable(tf.zeros([n_output]))

logits_1 = tf.matmul(x, W1) + b1
relu_layer= tf.nn.relu(logits_1)
logits_2 = tf.matmul(relu_layer, W2) + b2

correct_prediction = tf.equal(tf.argmax(logits_2,1), tf.argmax(y_,0))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

rate=[0,0]
for i in range(-100,200,10):
    rate[0]=i;
    for j in range(-100,i,10):
        rate[1]=j
        loss=tf.reduce_sum(tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=logits_2)*[rate[0],rate[1]])
#       loss2=tf.reduce_sum(tf.nn.softmax_cross_entropy_with_logits(labels=y_r, logits=logits_2)*[rate[2],rate[3]])
#       loss=loss1+loss2
        train_step = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
        data_line=1

        accur=0
        local_local=0
        remote_remote=0
        local_remote=0
        remote_local=0
        total=0
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            for i in range(200):
#               print(int(data_raw[data_line][0]),data_info[i][0])
                if i>100:
                    total+=1
                if int(data_raw[data_line][0])==data_info[i][0]:
                    sess.run(train_step,feed_dict={x:pre_process(data_info[i]).reshape(1,-1),y_:[1,0],y_r:[0,1]})
#                   print(sess.run(logits_2,{x:pre_process(data_info[i]).reshape(1,-1), y_: #[1,0]}))
                    data_line+=1;
                    if data_line==len(data_raw):
                        break
                    if i>100:
                        acc=accuracy.eval(feed_dict={x: pre_process(data_info[i]).reshape(1,-1), y_: [1,0], y_r:[0,1]})
                        local_local+=acc
                        local_remote+=1-acc
                        accur+=acc
                else:
                    sess.run(train_step,feed_dict={x:pre_process(data_info[i]).reshape(1,-1),y_:[0,1], y_r:[1,0]})
#                   print(sess.run(logits_2,{x: pre_process(data_info[i]).reshape(1,-1), y_: #[0,1]}))
                    if i>100:
                        acc=accuracy.eval(feed_dict={x: pre_process(data_info[i]).reshape(1,-1), y_: [0,1], y_r:[1,0]})
                        remote_remote+=acc
                        remote_local+=1-acc
                        accur+=acc

        print("correctness: (%.3d,%.3d): \t%.2f   %.2f   %.2f   %.2f   %.2f" % (rate[0],rate[1],accur/total,local_local/total,local_remote/total,remote_local/total,remote_remote/total))

Recommended answer

GPhilo's answer explains why the code gets slower and slower, but in practice that solution still creates the computation graph again and again, which is not good.

The following two lines of code (which GPhilo also mentioned) keep adding operations to your graph on every iteration.

loss=tf.reduce_sum(tf.nn.softmax_cross_entropy_with_logits( \
                    labels=y_, logits=logits_2)*[rate[0],rate[1]])
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
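A rough way to observe this growth is to count the operations in the graph after each pass. The sketch below is only an illustration: it assumes the TF 1.x graph API reached through `tensorflow.compat.v1`, and the tiny model, its shapes and the rate values are stand-ins, not the code from the question.

```python
import tensorflow.compat.v1 as tf  # assumption: TF 1.x-style graph API via compat.v1

graph = tf.Graph()
op_counts = []
with graph.as_default():
    # Tiny stand-in model; names and shapes are illustrative only.
    x = tf.placeholder(tf.float32, shape=[1, 3])
    y_ = tf.placeholder(tf.float32, shape=[1, 2])
    W = tf.Variable(tf.zeros([3, 2]))
    logits = tf.matmul(x, W)

    for r0, r1 in [(-90.0, -100.0), (-80.0, -100.0), (-80.0, -90.0)]:
        # Same pattern as in the question: a new loss and new optimizer ops
        # are appended to the same graph on every pass.
        loss = tf.reduce_sum(
            tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=logits)
            * [r0, r1])
        train_step = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
        op_counts.append(len(graph.get_operations()))

print(op_counts)  # the count keeps increasing from one pass to the next
```

Each pass leaves its loss, gradient and update operations behind in the default graph, so the graph only ever grows.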

As far as I can see, you have two values, rate[0] and rate[1], that need to be supplied to your graph. Why not supply these two values through placeholders and define your graph only once? Once you start running a Session, you shouldn't add any more operations to your graph. Also, you shouldn't create a new Session on every iteration.
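One way to enforce the "no new operations after the Session starts" rule is `tf.Graph.finalize()`, which makes the graph read-only. The snippet below is a minimal sketch under the same `tensorflow.compat.v1` assumption; the names `a`, `doubled` and `blocked` are made up for the illustration.

```python
import tensorflow.compat.v1 as tf  # assumption: TF 1.x-style graph API via compat.v1

graph = tf.Graph()
with graph.as_default():
    a = tf.placeholder(tf.float32, shape=[])
    doubled = a * 2.0
    graph.finalize()  # from here on the graph is read-only

    # Running existing ops on a finalized graph is still allowed.
    with tf.Session(graph=graph) as sess:
        result = sess.run(doubled, feed_dict={a: 3.0})

    try:
        tripled = a * 3.0  # would have to add a new op to the graph
        blocked = False
    except RuntimeError:
        blocked = True     # finalize() rejected the modification

print(result, blocked)
```

With the graph finalized, an accidental op creation inside a training loop fails immediately instead of silently slowing the program down.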

Check this modified code (only the important parts):

#  To clear previously created graph (if any) present in memory.
tf.reset_default_graph()   
x=tf.placeholder(tf.float32, shape=[1,n_input])
y_=tf.placeholder(tf.float32,shape=[n_output])
y_r=tf.placeholder(tf.float32,shape=[n_output])

# Add these two placeholders (Assuming they are single float value)
rate0 = tf.placeholder(tf.float32, shape = []) 
rate1 = tf.placeholder(tf.float32, shape = [])

W1=tf.Variable(tf.random_uniform([n_input, n_hidden]))
....
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

# Bring this code outside from loop (Note replacement of rate[0] with placeholder)
loss=tf.reduce_sum(tf.nn.softmax_cross_entropy_with_logits(labels=y_, \
            logits=logits_2) * [rate0, rate1])
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

# Instantiate session only once.
with tf.Session() as sess:
     sess.run(tf.global_variables_initializer())

     # Move the subsequent looping code inside.
     rate=[0,0]
     for i in range(-100,200,10):
        rate[0]=i;

After this modification, whenever your Session runs train_step, you need to supply these two extra placeholders in your feed_dict.

For example:

sess.run(train_step,feed_dict={x:pre_process(data_info[i]).reshape(1,-1),
         y_:[1,0],y_r:[0,1], rate0: rate[0], rate1: rate[1]})

This way you are not creating the graph on every iteration, and this code should in fact be faster than GPhilo's solution.
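Putting the pieces together, a self-contained sketch of the build-once pattern might look like the following. Several things here are assumptions rather than part of the original answer: the `tensorflow.compat.v1` import, the label placeholder shaped `[1, n_output]`, the random synthetic inputs standing in for `data.csv`, and the much smaller rate grid. One detail to watch: the question re-initialized the variables for every rate pair because it opened a new Session each time. With a single Session, running a pre-built initializer op per pair keeps that behavior without adding operations.

```python
import numpy as np
import tensorflow.compat.v1 as tf  # assumption: TF 1.x-style graph API via compat.v1

n_input, n_hidden, n_output = 37, 50, 2

graph = tf.Graph()
with graph.as_default():
    x = tf.placeholder(tf.float32, shape=[1, n_input])
    y_ = tf.placeholder(tf.float32, shape=[1, n_output])  # assumed shape
    rate0 = tf.placeholder(tf.float32, shape=[])
    rate1 = tf.placeholder(tf.float32, shape=[])

    W1 = tf.Variable(tf.random_uniform([n_input, n_hidden]))
    b1 = tf.Variable(tf.zeros([n_hidden]))
    W2 = tf.Variable(tf.zeros([n_hidden, n_output]))
    b2 = tf.Variable(tf.zeros([n_output]))
    logits_2 = tf.matmul(tf.nn.relu(tf.matmul(x, W1) + b1), W2) + b2

    # Loss and optimizer are defined exactly once; the rates are fed at run time.
    loss = tf.reduce_sum(
        tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=logits_2)
        * tf.stack([rate0, rate1]))
    train_step = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
    init = tf.global_variables_initializer()  # created once, run many times
    graph.finalize()  # guard against ops being added inside the loops

n_ops_before = len(graph.get_operations())
rng = np.random.RandomState(0)  # synthetic inputs, stand-in for the CSV data
losses = []
with tf.Session(graph=graph) as sess:
    for r0 in range(-100, 200, 100):      # reduced grid for illustration
        for r1 in range(-100, r0, 100):
            sess.run(init)                # fresh weights per rate pair
            for _ in range(5):
                sample = rng.rand(1, n_input).astype(np.float32)
                _, value = sess.run(
                    [train_step, loss],
                    feed_dict={x: sample, y_: [[1.0, 0.0]],
                               rate0: r0, rate1: r1})
            losses.append(value)
n_ops_after = len(graph.get_operations())

print(n_ops_before == n_ops_after, len(losses))
```

The operation count is the same before and after the loops, which is the property that keeps the per-step cost from growing.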
