TensorFlow apply_gradients remotely


Problem description

I'm trying to split the minimize step across two machines. On one machine I call compute_gradients; on the other I call apply_gradients with gradients that were sent over the network. The problem is that calling apply_gradients(...).run(feed_dict) doesn't seem to work no matter what I do. I've tried inserting placeholders in place of the tensor gradients for apply_gradients:

  variables = [W_conv1, b_conv1, W_conv2, b_conv2, W_fc1, b_fc1, W_fc2, b_fc2]
  loss = -tf.reduce_sum(y_ * tf.log(y_conv))
  optimizer = tf.train.AdamOptimizer(1e-4)
  correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
  accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
  compute_gradients = optimizer.compute_gradients(loss, variables)

  # Build (placeholder, variable) pairs so gradients can be fed at run time.
  placeholder_gradients = []
  for grad_var in compute_gradients:
      placeholder_gradients.append(
          (tf.placeholder('float', shape=grad_var[1].get_shape()), grad_var[1]))
  apply_gradients = optimizer.apply_gradients(placeholder_gradients)

Then, later, when I receive the gradients, I call:

  feed_dict = {}
  for i, grad_var in enumerate(compute_gradients):
        feed_dict[placeholder_gradients[i][0]] = tf.convert_to_tensor(gradients[i])
  apply_gradients.run(feed_dict=feed_dict)

However, when I do this, I get:

ValueError: setting an array element with a sequence.
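For context, this particular message comes from NumPy: every value in a feed_dict is passed through NumPy's array conversion, and a tf.Tensor is a symbolic handle rather than data, so the conversion fails. A minimal NumPy-only sketch of the same failure (using ragged shapes as the trigger, since it produces the identical message):

```python
import numpy as np

# Feed values must be dense, NumPy-convertible data.  Passing objects that
# NumPy cannot pack into a rectangular numeric array raises the same error:
try:
    np.array([np.zeros(2), np.zeros(3)], dtype=np.float32)  # ragged shapes
except ValueError as e:
    print(e)  # message begins "setting an array element with a sequence"
```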

This is only the latest thing I've tried; I've also tried the same approach without placeholders, as well as waiting to create the apply_gradients operation until the gradients are received, which results in non-matching graph errors.

Any help on which direction I should go with this?

Recommended answer

Assuming that each gradients[i] is a NumPy array that you've fetched using some out-of-band mechanism, the fix is simply to remove the tf.convert_to_tensor() invocation when building feed_dict:

  feed_dict = {}
  for i, grad_var in enumerate(compute_gradients):
      feed_dict[placeholder_gradients[i][0]] = gradients[i]
  apply_gradients.run(feed_dict=feed_dict)

Each value in a feed_dict should be a NumPy array (or some other object that is trivially convertible to a NumPy array). In particular, a tf.Tensor is not a valid value for a feed_dict.
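A minimal sketch of the out-of-band transfer itself, assuming the sender fetches the gradients as NumPy arrays with a session run and the receiver deserializes them back into plain arrays before feeding. The pickle round-trip stands in for whatever network transport is actually used, and the TensorFlow session calls are shown only as comments:

```python
import pickle
import numpy as np

# Sender side (hypothetical fetch, shown as a comment):
#   gradient_values = sess.run([g for g, v in compute_gradients], feed_dict=batch)
gradient_values = [np.random.rand(5, 5).astype(np.float32),
                   np.random.rand(5).astype(np.float32)]  # stand-in for fetched gradients

payload = pickle.dumps(gradient_values)  # bytes, safe to send over a socket

# Receiver side: deserialize back into plain NumPy arrays.
gradients = pickle.loads(payload)
assert all(isinstance(g, np.ndarray) for g in gradients)  # valid feed_dict values

# Receiver then feeds them exactly as in the answer above:
#   feed_dict = {placeholder_gradients[i][0]: gradients[i]
#                for i in range(len(gradients))}
#   apply_gradients.run(feed_dict=feed_dict)
```

The key point is that what crosses the wire (and what lands in feed_dict) is raw array data, never a tf.Tensor.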
