TensorFlow apply_gradients remotely
Problem description
I'm trying to split the minimize step across two machines. On one machine I call "compute_gradients"; on the other I call "apply_gradients" with gradients that were sent over the network. The problem is that calling apply_gradients(...).run(feed_dict) doesn't seem to work no matter what I do. I've tried inserting placeholders in place of the tensor gradients for apply_gradients:
variables = [W_conv1, b_conv1, W_conv2, b_conv2, W_fc1, b_fc1, W_fc2, b_fc2]
loss = -tf.reduce_sum(y_ * tf.log(y_conv))
optimizer = tf.train.AdamOptimizer(1e-4)
correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
compute_gradients = optimizer.compute_gradients(loss, variables)
placeholder_gradients = []
for grad_var in compute_gradients:
    placeholder_gradients.append(
        (tf.placeholder('float', shape=grad_var[1].get_shape()), grad_var[1]))
apply_gradients = optimizer.apply_gradients(placeholder_gradients)
Then, later, when I receive the gradients, I call:
feed_dict = {}
for i, grad_var in enumerate(compute_gradients):
    feed_dict[placeholder_gradients[i][0]] = tf.convert_to_tensor(gradients[i])
apply_gradients.run(feed_dict=feed_dict)
However, when I do this, I get:

ValueError: setting an array element with a sequence.
This is only the latest thing I've tried. I've also tried the same approach without placeholders, as well as waiting to create the apply_gradients operation until I receive the gradients, which results in non-matching graph errors.

Any help on which direction I should go with this?
Recommended answer
Assuming that each gradients[i] is a NumPy array that you've fetched using some out-of-band mechanism, the fix is simply to remove the tf.convert_to_tensor() invocation when building feed_dict:
feed_dict = {}
for i, grad_var in enumerate(compute_gradients):
    feed_dict[placeholder_gradients[i][0]] = gradients[i]
apply_gradients.run(feed_dict=feed_dict)
Each value in a feed_dict should be a NumPy array (or some other object that is trivially convertible to a NumPy array). In particular, a tf.Tensor is not a valid value for a feed_dict.
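As a concrete illustration of the "out-of-band mechanism", here is a minimal sketch of how the gradient values could travel between the two machines. The choice of pickle as the wire format and the example array shapes are assumptions for illustration; sess.run() on the sender returns plain NumPy arrays, which serialize cleanly and can be fed directly on the receiver:

```python
import pickle
import numpy as np

# Stand-ins for the values sess.run(compute_gradients, ...) would return
# on the sender machine (shapes are hypothetical):
gradients = [np.ones((5, 5, 1, 32), dtype=np.float32),
             np.zeros((32,), dtype=np.float32)]

# Sender side: serialize before writing to the socket.
payload = pickle.dumps(gradients)

# Receiver side: deserialize, then feed the raw NumPy arrays directly --
# no tf.convert_to_tensor() needed, since feed_dict expects array-likes.
received = pickle.loads(payload)
assert all(isinstance(g, np.ndarray) for g in received)
```

The receiver would then build feed_dict from `received` exactly as in the corrected snippet above.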