Siamese Neural Network in TensorFlow
Problem description
I'm trying to implement a Siamese Neural Network in TensorFlow, but I can't really find any working example on the Internet (see the Yann LeCun paper).
The architecture I'm trying to build would consist of two LSTMs sharing weights and only connected at the end of the network.
My question is: how to build two different neural networks sharing their weights (tied weights) in TensorFlow and how to connect them at the end?
Thanks :)
Edit: I implemented a simple and working example of a siamese network here on MNIST.
Update with tf.layers
If you use the tf.layers module to build your network, you can simply pass the argument reuse=True for the second part of the Siamese network:
x = tf.ones((1, 3))
y1 = tf.layers.dense(x, 4, name='h1')
y2 = tf.layers.dense(x, 4, name='h1', reuse=True)
# y1 and y2 will evaluate to the same values
sess = tf.Session()
sess.run(tf.global_variables_initializer())
print(sess.run(y1))
print(sess.run(y2)) # both prints will return the same values
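The weight tying that reuse=True gives you can also be sketched without TensorFlow at all: a Siamese branch is just one set of parameters applied to two different inputs. Here is a minimal NumPy sketch of that idea (the shapes and variable names are illustrative, not from the original answer):

```python
import numpy as np

rng = np.random.default_rng(0)

# One shared parameter set, used by BOTH branches -- this is what
# variable sharing / reuse=True gives you in TensorFlow.
W = rng.standard_normal((3, 4))
b = np.zeros(4)

def branch(x):
    # A single "dense" branch; every call goes through the same W and b.
    return x @ W + b

x1 = rng.standard_normal((1, 3))
x2 = rng.standard_normal((1, 3))

y1 = branch(x1)           # first branch of the Siamese pair
y2 = branch(x2)           # second branch, same weights

# Feeding the same input through either branch gives the same output,
# just like y1 and y2 in the tf.layers example when x is shared.
assert np.allclose(branch(x1), y1)
```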
Old answer with tf.get_variable
You can try using the function tf.get_variable(). (See the tutorial.)
Implement the first network using a variable scope with reuse=False:
with tf.variable_scope('Inference', reuse=False):
    weights_1 = tf.get_variable('weights', shape=[1, 1],
                                initializer=...)
    output_1 = weights_1 * input_1
Then implement the second one with the same code, except using reuse=True:
with tf.variable_scope('Inference', reuse=True):
    weights_2 = tf.get_variable('weights')
    output_2 = weights_2 * input_2
The first implementation will create and initialize every variable of the LSTM, whereas the second implementation will use tf.get_variable() to get the same variables used in the first network. That way, the variables will be shared.
Then you just have to use whatever loss you want (e.g. the L2 distance between the outputs of the two siamese branches), and the gradients will backpropagate through both networks, updating the shared variables with the sum of the gradients.
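To make the "sum of the gradients" point concrete, here is a small NumPy sketch (all names illustrative) of an L2 loss between two linear branches with a tied weight matrix. The gradient with respect to the shared weights is exactly the sum of the contributions flowing back through each branch, which is what the optimizer applies when variables are shared:

```python
import numpy as np

rng = np.random.default_rng(0)

x1 = rng.standard_normal((1, 3))
x2 = rng.standard_normal((1, 3))
W = rng.standard_normal((3, 4))   # the tied weight matrix

def loss(W):
    # Both branches use the SAME W (tied weights).
    y1, y2 = x1 @ W, x2 @ W
    d = y1 - y2
    return 0.5 * np.sum(d ** 2)   # squared L2 distance between outputs

# Per-branch gradient contributions w.r.t. the shared W.
d = x1 @ W - x2 @ W
grad_branch1 = x1.T @ d           # backprop through branch 1 (dL/dy1 = d)
grad_branch2 = x2.T @ (-d)        # backprop through branch 2 (dL/dy2 = -d)
grad_shared = grad_branch1 + grad_branch2   # sum applied to the shared W

# Sanity check against a numerical gradient.
eps = 1e-6
num = np.zeros_like(W)
for i in range(W.shape[0]):
    for j in range(W.shape[1]):
        Wp = W.copy(); Wp[i, j] += eps
        Wm = W.copy(); Wm[i, j] -= eps
        num[i, j] = (loss(Wp) - loss(Wm)) / (2 * eps)

assert np.allclose(grad_shared, num, atol=1e-4)
```

In a real TensorFlow graph the optimizer performs this summation automatically, because both branches reference the same underlying variable.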