Compute pairwise distance in a batch without replicating tensor in Tensorflow?


Problem description

I want to compute the pairwise squared distance of a batch of features in Tensorflow. I have a simple implementation using + and * operations that tiles the original tensor:

import tensorflow as tf

def pairwise_l2_norm2(x, y, scope=None):
    # Note: tf.op_scope, tf.pack and tf.sub come from the older TF API
    # (later renamed to tf.name_scope, tf.stack and tf.subtract).
    with tf.op_scope([x, y], scope, 'pairwise_l2_norm2'):
        size_x = tf.shape(x)[0]  # m
        size_y = tf.shape(y)[0]  # n

        # Tile x from (m, d) to (m, d, n).
        xx = tf.expand_dims(x, -1)
        xx = tf.tile(xx, tf.pack([1, 1, size_y]))

        # Tile y from (n, d) to (n, d, m), then transpose to (m, d, n).
        yy = tf.expand_dims(y, -1)
        yy = tf.tile(yy, tf.pack([1, 1, size_x]))
        yy = tf.transpose(yy, perm=[2, 1, 0])

        diff = tf.sub(xx, yy)
        square_diff = tf.square(diff)

        # Sum over the feature dimension -> (m, n) squared distances.
        square_dist = tf.reduce_sum(square_diff, 1)

        return square_dist
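
For illustration only, a minimal usage sketch (not from the original post), assuming the same older TF API and made-up shapes m = 128, n = 256, d = 64:

import numpy as np

# Hypothetical example data; shapes are illustrative only.
x_batch = np.random.rand(128, 64).astype('float32')  # (m, d)
y_batch = np.random.rand(256, 64).astype('float32')  # (n, d)

x_ph = tf.placeholder(tf.float32, [None, 64])
y_ph = tf.placeholder(tf.float32, [None, 64])
dist = pairwise_l2_norm2(x_ph, y_ph)  # (128, 256)

with tf.Session() as sess:
    result = sess.run(dist, feed_dict={x_ph: x_batch, y_ph: y_batch})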

This function takes as input two matrices of size (m, d) and (n, d) and computes the squared distance between each pair of row vectors. The output is a matrix of size (m, n) with elements d_ij = dist(x_i, y_j).

The problem is that with a large batch and high-dimensional features (large m, n, d), replicating the tensor consumes a lot of memory. I'm looking for another way to implement this without increasing memory usage, storing only the final distance tensor, as if double-looping over the original tensor.
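
For concreteness (illustrative numbers, not from the original post): with m = n = 1000 and d = 128 in float32, each tiled (m, d, n) tensor holds 1000 × 128 × 1000 ≈ 1.28 × 10^8 values, roughly 512 MB, while the final (m, n) distance matrix needs only about 4 MB.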

Answer

You can use some linear algebra to turn this into matrix ops. Note that what you need is the matrix D where, with a[i] the ith row of your original matrix,

D[i,j] = (a[i]-a[j])(a[i]-a[j])'

Expanding the product gives a[i]a[i]' - 2 a[i]a[j]' + a[j]a[j]', so you can rewrite this as

D[i,j] = r[i] - 2 a[i]a[j]' + r[j]

where r[i] is the squared norm of the ith row of the original matrix.

In a system that supports standard broadcasting rules, you can treat r as a column vector and write D as

D = r - 2 A A' + r'

In TensorFlow you could write this as

A = tf.constant([[1, 1], [2, 2], [3, 3]])
r = tf.reduce_sum(A*A, 1)

# turn r into column vector
r = tf.reshape(r, [-1, 1])
# broadcasting: (3, 1) - (3, 3) + (1, 3) -> (3, 3)
D = r - 2*tf.matmul(A, tf.transpose(A)) + tf.transpose(r)
sess = tf.Session()
sess.run(D)

which gives the result

array([[0, 2, 8],
       [2, 0, 2],
       [8, 2, 0]], dtype=int32)
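
The answer above computes self-distances within a single matrix A. As a sketch (not part of the original answer), the same identity extends to the question's two-matrix case with X of shape (m, d) and Y of shape (n, d), using D[i, j] = r_x[i] - 2 x_i·y_j + r_y[j]:

# Sketch only: pairwise squared distances between rows of two different
# matrices X (m, d) and Y (n, d); the constants here are illustrative.
X = tf.constant([[1., 1.], [2., 2.], [3., 3.]])
Y = tf.constant([[0., 0.], [1., 2.]])

rx = tf.reshape(tf.reduce_sum(X*X, 1), [-1, 1])   # (m, 1) column of squared norms
ry = tf.reshape(tf.reduce_sum(Y*Y, 1), [1, -1])   # (1, n) row of squared norms

# Broadcasts to an (m, n) matrix of squared distances.
D_xy = rx - 2*tf.matmul(X, tf.transpose(Y)) + ry

sess = tf.Session()
sess.run(D_xy)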
