在Matlab中高效计算成对平方的欧几里得距离 [英] Efficiently compute pairwise squared Euclidean distance in Matlab

查看:296
本文介绍了在Matlab中高效计算成对平方的欧几里得距离的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

给出两组d维点.如何在Matlab中最有效地计算成对平方的欧几里德距离矩阵?

Given two sets of d-dimensional points. How can I most efficiently compute the pairwise squared euclidean distance matrix in Matlab?

符号: 第一组由(numA,d)-矩阵A给出,第二组由(numB,d)-矩阵B给出.所得距离矩阵的格式应为(numA,numB).

Notation: Set one is given by a (numA,d)-matrix A and set two is given by a (numB,d)-matrix B. The resulting distance matrix shall be of the format (numA,numB).

示例要点:

d = 4;            % dimension
numA = 100;       % number of set 1 points
numB = 200;       % number of set 2 points
A = rand(numA,d); % set 1 given as matrix A
B = rand(numB,d); % set 2 given as matrix B

推荐答案

此处通常给出的答案基于bsxfun(例如,

The usually given answer here is based on bsxfun (cf. e.g. [1]). My proposed approach is based on matrix multiplication and turns out to be much faster than any comparable algorithm I could find:

helpA = zeros(numA,3*d);
helpB = zeros(numB,3*d);
for idx = 1:d
    helpA(:,3*idx-2:3*idx) = [ones(numA,1), -2*A(:,idx), A(:,idx).^2 ];
    helpB(:,3*idx-2:3*idx) = [B(:,idx).^2 ,    B(:,idx), ones(numB,1)];
end
distMat = helpA * helpB';

请注意: 对于常数d,可以用硬编码实现替换for循环,例如

Please note: For constant d one can replace the for-loop by hardcoded implementations, e.g.

helpA(:,3*idx-2:3*idx) = [ones(numA,1), -2*A(:,1), A(:,1).^2, ... % d == 2
                          ones(numA,1), -2*A(:,2), A(:,2).^2 ];   % etc.

评估:

%% create some points
d = 2; % dimension
numA = 20000;
numB = 20000;
A = rand(numA,d);
B = rand(numB,d);

%% pairwise distance matrix
% proposed method:
tic;
helpA = zeros(numA,3*d);
helpB = zeros(numB,3*d);
for idx = 1:d
    helpA(:,3*idx-2:3*idx) = [ones(numA,1), -2*A(:,idx), A(:,idx).^2 ];
    helpB(:,3*idx-2:3*idx) = [B(:,idx).^2 ,    B(:,idx), ones(numB,1)];
end
distMat = helpA * helpB';
toc;

% compare to pdist2:
tic;
pdist2(A,B).^2;
toc;

% compare to [1]:
tic;
bsxfun(@plus,dot(A,A,2),dot(B,B,2)')-2*(A*B');
toc;

% Another method: added 07/2014
% compare to ndgrid method (cf. Dan's comment)
tic;
[idxA,idxB] = ndgrid(1:numA,1:numB);
distMat = zeros(numA,numB);
distMat(:) = sum((A(idxA,:) - B(idxB,:)).^2,2);
toc;

结果:

Elapsed time is 1.796201 seconds.
Elapsed time is 5.653246 seconds.
Elapsed time is 3.551636 seconds.
Elapsed time is 22.461185 seconds.

更详细的评估数据点的尺寸和数量遵循以下讨论(@comments).事实证明,应该在不同的环境中使用不同的算法.在非时间紧迫的情况下,只需使用pdist2版本.

For a more detailed evaluation w.r.t. dimension and number of data points follow the discussion below (@comments). It turns out that different algos should be preferred in different settings. In non time critical situations just use the pdist2 version.

进一步的发展: 可以想到基于同一原理,用任何其他度量替换平方的欧几里得数:

Further development: One can think of replacing the squared euclidean by any other metric based on the same principle:

help = zeros(numA,numB,d);
for idx = 1:d
    help(:,:,idx) = [ones(numA,1), A(:,idx)     ] * ...
                    [B(:,idx)'   ; -ones(1,numB)];
end
distMat = sum(ANYFUNCTION(help),3);

尽管如此,这还是很耗时的.用d二维矩阵替换较小的d 3维矩阵help可能很有用.特别是对于d = 1,它提供了一种通过简单的矩阵乘法计算成对差异的方法:

Nevertheless, this is quite time consuming. It could be useful to replace for smaller d the 3-dimensional matrix help by d 2-dimensional matrices. Especially for d = 1 it provides a method to compute the pairwise difference by a simple matrix multiplication:

pairDiffs = [ones(numA,1), A ] * [B'; -ones(1,numB)];

您还有其他想法吗?

这篇关于在Matlab中高效计算成对平方的欧几里得距离的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆