Efficiently calculating weighted distance in MATLAB

Views: 2292 | Posted: 2016/6/2 21:56:14 | Tags: arrays performance matlab matrix distance


Problem description

Several posts exist about efficiently calculating pairwise distances in MATLAB (for example, http://stackoverflow.com/questions/23911670/efficiently-compute-pairwise-squared-euclidean-distance-in-matlab and http://stackoverflow.com/questions/25780633/calculating-euclidean-distance-of-pairs-of-3d-points-in-matlab). These posts tend to focus on quickly calculating Euclidean distance between large numbers of points.

I need to create a function that quickly calculates the pairwise distances between smaller numbers of points (typically fewer than 1000 pairs). Within the grander scheme of the program I am writing, this function will be executed many thousands of times, so even small gains in efficiency are important. The function needs to be flexible in two ways:

1. On any given call, the distance metric can be either Euclidean or city-block.

2. The dimensions of the data are weighted.
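One way to write down the quantity these two requirements describe is a weighted Minkowski-style distance. This is only a sketch, assuming the weights scale each dimension's contribution inside the sum; the symbols $w_k$ (weight of dimension $k$), $d$ (number of dimensions), and $r$ (metric exponent) are introduced here for illustration.

```latex
D_{ij} = \left( \sum_{k=1}^{d} w_k \,\lvert A_{ik} - B_{jk} \rvert^{r} \right)^{1/r},
\qquad r = 1 \ \text{(city-block)}, \quad r = 2 \ \text{(Euclidean)}
```

With $r = 1$ the outer root has no effect, so the city-block case reduces to a weighted sum of absolute differences.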

As far as I can tell, no solution to this particular problem has been posted. The Statistics Toolbox offers pdist and pdist2, which accept many different distance functions, but not weighting. I have seen extensions of these functions that allow for weighting, but these extensions do not allow users to select different distance functions.

Ideally, I would like to avoid using functions from the Statistics Toolbox (I am not certain that users of the function will have access to that toolbox).

I have written two functions to accomplish this task. The first uses tricky calls to repmat and permute, and the second simply uses for-loops.

function [D] = pairdist1(A, B, wts, distancemetric)

% get some information about the data
    numA = size(A,1);
    numB = size(B,1);

    if strcmp(distancemetric,'cityblock')
        r=1;
    elseif strcmp(distancemetric,'euclidean')
        r=2;
    else
        error('Function only accepts "cityblock" and "euclidean" distance')
    end

%   format weights for multiplication
    wts = repmat(wts,[numA,1,numB]);

%   get featural differences between A and B pairs
    A = repmat(A,[1 1 numB]);
    B = repmat(permute(B,[3,2,1]),[numA,1,1]);
    differences = abs(A-B).^r;

%   weigh difference values before combining them
    differences = differences.*wts;

%   combine features, then take the r-th root to get distance
    D = permute(sum(differences,2),[1,3,2]).^(1/r);
end

and

function [D] = pairdist2(A, B, wts, distancemetric)

% get some information about the data
    numA = size(A,1);
    numB = size(B,1);

    if strcmp(distancemetric,'cityblock')
        r=1;
    elseif strcmp(distancemetric,'euclidean')
        r=2;
    else
        error('Function only accepts "cityblock" and "euclidean" distance')
    end

%   use for-loops to generate differences
    D = zeros(numA,numB);
    for i=1:numA
        for j=1:numB
            % raise per-feature differences to the r-th power, then weight them
            differences = abs(A(i,:) - B(j,:)).^r;
            differences = differences.*wts;
            % sum over features, then take the r-th root
            D(i,j) = sum(differences,2).^(1/r);
        end
    end
end

Here is a performance test:

A = rand(10,3);
B = rand(80,3);
wts = [0.1 0.5 0.4];
distancemetric = 'cityblock';


tic
D1 = pairdist1(A,B,wts,distancemetric);
toc

tic
D2 = pairdist2(A,B,wts,distancemetric);
toc

Elapsed time is 0.000238 seconds.
Elapsed time is 0.005350 seconds.

It's clear that the repmat-and-permute version works much more quickly than the double-for-loop version, at least for smaller datasets. However, I also know that calls to repmat often slow things down. So I am wondering if anyone in the SO community has any advice on improving the efficiency of either function!
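Because each call here takes well under a millisecond, a single tic/toc reading can be dominated by warm-up and timer noise. Below is a minimal sketch of a steadier measurement, assuming it is acceptable to repeat the call many times and average. The names f, nReps, and tAvg are illustrative, and f is only a stand-in for whichever function is being timed.

```matlab
A = rand(10,3);
B = rand(80,3);
wts = [0.1 0.5 0.4];

% Stand-in for the call under test (weighted city-block distances);
% replace the body with pairdist1(...) or pairdist2(...) as needed.
f = @() permute(sum(bsxfun(@times, ...
        abs(bsxfun(@minus, A, permute(B,[3 2 1]))), wts), 2), [1 3 2]);

f();                      % warm-up call, not timed

nReps = 1000;             % illustrative repetition count
tic
for k = 1:nReps
    D = f();
end
tAvg = toc / nReps;       % average seconds per call

fprintf('Average time per call: %g seconds\n', tAvg);
```

MATLAB also provides timeit, which automates this kind of repeated measurement for a function handle.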

@Luis Mendo offered a nice cleanup of the repmat-and-permute function using bsxfun. I compared his function with my original on datasets of varying size:

As the data become larger, the bsxfun version becomes the clear winner!
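For readers who have not seen the pattern, here is a rough sketch of how the repmat calls can be replaced with bsxfun. It is not necessarily the code @Luis Mendo posted. It assumes the r-th root is taken after summing over features, and it reuses the input shapes from the performance test above.

```matlab
% Example inputs (same shapes as the performance test above)
A = rand(10,3);
B = rand(80,3);
wts = [0.1 0.5 0.4];
r = 2;   % 1 for city-block, 2 for Euclidean

% A is numA-by-d; permute makes B 1-by-d-by-numB, so bsxfun expands
% the subtraction to numA-by-d-by-numB without any repmat.
differences = abs(bsxfun(@minus, A, permute(B,[3 2 1]))).^r;

% wts is 1-by-d and is expanded along dimensions 1 and 3.
differences = bsxfun(@times, differences, wts);

% Sum over the feature dimension, then take the r-th root.
D = permute(sum(differences,2),[1 3 2]).^(1/r);
```

bsxfun performs the singleton expansion on the fly, so the replicated copies of A, B, and wts never need to be built explicitly.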

I have finished writing the function and it is available on github [link]. I ended up finding a pretty good vectorized method for computing Euclidean distance [link], so I use that method in the Euclidean case, and I took @Divakar's advice for city-block. It is still not as fast as pdist2, but it is much faster than either of the approaches I laid out earlier in this post, and it easily accepts weightings.
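The linked method and @Divakar's suggestion are not reproduced in this post, so the following is only a sketch of two commonly used vectorized formulations, not the code in the github function. It assumes non-negative weights. The Euclidean part uses the expansion ||a-b||^2 = ||a||^2 + ||b||^2 - 2*a*b', and the city-block part loops over the (few) dimensions. The variable names are illustrative.

```matlab
A = rand(10,3);
B = rand(80,3);
wts = [0.1 0.5 0.4];      % assumed non-negative

% Weighted Euclidean: scale each column by sqrt(wts), then expand
% ||a-b||^2 = ||a||^2 + ||b||^2 - 2*a*b' using one matrix product.
Aw = bsxfun(@times, A, sqrt(wts));
Bw = bsxfun(@times, B, sqrt(wts));
sqDist = bsxfun(@plus, sum(Aw.^2,2), sum(Bw.^2,2)') - 2*(Aw*Bw');
Deuc = sqrt(max(sqDist,0));   % clamp tiny negatives caused by round-off

% Weighted city-block: accumulate one numA-by-numB slice per dimension,
% which avoids building a numA-by-d-by-numB intermediate array.
Dcb = zeros(size(A,1), size(B,1));
for k = 1:size(A,2)
    Dcb = Dcb + wts(k) * abs(bsxfun(@minus, A(:,k), B(:,k)'));
end
```

The matrix-product form trades a little numerical accuracy for speed, which is why the squared distances are clamped at zero before the square root.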
