如何找出几何平均 [英] How to find out Geometric Median

查看:167
本文介绍了如何找出几何平均的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

现在的问题是:

给定N点(在2D)与x和y坐标,找到一个点P(在N-   定的点),使得距离的总和与其它(N- 1)指向   P是最小

Given N points(in 2D) with x and y coordinates, find a point P (in N given points) such that the sum of distances from other(N-1) points to P is minimum.

这点通常被称为几何平均。有没有什么有效的算法来解决这个问题,不是天真的 0其他(N ^ 2)吗?

This point is commonly known as Geometric Median. Is there any efficient algorithm to solve this problem, other than the naive O(N^2) one?

推荐答案

我解决了类似的东西了本地联机法官使用模拟退火一次。这是官方的解决方案,以及和程序得到了交流。

I solved something similar for a local online judge once using simulated annealing. That was the official solution as well and the program got AC.

唯一不同的是,点我必须找到没有成为 N 给出点的一部分。

The only difference was that the point I had to find did not have to be part of the N given points.

这是我的C ++ code和 N 可能会大到 50000 。在2GHz的奔腾4 0.1S 的程序执行。

This was my C++ code, and N could be as large as 50000. The program executes in 0.1s on a 2ghz pentium 4.

// header files for IO functions and math
#include <cstdio>
#include <cmath>

// the maximul value n can take
const int maxn = 50001;

// given a point (x, y) on a grid, we can find its left/right/up/down neighbors
// by using these constants: (x + dx[0], y + dy[0]) = upper neighbor etc.
const int dx[] = {-1, 0, 1, 0};
const int dy[] = {0, 1, 0, -1};

// controls the precision - this should give you an answer accurate to 3 decimals
const double eps = 0.001;

// input and output files
FILE *in = fopen("adapost2.in","r"), *out = fopen("adapost2.out","w");

// stores a point in 2d space
struct punct
{
    double x, y;
};

// how many points are in the input file
int n;

// stores the points in the input file
punct a[maxn];

// stores the answer to the question
double x, y;

// finds the sum of (euclidean) distances from each input point to (x, y)
double dist(double x, double y)
{
    double ret = 0;

    for ( int i = 1; i <= n; ++i )
    {
        double dx = a[i].x - x;
        double dy = a[i].y - y;

        ret += sqrt(dx*dx + dy*dy); // classical distance formula
    }

    return ret;
}

// reads the input
void read()
{
    fscanf(in, "%d", &n); // read n from the first 

    // read n points next, one on each line
    for ( int i = 1; i <= n; ++i )
        fscanf(in, "%lf %lf", &a[i].x, &a[i].y), // reads a point
        x += a[i].x,
        y += a[i].y; // we add the x and y at first, because we will start by approximating the answer as the center of gravity

    // divide by the number of points (n) to get the center of gravity
    x /= n; 
    y /= n;
}

// implements the solving algorithm
void go()
{
    // start by finding the sum of distances to the center of gravity
    double d = dist(x, y);

    // our step value, chosen by experimentation
    double step = 100.0;

    // done is used to keep track of updates: if none of the neighbors of the current
    // point that are *step* steps away improve the solution, then *step* is too big
    // and we need to look closer to the current point, so we must half *step*.
    int done = 0;

    // while we still need a more precise answer
    while ( step > eps )
    {
        done = 0;
        for ( int i = 0; i < 4; ++i )
        {
            // check the neighbors in all 4 directions.
            double nx = (double)x + step*dx[i];
            double ny = (double)y + step*dy[i];

            // find the sum of distances to each neighbor
            double t = dist(nx, ny);

            // if a neighbor offers a better sum of distances
            if ( t < d )
            {
                update the current minimum
                d = t;
                x = nx;
                y = ny;

                // an improvement has been made, so
                // don't half step in the next iteration, because we might need
                // to jump the same amount again
                done = 1;
                break;
            }
        }

        // half the step size, because no update has been made, so we might have
        // jumped too much, and now we need to head back some.
        if ( !done )
            step /= 2;
    }
}

int main()
{
    read();
    go();

    // print the answer with 4 decimal points
    fprintf(out, "%.4lf %.4lf\n", x, y);

    return 0;
}

那么我认为这是正确挑选一个从你的列表中最接近(X,Y)该算法返回。

此算法利用了什么样的几何平均此维基段说:

This algorithm takes advantage of what this wikipedia paragraph on the geometric median says:

但是,它是直接计算的近似的   利用迭代过程的几何平均,其中,每个步骤   产生更准确的近似。这种类型的程序,可以   从这一事实推导出的距离的采样点的总和   是一个凸函数,因为对每一个样本点的距离为   凸凸函数的和仍凸。因此,   该减少距离的总和在每一步程序不能得到   被困在一个局部最优。

However, it is straightforward to calculate an approximation to the geometric median using an iterative procedure in which each step produces a more accurate approximation. Procedures of this type can be derived from the fact that the sum of distances to the sample points is a convex function, since the distance to each sample point is convex and the sum of convex functions remains convex. Therefore, procedures that decrease the sum of distances at each step cannot get trapped in a local optimum.

一种常见这种类型的方法,被称为   恩德雷Weiszfeld的工作后Weiszfeld的算法,[4]是一种形式   迭代地重新加权最小二乘。该算法定义了一组   权重是成反比从距离   目前估计的样本,并创建一个新的估计是   样品根据这些权重的加权平均值。那   是,

One common approach of this type, called Weiszfeld's algorithm after the work of Endre Weiszfeld,[4] is a form of iteratively re-weighted least squares. This algorithm defines a set of weights that are inversely proportional to the distances from the current estimate to the samples, and creates a new estimate that is the weighted average of the samples according to these weights. That is,

上面的第一段解释了为什么这个工程:因为我们试图优化没有任何局部极小,所以你可以贪婪地找到最小通过反复改进它的功能

The first paragraph above explains why this works: because the function we are trying to optimize does not have any local minimums, so you can greedily find the minimum by iteratively improving it.

把这看作一种​​二进制搜索。首先,近似的结果。一个好的近似将是重心,当读取输入,我的code计算。然后,你看看相邻的点这个给你一个更好的解决方案。在这种情况下,一个点被认为是相邻的,如果它为远离你的当前点的距离。如果是更好的,那么它是罚款,放弃目前的点,因为,正如我所说,这不会有陷阱你进入当地最低由于工作的性质,你正在试图尽量减少。

Think of this as a sort of binary search. First, you approximate the result. A good approximation will be the center of gravity, which my code computes when reading the input. Then, you see if adjacent points to this give you a better solution. In this case, a point is considered adjacent if it as a distance of step away from your current point. If it is better, then it is fine to discard your current point, because, as I said, this will not trap you into a local minimum because of the nature of the function you are trying to minimize.

在此,你一半的步长,就像在二进制搜索,并继续下去,直到你有你认为什么是一个足够好的近似(由 EPS控制常数)。

After this, you half the step size, just like in binary search, and continue until you have what you consider to be a good enough approximation (controlled by the eps constant).

该算法的复杂度因此取决于你想要如何精确的结果是。

The complexity of the algorithm therefore depends on how accurate you want the result to be.

这篇关于如何找出几何平均的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆