The point that minimizes the sum of Euclidean distances to a set of n points

Problem description

I have a set of points W={(x1, y1), (x2, y2),..., (xn, yn)} on the 2D plane. Can you find an algorithm that takes these points as the input and returns a point (x, y) on the 2D plane which has the minimum sum of distances from the points in W? In other words, if

di = Euclidean_distance((x, y), (xi, yi))

I want to minimize:

d1 + d2 + ... + dn
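
As a minimal sketch of the quantity being minimized, the function below evaluates d1 + d2 + ... + dn for a candidate point. The name `sum_of_distances` and the sample set `W` are illustrative choices, not part of the question, and only the standard library is used.

```python
import math

def sum_of_distances(point, points):
    """Return d1 + d2 + ... + dn for a candidate point (x, y)."""
    return sum(math.dist(point, p) for p in points)

# Hypothetical sample input: three points on the 2D plane.
W = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]

# Evaluated at (0, 0) the distances are 0, 4 and 3.
print(sum_of_distances((0.0, 0.0), W))  # 7.0
```

Any algorithm that answers the question has to return the point for which this function is smallest.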

Solution

The Problem

You're looking for the geometric median.

An Easy Solution

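There is no closed-form solution to this problem, so iterative or probabilistic methods are used. The easiest way to find this is probably with Weiszfeld's algorithm, which repeatedly replaces the current estimate c_k with an average of the data points p_i weighted by the inverse of their distance to c_k:

$$c_{k+1} = \left( \sum_{i=1}^{n} \frac{p_i}{\lVert p_i - c_k \rVert} \right) \Big/ \left( \sum_{i=1}^{n} \frac{1}{\lVert p_i - c_k \rVert} \right)$$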

We can implement this in Python as follows:

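import numpy as np
from numpy.linalg import norm as npnorm

#Generate random test data
POINT_NUM = 100
pts       = np.random.rand(POINT_NUM,2)

c_pt_old = np.random.rand(2)      #Dummy previous estimate so the loop runs at least once
c_pt_new = np.array([0.0,0.0])    #Starting estimate

#Iterate until the estimate moves by less than the tolerance
while npnorm(c_pt_old-c_pt_new)>1e-6:
    num   = 0
    denom = 0
    for i in range(POINT_NUM):
        dist   = npnorm(c_pt_new-pts[i,:])   #Zero if the estimate sits exactly on a data point
        num   += pts[i,:]/dist
        denom += 1/dist
    c_pt_old = c_pt_new
    c_pt_new = num/denom

print(c_pt_new)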

Weiszfeld's algorithm can fail if an estimate lands exactly on one of the data points, because the distance in the denominator becomes zero, so it might be best to run it several times from different starting points.
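
One way to act on that advice is sketched below, using only the standard library. The helper names (`weiszfeld`, `total_distance`, `best_of_restarts`), the tolerance, the iteration cap, and the choice to stop when an estimate hits a data point are all assumptions made for illustration, not part of the original answer.

```python
import math
import random

def weiszfeld(points, start, tol=1e-9, max_iter=10_000):
    """Run Weiszfeld's iteration from `start` and return the final estimate."""
    cx, cy = start
    for _ in range(max_iter):
        num_x = num_y = denom = 0.0
        for px, py in points:
            dist = math.hypot(cx - px, cy - py)
            if dist == 0.0:
                # The estimate sits exactly on a data point, so the update is
                # undefined. Stop here and let other restarts compete with it.
                return (cx, cy)
            num_x += px / dist
            num_y += py / dist
            denom += 1.0 / dist
        nx, ny = num_x / denom, num_y / denom
        if math.hypot(nx - cx, ny - cy) < tol:
            return (nx, ny)
        cx, cy = nx, ny
    return (cx, cy)

def total_distance(c, points):
    """Sum of Euclidean distances from c to every point."""
    return sum(math.hypot(c[0] - px, c[1] - py) for px, py in points)

def best_of_restarts(points, restarts=5, seed=0):
    """Run Weiszfeld from several random starts and keep the best result."""
    rng = random.Random(seed)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    candidates = []
    for _ in range(restarts):
        start = (rng.uniform(min(xs), max(xs)), rng.uniform(min(ys), max(ys)))
        candidates.append(weiszfeld(points, start))
    return min(candidates, key=lambda c: total_distance(c, points))

# Corners of the unit square: the geometric median is the center.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(best_of_restarts(square))  # close to (0.5, 0.5)
```

Keeping the candidate with the smallest sum of distances means a run that stalled on a data point is simply outvoted by the others.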

A General Solution

You can also find this using second-order cone programming (SOCP). In addition to solving your specific problem, this general formulation then allows you to easily add constraints and weightings, such as variable uncertainty in the location of each data point.

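To do so, you create one auxiliary variable per data point, each constrained to be at least the distance between the proposed center point and that data point.

You then minimize the sum of these auxiliary variables; at the optimum each one equals its distance, so the objective is the sum of distances. The resulting program is: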

import cvxpy as cp
import numpy as np
import matplotlib.pyplot as plt

#Generate random test data
POINT_NUM = 100
pts       = np.random.rand(POINT_NUM,2)

c_pt      = cp.Variable(2)           #The center point we wish to locate
distances = cp.Variable(POINT_NUM)   #Distance from the center point to each data point

#Generate constraints. These are used to hold distances.
constraints = []                     
for i in range(POINT_NUM):
    constraints.append( cp.norm(c_pt-pts[i,:])<=distances[i] ) 

objective = cp.Minimize(cp.sum(distances))

problem = cp.Problem(objective,constraints)

optimal_value = problem.solve()

print("Optimal value = {0}".format(optimal_value))
print("Optimal location = {0}".format(c_pt.value))

plt.scatter(x=pts[:,0], y=pts[:,1], s=1)
plt.scatter(c_pt.value[0], c_pt.value[1], s=10)
plt.show()
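
Written out, the program that the code builds appears to be the following, where c stands for `c_pt`, p_i for row i of `pts`, and t_i for `distances[i]`. The weights w_i are an assumption added here to show where a weighting would enter; the code above corresponds to w_i = 1 for every i.

```latex
\begin{aligned}
\min_{c,\; t} \quad & \sum_{i=1}^{n} w_i \, t_i \\
\text{subject to} \quad & \lVert c - p_i \rVert_2 \le t_i, \qquad i = 1, \dots, n
\end{aligned}
```

With nonnegative weights, each t_i is pushed down onto its constraint at the optimum, so the objective equals the weighted sum of distances.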

SOCPs can be solved by a number of solvers, including CPLEX, Elemental, ECOS, ECOS_BB, GUROBI, MOSEK, CVXOPT, and SCS.

I've tested and the two approaches give the same answers to within tolerance.

Weiszfeld, E. (1937). "Sur le point pour lequel la somme des distances de n points donnés est minimum". Tohoku Mathematical Journal. 43: 355–386.
