如何从一组N个对象中选择n个对象，从而最大化它们之间的成对距离之和 [英] How to select n objects from a set of N objects, maximizing the sum of pairwise distances between them

查看：112 发布时间：2020/4/30 12:18:06 r optimization linear-programming euclidean-distance

本文介绍了如何从一组N个对象中选择n个对象，从而最大化它们之间的成对距离之和的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

您有一组N = 400个对象，每个对象在例如19维的空间中都有自己的坐标.

You have a set of N=400 objects, each having its own coordinates in a, say, 19-dimensional space.

您计算(欧几里得)距离矩阵(所有成对距离).

You calculate the (Euclidean) distance matrix (all pairwise distances).

现在，您要选择n = 50个对象，以使所选对象之间的所有成对距离之和最大.

Now you want to select n=50 objects, such that the sum of all pairwise distances between the selected objects is maximal.

我设计了一种通过线性编程来解决此问题的方法(下面的代码为一个较小的示例)，但对我来说似乎效率不高，因为我使用的是N *(N-1)/2个二进制变量，对应于所有距离矩阵的非冗余元素，然后进行大量约束以确保解矢量的自洽性.

I devised a way to solve this by linear programming (code below, for a smaller example), but it seems inefficient to me, because I am using N*(N-1)/2 binary variables, corresponding to all the non-redundant elements of the distance matrix, and then a lot of constraints to ensure self-consistency of the solution vector.

我怀疑必须有一种更简单的方法，其中仅使用N个变量，但我不能立即想到一个.

I suspect there must be a simpler approach, where only N variables are used, but I can't immediately think of one.

这篇文章简要提到了一些"Bron–Kerbosch"算法，该算法显然解决了距离求和部分.
但是在该示例中，距离的总和是一个特定的数字，因此我看不到直接适用于我的情况.

This post briefly mentions some 'Bron–Kerbosch' algorithm, which apparently addresses the distance sum part.
But in that example the sum of distances is a specific number, so I don't see a direct application to my case.

我对二次编程进行了简要介绍，但是再说一次，我也看不到与我的案例的立即并行，尽管从理论上讲，'b％*％bT'矩阵(其中b是(列)二进制解矢量)用于相乘距离矩阵等；但是我真的不熟悉这种技术.

I had a brief look at quadratic programming, but again I could not see the immediate parallel with my case, although the 'b %*% bT' matrix, where b is the (column) binary solution vector, could in theory be used to multiply the distance matrix, etc.; but I'm really not familiar with this technique.

任何人都可以建议(/指向其他文章进行解释)是否以及如何通过仅使用N个二进制变量的线性编程来解决这种问题?
还是就如何更有效地解决问题提供其他建议?

Could anyone please advise (/point me to other posts explaining) if and how this kind of problem can be solved by linear programming using only N binary variables?
Or provide any other advice on how to tackle the problem more efficiently?

谢谢！

PS:这是我上面提到的代码.

PS: here's the code I referred to above.

require(Matrix)

#distmat defined manually for this example as a sparseMatrix
distmat <- sparseMatrix(i=c(rep(1,4),rep(2,3),rep(3,2),rep(4,1)),j=c(2:5,3:5,4:5,5:5),x=c(0.3,0.2,0.9,0.5,0.1,0.8,0.75,0.6,0.6,0.15))

N = 5
n = 3

distmat_summary <- summary(distmat)
distmat_summary["ID"] <- 1:NROW(distmat_summary)
i.mat <- xtabs(~i+ID,distmat_summary,sparse=T)
j.mat <- xtabs(~j+ID,distmat_summary,sparse=T)
ij.mat <- rbind(i.mat,"5"=rep(0,10))+rbind("1"=rep(0,10),j.mat)
ij.mat.rowSums <- rowSums(ij.mat)
ij.diag.mat <- .sparseDiagonal(n=length(ij.mat.rowSums),-ij.mat.rowSums)
colnames(ij.diag.mat) <- dimnames(ij.mat)[[1]]
mat <- rbind(cbind(ij.mat,ij.diag.mat),cbind(ij.mat,ij.diag.mat),c(rep(0,NCOL(ij.mat)),rep(1,NROW(ij.mat)) ))

dir <- c(rep("<=",NROW(ij.mat)),rep(">=",NROW(ij.mat)),"==")
rhs <- c(rep(0,NROW(ij.mat)),1-unname(ij.mat.rowSums),n)

obj <- xtabs(x~ID,distmat_summary)
obj <- c(obj,setNames(rep(0, NROW(ij.mat)), dimnames(ij.mat)[[1]]))

if (length(find.package(package="Rsymphony",quiet=TRUE))==0) install.packages("Rsymphony")
require(Rsymphony)
LP.sol <- Rsymphony_solve_LP(obj,mat,dir,rhs,types="B",max=TRUE)
items.sol <- (names(obj)[(1+NCOL(ij.mat)):(NCOL(ij.mat)+NROW(ij.mat))])[as.logical(LP.sol$solution[(1+NCOL(ij.mat)):(NCOL(ij.mat)+NROW(ij.mat))])]
items.sol
ID.sol <- names(obj)[1:NCOL(ij.mat)][as.logical(LP.sol$solution[1:NCOL(ij.mat)])]
as.data.frame(distmat_summary[distmat_summary$ID %in% ID.sol,])

如何从一组N个对象中选择n个对象，从而最大化它们之间的成对距离之和 [英] How to select n objects from a set of N objects, maximizing the sum of pairwise distances between them

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录关闭

如何从一组N个对象中选择n个对象，从而最大化它们之间的成对距离之和 [英] How to select n objects from a set of N objects, maximizing the sum of pairwise distances between them

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录 关闭

登录关闭