Is the triangle inequality necessary for k-means?


Question

I wonder whether the distance measure used in k-means needs to satisfy the triangle inequality.

Recommended answer

k-means is designed for Euclidean distance, which happens to satisfy the triangle inequality.

Using other distance functions is risky, because the algorithm may stop converging. The reason, however, is not the triangle inequality: it is that the mean might not minimize the chosen distance function. (The arithmetic mean minimizes the sum of squared deviations, not arbitrary distances!)
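To make the point about the mean concrete, here is a minimal sketch on one-dimensional data. The data values and the helper names `sum_squared` and `sum_absolute` are made up for illustration. It compares the mean and the median as candidate cluster centers under two objectives: squared deviations (what k-means optimizes) and absolute deviations (Manhattan distance in one dimension).

```python
from statistics import mean, median

def sum_squared(points, c):
    """Sum of squared distances from each 1-D point to the center c."""
    return sum((p - c) ** 2 for p in points)

def sum_absolute(points, c):
    """Sum of absolute (Manhattan) distances from each 1-D point to c."""
    return sum(abs(p - c) for p in points)

# Hypothetical data with one outlier, chosen so mean and median differ a lot.
points = [0.0, 1.0, 2.0, 3.0, 100.0]
m, med = mean(points), median(points)

# Under the squared-error objective the mean is the better center ...
print(sum_squared(points, m) < sum_squared(points, med))    # True
# ... but under absolute distance the median beats the mean.
print(sum_absolute(points, med) < sum_absolute(points, m))  # True
```

So if the assignment step used absolute distance while the update step still took the mean, the update could make the objective worse, which is exactly the risk described above.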

There are faster k-means methods that exploit the triangle inequality to avoid recomputing distances. But if you stick to the classic MacQueen or Lloyd k-means, you do not need the triangle inequality.
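As a rough illustration of how such accelerations work (a simplified sketch, not any particular published algorithm in full), the triangle inequality gives d(x, c_j) >= d(c_best, c_j) - d(x, c_best). Hence, if d(c_best, c_j) >= 2 * d(x, c_best), center c_j cannot be closer to x than the current best, and its distance need not be computed. The function names below are hypothetical, and `math.dist` requires Python 3.8 or later.

```python
import math

def assign_with_pruning(x, centers):
    """Return (index of nearest center, number of skipped distance computations)."""
    # Center-to-center distances; in a real implementation these are computed
    # once per iteration and shared across all points.
    cc = [[math.dist(a, b) for b in centers] for a in centers]
    best, best_d, skipped = 0, math.dist(x, centers[0]), 0
    for j in range(1, len(centers)):
        # Triangle inequality: d(x, c_j) >= d(c_best, c_j) - d(x, c_best).
        # If d(c_best, c_j) >= 2 * d(x, c_best), then d(x, c_j) >= d(x, c_best).
        if cc[best][j] >= 2 * best_d:
            skipped += 1
            continue
        d = math.dist(x, centers[j])
        if d < best_d:
            best, best_d = j, d
    return best, skipped

def assign_brute_force(x, centers):
    """Reference assignment that computes every distance."""
    return min(range(len(centers)), key=lambda j: math.dist(x, centers[j]))

# Made-up example: three well-separated centers.
centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
print(assign_with_pruning((1.0, 1.0), centers))  # (0, 2): both other centers skipped
```

The pruning is only valid because Euclidean distance obeys the triangle inequality; with a distance that violates it, the skipped centers could in fact be closer.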

Just be careful when using other distance functions, so that you do not run into an infinite loop. You need to prove that the mean minimizes the sum of your distances from the points of a cluster to its center. If you cannot prove this, the algorithm may fail to converge, because the objective function no longer decreases monotonically. So you really should try to prove convergence for your distance function!
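A proof is the real safeguard, but an empirical check can catch obvious problems early. The following is a minimal sketch of a Lloyd-style loop. The function names and the `dist` and `center_fn` parameters are assumptions for illustration. It records the objective after every update step, raises if the objective ever increases, and caps the number of iterations so that a non-converging combination cannot loop forever.

```python
def sq_euclidean(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_center(members):
    """Coordinate-wise arithmetic mean of a non-empty list of points."""
    n = len(members)
    return tuple(sum(coords) / n for coords in zip(*members))

def lloyd(points, centers, dist, center_fn, max_iter=100):
    """Lloyd-style loop; returns (centers, objective value after each iteration)."""
    history = []
    for _ in range(max_iter):  # hard cap guards against an infinite loop
        # Assignment step: each point goes to its nearest center under `dist`.
        labels = [min(range(len(centers)), key=lambda j: dist(p, centers[j]))
                  for p in points]
        # Update step: recompute each center; keep the old one if a cluster is empty.
        new_centers = []
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            new_centers.append(center_fn(members) if members else centers[j])
        objective = sum(dist(p, new_centers[lab]) for p, lab in zip(points, labels))
        # If `center_fn` does not minimize `dist`, this check may fire.
        if history and objective > history[-1] + 1e-9:
            raise RuntimeError("objective increased: center_fn does not fit dist")
        history.append(objective)
        if new_centers == centers:
            break
        centers = new_centers
    return centers, history

# Made-up data: two small groups of points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
final_centers, history = lloyd(points, [(0, 0), (1, 0)], sq_euclidean, mean_center)
print(history)  # non-increasing for squared Euclidean distance with the mean
```

To test another distance, pass it as `dist` together with the center update you intend to use. A run without errors is not a proof of convergence, but a `RuntimeError` shows that the pair does not decrease the objective monotonically.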
