K Nearest-Neighbor Algorithm


Question


Maybe I'm rather stupid, but I just can't find a satisfying answer: I'm using the KNN algorithm with, say, k=5, and I try to classify an unknown object by getting its 5 nearest neighbors. What should I do if the 4 nearest neighbors have different distances, and 2 or more objects share the same distance and are the next nearest after those 4? Which of these 2 or more objects should be chosen as the 5th nearest neighbor?

Thanks in advance :)

Answer

Which object of these 2 or more should be chosen as the 5th nearest neighbor?

It really depends on how you want to implement it.

Most algorithms will do one of three things:

  1. Include all equidistant points, so for this estimate they'll use 6 points, not 5.
  2. Use the "first" point found among the two equidistant ones.
  3. Pick a random point from the 2 points found (usually with a consistent seed, so results are reproducible).
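
The three options above can be sketched roughly as follows. This is only an illustration, not how any particular library does it: the function name `k_nearest`, the `tie` and `seed` parameters, and the toy data are all made up for this example. It uses squared Euclidean distance (same ordering as the true distance) so that ties on integer coordinates compare exactly.

```python
import random

def sq_dist(p, q):
    # Squared Euclidean distance; preserves the ordering of true distances.
    return sum((a - b) ** 2 for a, b in zip(p, q))

def k_nearest(points, query, k, tie="include", seed=0):
    """Return (point, label) pairs for the k nearest neighbors of `query`.

    tie="include": keep every point tied at the k-th distance (may return > k)
    tie="first":   keep tied points in the order they were encountered
    tie="random":  choose among tied points with a seeded RNG (reproducible)
    """
    # sorted() is stable, so equidistant points keep their input order.
    ranked = sorted(((sq_dist(p, query), p, label) for p, label in points),
                    key=lambda t: t[0])
    if len(ranked) <= k:
        return [(p, label) for _, p, label in ranked]

    cutoff = ranked[k - 1][0]                      # distance of the k-th neighbor
    closer = [r for r in ranked if r[0] < cutoff]  # strictly nearer than the tie
    tied = [r for r in ranked if r[0] == cutoff]   # all points at the k-th distance
    needed = k - len(closer)

    if tie == "include":
        chosen = closer + tied
    elif tie == "first":
        chosen = closer + tied[:needed]
    elif tie == "random":
        chosen = closer + random.Random(seed).sample(tied, needed)
    else:
        raise ValueError(f"unknown tie strategy: {tie!r}")
    return [(p, label) for _, p, label in chosen]

# Hypothetical data: four distinct distances, then two points tied for 5th place.
data = [((1, 0), "a"), ((0, 2), "a"), ((3, 0), "b"), ((0, 4), "b"),
        ((5, 0), "a"), ((0, 5), "b"), ((9, 9), "b")]

for strategy in ("include", "first", "random"):
    print(strategy, k_nearest(data, (0, 0), k=5, tie=strategy))
```

With real-valued features, exact equality of distances is rare, so a tolerance-based comparison may be more appropriate than `==`.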

That being said, most algorithms based on radial searching have an inherent assumption of stationarity, in which case it really shouldn't matter which of the options above you choose. In general, any of them should, theoretically, provide reasonable defaults (especially since the tied points are the furthest points in the approximation, and should have the lowest effective weightings).
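
To make the point about weightings concrete, here is a small sketch that assumes inverse-distance weights. The answer doesn't name a specific weighting scheme, so that choice, the `weighted_vote` helper, and the toy data are all assumptions for illustration.

```python
import math
from collections import defaultdict

def weighted_vote(neighbors, query, eps=1e-9):
    """Classify by summing 1/distance per label over (point, label) neighbors."""
    scores = defaultdict(float)
    for point, label in neighbors:
        # eps avoids division by zero if a neighbor coincides with the query
        scores[label] += 1.0 / (math.dist(point, query) + eps)
    winner = max(scores, key=scores.get)
    return winner, dict(scores)

# Hypothetical data: four nearest neighbors, plus two candidates tied for 5th.
nearest_four = [((1, 0), "a"), ((0, 2), "a"), ((3, 0), "b"), ((0, 4), "b")]
tied_a = ((5, 0), "a")
tied_b = ((0, 5), "b")

# Whichever tied point is picked, it is the farthest neighbor and so
# contributes the smallest weight to the vote.
print(weighted_vote(nearest_four + [tied_a], (0, 0)))
print(weighted_vote(nearest_four + [tied_b], (0, 0)))
```

In this toy data the weighted result is the same for either choice of 5th neighbor, whereas a plain unweighted majority vote over the same five points would flip between the two labels.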
