Why does k=1 in KNN give the best accuracy?


Question

I am using Weka IBk for text classification. Each document is basically a short sentence. The training dataset contains 15,000 documents. While testing, I can see that k=1 gives the best accuracy. How can this be explained?

Recommended answer

If you are querying your learner with the same dataset it was trained on, then with k=1 the output values should be perfect, unless your data contains points with the same parameters but different outcome values. Do some reading on overfitting as it applies to KNN learners.

When you query with the same dataset you trained on, each query arrives with some given parameter values. Because that exact point already exists in the learner's training data, the learner matches it as the nearest neighbor (at distance zero) and outputs whatever Y value that training point had, which in this case is the Y value of the point you queried with.
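The effect described above can be reproduced with a minimal sketch in plain Python. This is not Weka's IBk; the toy 2-D points, the labels, and the helper names `nearest_neighbor_predict` and `accuracy` are all made up for illustration, and Euclidean distance is an assumption. It only shows that a 1-nearest-neighbor learner scores perfectly on its own training data, does worse on points it has not seen, and loses the perfect training score once two identical points carry different labels.

```python
import math

def nearest_neighbor_predict(train_X, train_y, query):
    """Return the label of the single closest training point (k=1)."""
    best_index = min(range(len(train_X)),
                     key=lambda i: math.dist(train_X[i], query))
    return train_y[best_index]

def accuracy(train_X, train_y, test_X, test_y):
    """Fraction of test points whose 1-NN prediction matches the true label."""
    correct = sum(nearest_neighbor_predict(train_X, train_y, x) == y
                  for x, y in zip(test_X, test_y))
    return correct / len(test_X)

# Hypothetical toy data: five labeled training points.
train_X = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1), (0.5, 0.5)]
train_y = ["a", "a", "b", "b", "a"]

# Hypothetical held-out points that are NOT in the training set.
held_out_X = [(0.2, 0.1), (0.8, 0.9), (0.6, 0.6), (0.45, 0.55)]
held_out_y = ["a", "b", "b", "b"]

# Querying with the training set: every point is its own nearest neighbor.
train_acc = accuracy(train_X, train_y, train_X, train_y)

# Querying with unseen points: the score is no longer guaranteed.
held_out_acc = accuracy(train_X, train_y, held_out_X, held_out_y)

# Same parameters, different outcome: a duplicate of (0.5, 0.5) labeled "b".
conflict_X = train_X + [(0.5, 0.5)]
conflict_y = train_y + ["b"]
conflict_acc = accuracy(conflict_X, conflict_y, conflict_X, conflict_y)

print(train_acc, held_out_acc, conflict_acc)
```

If k=1 looks best only when the test data overlaps the training data, evaluating on a separate held-out set or with cross-validation gives a fairer comparison between values of k.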
