Can we use kNN and k-means at the same time?


Question

I get a dataset of neighbours using kNN, and then I want to apply k-means to that dataset. Could this give me a more accurate result? Is it logically sound to use kNN and then k-means, or vice versa?

Recommended answer

They are different machine learning techniques and it is a long story. I will try my best to be concise.

KNN is used to:
1. Classify a new data point into a known group (category); or
2. Predict a target value for a new data point.
It works by comparing the features of the new data point with those of a set of historical data points whose categories or target values are known. The "K" refers to the number of historical data points that match it most closely. The final outcome is determined by the majority category among those K neighbours in the case of classification, or by a simple average of their target values in the case of prediction.
So, for KNN you need historical data with known targets, which is why it is called supervised machine learning.
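
To make the majority vote and the simple average concrete, here is a minimal sketch in plain Python. The function name `knn_predict`, the `task` parameter, the toy data and the choice of Euclidean distance as the similarity measure are all assumptions made for illustration, not part of the original answer.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3, task="classify"):
    """Predict for `query` from its k nearest training points (Euclidean distance)."""
    # Rank the historical points by how close they are to the new point.
    ranked = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], query))
    targets = [y for _, y in ranked[:k]]
    if task == "classify":
        # Classification: majority category among the k neighbours.
        return Counter(targets).most_common(1)[0][0]
    # Prediction of a numeric target: simple average over the k neighbours.
    return sum(targets) / len(targets)

# Hypothetical toy data: two groups of points with known labels and known values.
X = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (10.0, 10.0), (11.0, 11.0), (12.0, 12.0)]
labels = ["a", "a", "a", "b", "b", "b"]
values = [1.0, 2.0, 3.0, 20.0, 21.0, 22.0]

print(knn_predict(X, labels, (1.5, 1.5), k=3))                    # a
print(knn_predict(X, values, (10.5, 10.5), k=3, task="regress"))  # 21.0
```

Note that nothing is "trained" here: the known targets in `labels` and `values` are consulted directly at prediction time, which is what makes this a supervised method.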

K-means, on the other hand, is a clustering algorithm. It groups data points into K partitions (or clusters). It starts by selecting K data points at random as the initial centers of the K clusters, then assigns each data point to the cluster whose center it is most similar to (usually the nearest one by Euclidean distance). Once all the data points have been assigned, each cluster recomputes its center as the mean of its members. Then the assignment step is repeated with the new centers. This goes on until there are no significant changes in the clusters. Once the K clusters have settled, a new data point finds its place in one of them by matching it to the nearest cluster center.
So, K-means works on historical data with no known outcomes, which is why it is called unsupervised machine learning.
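
The loop described above can be sketched in a few lines of plain Python. This is only an illustration under some assumptions: the names `kmeans` and `assign` are made up here, similarity is taken to be Euclidean distance, each center is updated to the mean of its cluster's members (the standard k-means update), and the loop stops when the centers no longer move at all.

```python
import math
import random

def assign(point, centers):
    """Index of the center nearest to `point`."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))

def kmeans(points, k, max_iter=100, seed=0):
    """Cluster `points` (a list of tuples) into k groups; returns the final centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # k random data points as initial centers
    for _ in range(max_iter):
        # Assignment step: every point joins the cluster of its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[assign(p, centers)].append(p)
        # Update step: each center becomes the mean of its members
        # (an empty cluster keeps its previous center).
        new_centers = [
            tuple(sum(coord) / len(members) for coord in zip(*members)) if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if new_centers == centers:  # nothing changed, so the clusters have settled
            break
        centers = new_centers
    return centers

# Hypothetical toy data: no labels or target values are needed.
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (10.0, 10.0), (11.0, 11.0), (12.0, 12.0)]
centers = kmeans(points, k=2)
print(sorted(centers))
print(assign((9.0, 9.5), centers))  # index of the cluster a new point falls into
```

The order of the returned centers depends on the random initialization, so cluster indices carry no meaning beyond "same index, same cluster".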

Combining Them:
You may run K-means first to find which cluster a new data point belongs to, and then apply KNN using only the data points in that cluster. But you would need a very large dataset, since each cluster has to contain enough points for KNN to work with. Whether this produces better results depends on many factors: the nature of the problem, the quality and quantity of the data, and the values of K. You just have to explore and experiment. Good luck.
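
One possible way to wire the two together is sketched below in plain Python. It is an illustration rather than a recommended recipe: the helper names (`nearest_center`, `kmeans_centers`, `cluster_then_knn`), the toy data, the Euclidean distance and the fallback to the full dataset when a cluster turns out empty are all assumptions introduced here.

```python
import math
import random
from collections import Counter

def nearest_center(point, centers):
    """Index of the center nearest to `point`."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))

def kmeans_centers(points, k, max_iter=100, seed=0):
    """Plain k-means: returns k cluster centers for `points` (a list of tuples)."""
    centers = random.Random(seed).sample(points, k)
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest_center(p, centers)].append(p)
        new_centers = [
            tuple(sum(c) / len(m) for c in zip(*m)) if m else centers[i]
            for i, m in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers

def cluster_then_knn(train_X, train_y, query, n_clusters=2, k=3, seed=0):
    """Find the query's cluster with k-means, then run a kNN vote inside that cluster only."""
    centers = kmeans_centers(train_X, n_clusters, seed=seed)
    target = nearest_center(query, centers)
    labeled = list(zip(train_X, train_y))
    # Keep only the labeled points that fall in the same cluster as the query.
    members = [(x, y) for x, y in labeled if nearest_center(x, centers) == target]
    members = members or labeled  # assumption: fall back to all data if the cluster is empty
    members.sort(key=lambda pair: math.dist(pair[0], query))
    votes = [y for _, y in members[:k]]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical toy data with known categories.
X = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (10.0, 10.0), (11.0, 11.0), (12.0, 12.0)]
labels = ["a", "a", "a", "b", "b", "b"]
print(cluster_then_knn(X, labels, (1.5, 1.5)))  # a
```

Restricting the vote to one cluster reduces how many distances KNN has to compute, but it also means a query near a cluster boundary never sees neighbours on the other side, which is one reason the result is not guaranteed to be more accurate.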

Some reading that may help: "Combination of K-Nearest Neighbor and K-Means based on Term Re-weighting for Classify Indonesian News"

