How to plot KNN cluster boundaries in R


Question

I am using the iris data for k-nearest neighbours. I have replaced the species type with numerical values in the data, i.e.

setosa = 1
versicolor = 2
virginica = 3 

Now I am dividing my data into training and testing sets, and training the model on the basis of the Species column.

# Clustering
WNew <- iris


# Knn Clustering Technique

library(class)
library(gmodels)
WNew[is.na(WNew)] <- 0
WSmallSet<-WNew[1:100,]
WTestSet<-WNew[100:150,] # testing set
WLabel<-c(WNew[1:100,5]) # training set
wTestLabel<-c(WNew[100:150,5])
kWset1 <- knnSet <- knn(WSmallSet,WTestSet,WLabel,k=3)
CTab<-CrossTable(x = wTestLabel, y = kWset1,prop.chisq=FALSE)

Now I want to plot the boundaries of these 3 clusters, but I don't know how to do this. Can anyone help me with this?

Recommended answer

I'll try to answer this as best I can. The example below works when you'd like to visualize the clusters using a 2D scatter plot. You could extend this to 3D as well, but for multidimensional data sets, maybe use pairwise scatter plots?
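
For the pairwise scatter plot idea, a minimal sketch using base R's `pairs()` could look like the following. The color mapping (`species_col`) is an illustrative choice of mine, not part of the original answer.

```r
# Map each species to an integer color code (1, 2, 3); illustrative name
species_col <- as.integer(iris$Species)

# Scatter-plot matrix of the four measurements, one panel per pair of features
pairs(iris[, 1:4], col = species_col, pch = 19)
```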

Note that I didn't use your code as is, but I am still using the iris data set. I did this to avoid hard-coding row indices.

Hope this helps.

library(class)  # provides knn()
library(plyr)
library(ggplot2)
set.seed(123)

# Create training and testing data sets
idx = sample(1:nrow(iris), size = 100)
train.idx = 1:nrow(iris) %in% idx
test.idx =  ! 1:nrow(iris) %in% idx

train = iris[train.idx, 1:4]
test = iris[test.idx, 1:4]

# Get labels
labels = iris[train.idx, 5]

# Do knn
fit = knn(train, test, labels)
fit

# Create a dataframe to simplify charting
plot.df = data.frame(test, predicted = fit)

# Use ggplot
# 2-D plots example only
# Sepal.Length vs Sepal.Width

# First use Convex hull to determine boundary points of each cluster
plot.df1 = data.frame(x = plot.df$Sepal.Length, 
                      y = plot.df$Sepal.Width, 
                      predicted = plot.df$predicted)

find_hull = function(df) df[chull(df$x, df$y), ]
boundary = ddply(plot.df1, .variables = "predicted", .fun = find_hull)

ggplot(plot.df, aes(Sepal.Length, Sepal.Width, color = predicted, fill = predicted)) + 
  geom_point(size = 5) + 
  geom_polygon(data = boundary, aes(x,y), alpha = 0.5)
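
The convex hulls above enclose the predicted test points rather than the regions KNN would assign to each class. As a rough alternative sketch, you can classify every point on a fine grid and shade the result. This assumes the model is fit on only two features (`Sepal.Length` and `Sepal.Width`) so the regions can be drawn in 2D. The choice `k = 3`, the grid resolution and the names `feats`, `grid` and `p` are illustrative, not part of the code above.

```r
library(class)    # knn()
library(ggplot2)
set.seed(123)

# Assumption: use two features only, so the decision regions are 2-D
feats  <- c("Sepal.Length", "Sepal.Width")
idx    <- sample(nrow(iris), size = 100)
train  <- iris[idx, feats]
labels <- iris$Species[idx]

# Regular grid covering the observed range of both features
grid <- expand.grid(
  Sepal.Length = seq(min(iris$Sepal.Length), max(iris$Sepal.Length), length.out = 150),
  Sepal.Width  = seq(min(iris$Sepal.Width),  max(iris$Sepal.Width),  length.out = 150)
)

# Classify every grid point; k = 3 is an arbitrary choice for illustration
grid$predicted <- knn(train, grid[, feats], labels, k = 3)

# Shaded regions show the predicted class; points are the held-out rows
p <- ggplot() +
  geom_raster(data = grid,
              aes(Sepal.Length, Sepal.Width, fill = predicted), alpha = 0.3) +
  geom_point(data = iris[-idx, ],
             aes(Sepal.Length, Sepal.Width, color = Species), size = 2)
print(p)
```

The edges between the shaded regions approximate the decision boundaries for this two-feature model. A model trained on all four features has boundaries in 4D that a single 2D plot cannot show directly.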
