Variation on "How to plot decision boundary of a k-nearest neighbor classifier from Elements of Statistical Learning?"
Problem description
This is a question related to https://stats.stackexchange.com/questions/21572/how-to-plot-decision-boundary-of-a-k-nearest-neighbor-classifier-from-elements-o

For completeness, the original example from that link is reproduced below. I've been playing with that example and would like to try to make it work with three classes. I can change some values of g with something like g[8:16] <- 2, just to pretend that some samples come from a third class. I can't make the plot work, though. I guess I need to change the lines that deal with the proportion of votes for the winning class, and also the levels on the contour; both snippets appear after the example below. I am also not sure contour is the right tool for this. One alternative that works is to create a matrix of data covering the region I'm interested in, classify each point of that matrix, and plot those points with a large marker and different colors, similar to what is done with the points(gd...) bit. The final purpose is to be able to show the different decision boundaries generated by different classifiers. Can someone point me in the right direction? Thanks, Rafael

Answer: Separating the main parts of the code will help outline how to achieve this: test data with 3 classes, test data covering a grid, classification for that grid (3 classes, obviously), a data structure for plotting, and the plot itself. We can also be a little fancier and plot the probability of class membership as an indication of "confidence". The full code follows the original example below.
## Original example from the linked question (ESL mixture data, 15-NN):
library(ElemStatLearn)
require(class)
x    <- mixture.example$x
g    <- mixture.example$y
xnew <- mixture.example$xnew
mod15 <- knn(x, xnew, g, k = 15, prob = TRUE)
## "prob" is the vote share of the winning class; map it to P(class == 1)
prob <- attr(mod15, "prob")
prob <- ifelse(mod15 == "1", prob, 1 - prob)
px1 <- mixture.example$px1
px2 <- mixture.example$px2
prob15 <- matrix(prob, length(px1), length(px2))
par(mar = rep(2, 4))
contour(px1, px2, prob15, levels = 0.5, labels = "", xlab = "", ylab = "",
        main = "15-nearest neighbour", axes = FALSE)
points(x, col = ifelse(g == 1, "coral", "cornflowerblue"))
gd <- expand.grid(x = px1, y = px2)
points(gd, pch = ".", cex = 1.2,
       col = ifelse(prob15 > 0.5, "coral", "cornflowerblue"))
box()
## The question's snippets. Pretend that samples 8:16 come from a third class:
g[8:16] <- 2
## Lines that deal with the proportion of votes for the winning class:
prob <- attr(mod15, "prob")
prob <- ifelse(mod15 == "1", prob, 1 - prob)
## Levels on the contour:
contour(px1, px2, prob15, levels = 0.5, labels = "", xlab = "", ylab = "",
        main = "15-nearest neighbour", axes = FALSE)
## The answer's code. Test data with 3 classes
## (iris, first two features, 25 samples per species):
train <- rbind(iris3[1:25, 1:2, 1],
               iris3[1:25, 1:2, 2],
               iris3[1:25, 1:2, 3])
cl <- factor(c(rep("s", 25), rep("c", 25), rep("v", 25)))
## Test data covering a grid over the training region (1-unit margin):
test <- expand.grid(x = seq(min(train[, 1]) - 1, max(train[, 1]) + 1, by = 0.1),
                    y = seq(min(train[, 2]) - 1, max(train[, 2]) + 1, by = 0.1))
## Classification for that grid (3 classes, k = 3):
require(class)
classif <- knn(train, test, cl, k = 3, prob = TRUE)
prob <- attr(classif, "prob")
## Data structure for plotting: one copy of the grid per class, with an
## indicator prob_cls that is 1 wherever that class wins (contoured below):
require(dplyr)
dataf <- bind_rows(mutate(test, prob = prob, cls = "c",
                          prob_cls = ifelse(classif == cls, 1, 0)),
                   mutate(test, prob = prob, cls = "v",
                          prob_cls = ifelse(classif == cls, 1, 0)),
                   mutate(test, prob = prob, cls = "s",
                          prob_cls = ifelse(classif == cls, 1, 0)))
## Plot: grid points coloured by predicted class, one boundary contour
## per class, and the training points on top:
require(ggplot2)
ggplot(dataf) +
  geom_point(aes(x = x, y = y, col = cls),
             data = mutate(test, cls = classif),
             size = 1.2) +
  geom_contour(aes(x = x, y = y, z = prob_cls, group = cls, color = cls),
               bins = 2, data = dataf) +
  geom_point(aes(x = x, y = y, col = cls), size = 3,
             data = data.frame(x = train[, 1], y = train[, 2], cls = cl))
## A fancier version: point size shows the winning-class vote share,
## i.e. the "confidence" of each grid classification:
ggplot(dataf) +
  geom_point(aes(x = x, y = y, col = cls, size = prob),
             data = mutate(test, cls = classif)) +
  scale_size(range = c(0.8, 2)) +
  geom_contour(aes(x = x, y = y, z = prob_cls, group = cls, color = cls),
               bins = 2, data = dataf) +
  geom_point(aes(x = x, y = y, col = cls), size = 3,
             data = data.frame(x = train[, 1], y = train[, 2], cls = cl)) +
  geom_point(aes(x = x, y = y), size = 3, shape = 1,
             data = data.frame(x = train[, 1], y = train[, 2], cls = cl))
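The grid-colouring alternative mentioned in the question also works for any number of classes and any classifier, with no contours at all: classify every grid point and let the colour changes trace the boundaries. A minimal base-graphics sketch using the same iris-based training data (the colour choices here are arbitrary, not from the thread):

```r
library(class)

## Same 3-class iris training data (first two features, 25 per species).
train <- rbind(iris3[1:25, 1:2, 1], iris3[1:25, 1:2, 2], iris3[1:25, 1:2, 3])
cl <- factor(c(rep("s", 25), rep("c", 25), rep("v", 25)))

## Dense grid over the region of interest, with a 1-unit margin.
grid <- expand.grid(x = seq(min(train[, 1]) - 1, max(train[, 1]) + 1, by = 0.1),
                    y = seq(min(train[, 2]) - 1, max(train[, 2]) + 1, by = 0.1))

## Classify every grid point; adjacent colour changes ARE the boundaries.
pred <- knn(train, grid, cl, k = 3)
cols <- c(s = "coral", c = "cornflowerblue", v = "darkolivegreen3")
plot(grid, pch = ".", cex = 1.2, col = cols[as.character(pred)])
points(train, pch = 19, col = cols[as.character(cl)])
```

This avoids the contour-interpolation question entirely, at the cost of drawing a filled region instead of a crisp boundary line.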