R - add centroids to scatter plot


Problem description

I have a dataset with two continuous variables and one factor variable (two classes). I want to create a scatterplot in R with two centroids (one for each class), including error bars. The centroids should be positioned at the mean values of x and y for each class.

I can easily create the scatter plot using ggplot2, but I can't figure out how to add the centroids. Is it possible to do this using ggplot / qplot?

Here is some example code:

library(ggplot2)   # provides qplot() and ggplot()
x <- c(1,2,3,4,5,2,3,5)
y <- c(10,11,14,5,7,9,8,5)
class <- c(1,1,1,0,0,1,0,0)
df <- data.frame(class, x, y)
qplot(x,y, data=df, color=as.factor(class))

Solution

Is this what you had in mind?

centroids <- aggregate(cbind(x,y)~class,df,mean)
ggplot(df,aes(x,y,color=factor(class))) +
  geom_point(size=3)+ geom_point(data=centroids,size=5)

This creates a separate data frame, centroids, with columns x, y, and class, where x and y are the mean values by class. Then we add a second point geometry layer using centroids as the dataset.
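
To see what that second layer is plotting, it can help to print the centroids data frame. The following is a minimal sketch in base R that reuses the example data from the question; it needs no plotting package.

```r
# Example data from the question
x <- c(1, 2, 3, 4, 5, 2, 3, 5)
y <- c(10, 11, 14, 5, 7, 9, 8, 5)
class <- c(1, 1, 1, 0, 0, 1, 0, 0)
df <- data.frame(class, x, y)

# One row per class, holding the mean of x and the mean of y
centroids <- aggregate(cbind(x, y) ~ class, df, mean)
print(centroids)
#   class    x     y
# 1     0 4.25  6.25
# 2     1 2.00 11.00
```

Because centroids has the same column names (x, y, class) as df, the second geom_point() layer can inherit the aesthetics declared in ggplot().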

This is a slightly more interesting version, useful in cluster analysis.

gg <- merge(df,aggregate(cbind(mean.x=x,mean.y=y)~class,df,mean),by="class")
ggplot(gg, aes(x,y,color=factor(class)))+geom_point(size=3)+
  geom_point(aes(x=mean.x,y=mean.y),size=5)+
  geom_segment(aes(x=mean.x, y=mean.y, xend=x, yend=y))
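
The segments work because, after the merge, every observation carries the coordinates of its own class centroid. A small base R sketch of that structure, using the example data from the question (the intermediate name means is introduced here only for readability):

```r
# Example data from the question
x <- c(1, 2, 3, 4, 5, 2, 3, 5)
y <- c(10, 11, 14, 5, 7, 9, 8, 5)
class <- c(1, 1, 1, 0, 0, 1, 0, 0)
df <- data.frame(class, x, y)

# Per-class means, with the columns renamed to mean.x and mean.y
means <- aggregate(cbind(mean.x = x, mean.y = y) ~ class, df, mean)

# Attach the class centroid to every observation
gg <- merge(df, means, by = "class")
str(gg)   # 8 observations of class, x, y, mean.x, mean.y

# Each class maps to exactly one (mean.x, mean.y) pair
unique(gg[, c("class", "mean.x", "mean.y")])
```

Each row then supplies both ends of one segment: (mean.x, mean.y) as the start and (x, y) as the end.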

EDIT: Response to OP's comment.

Vertical and horizontal error bars can be added using geom_errorbar(...) and geom_errorbarh(...).

centroids <- aggregate(cbind(x,y)~class,df,mean)
f         <- function(z)sd(z)/sqrt(length(z)) # function to calculate std.err
se        <- aggregate(cbind(se.x=x,se.y=y)~class,df,f)
centroids <- merge(centroids,se, by="class")    # add std.err column to centroids
ggplot(df, aes(x,y,color=factor(class)))+
  geom_point(size=3)+
  geom_point(data=centroids, size=5)+
  geom_errorbar(data=centroids,aes(ymin=y-se.y,ymax=y+se.y),width=0.1)+
  geom_errorbarh(data=centroids,aes(xmin=x-se.x,xmax=x+se.x),height=0.1)
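
To check the bar limits before plotting, the same quantities can be computed explicitly. This is a sketch in base R using the example data from the question; the names std_err and limits are illustrative and not part of the answer above.

```r
# Example data from the question
x <- c(1, 2, 3, 4, 5, 2, 3, 5)
y <- c(10, 11, 14, 5, 7, 9, 8, 5)
class <- c(1, 1, 1, 0, 0, 1, 0, 0)
df <- data.frame(class, x, y)

# Class means and standard errors, merged into one data frame
centroids <- aggregate(cbind(x, y) ~ class, df, mean)
std_err <- function(z) sd(z) / sqrt(length(z))   # hypothetical helper name
se <- aggregate(cbind(se.x = x, se.y = y) ~ class, df, std_err)
centroids <- merge(centroids, se, by = "class")

# The limits that the aes() calls in the error bar layers evaluate to
limits <- transform(centroids,
                    xmin = x - se.x, xmax = x + se.x,
                    ymin = y - se.y, ymax = y + se.y)
print(limits)
```

Each row of limits describes one centroid together with the end points of its horizontal and vertical bars.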

If you want to plot, say, 95% confidence intervals instead of standard errors, replace

f <- function(z)sd(z)/sqrt(length(z)) # function to calculate std.err

with

f <- function(z) qt(0.025,df=length(z)-1, lower.tail=FALSE)* sd(z)/sqrt(length(z)) # half-width of the 95% confidence interval
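
As a rough illustration of how the two choices differ, the sketch below evaluates both functions on the x values of class 0 from the example data. The names std_err and ci95 are made up for this comparison; qt(0.975, df) is the same quantile as qt(0.025, df, lower.tail=FALSE).

```r
# x values of class 0 in the example data
x0 <- c(4, 5, 3, 5)

# Standard error of the mean
std_err <- function(z) sd(z) / sqrt(length(z))

# Half-width of the 95% confidence interval for the mean (t distribution)
ci95 <- function(z) qt(0.975, df = length(z) - 1) * std_err(z)

std_err(x0)
ci95(x0)

# The ratio is the t quantile for 3 degrees of freedom, about 3.18
ci95(x0) / std_err(x0)
```

With only four points per class, the confidence interval bars come out roughly three times as long as the standard error bars.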
