Adding text annotation to a clustering scatter plot (tSNE)
Problem description
I have XY data (a 2D tSNE embedding of high-dimensional data) which I'd like to show as a scatter plot. The data are assigned to several clusters, so I'd like to color-code the points by cluster and then add a single label for each cluster that has the same color as the cluster and is located outside (as much as possible) the cluster's points.
Any idea how to do this in R, using either ggplot2 with ggrepel, or plotly?
Here's the example data (the XY coordinates and cluster assignments are in df and the labels are in label.df) and the ggplot2 part of it:
library(dplyr)
library(ggplot2)
set.seed(1)
df <- do.call(rbind,lapply(seq(1,20,4),function(i) data.frame(x=rnorm(50,mean=i,sd=1),y=rnorm(50,mean=i,sd=1),cluster=i)))
df$cluster <- factor(df$cluster)
label.df <- data.frame(cluster=levels(df$cluster),label=paste0("cluster: ",levels(df$cluster)))
ggplot(df,aes(x=x,y=y,color=cluster))+geom_point()+theme_minimal()+theme(legend.position="none")
Recommended answer
The geom_label_repel() function in the ggrepel package lets you add labels to a plot while trying to "repel" them so that they do not overlap each other or other elements. This needs only a small addition to your existing code: summarize the data to get the coordinates where each label should go (here I chose the upper-left region of each cluster, i.e. the min of x and the max of y) and join the result with your existing data frame containing the cluster labels. Pass this data frame to geom_label_repel() through its data argument, and map the column holding the label text to the label aesthetic in aes().
library(dplyr)
library(ggplot2)
library(ggrepel)
set.seed(1)
df <- do.call(rbind,lapply(seq(1,20,4),function(i) data.frame(x=rnorm(50,mean=i,sd=1),y=rnorm(50,mean=i,sd=1),cluster=i)))
df$cluster <- factor(df$cluster)
label.df <- data.frame(cluster=levels(df$cluster),label=paste0("cluster: ",levels(df$cluster)))
# One row per cluster: label anchor at (min x, max y), joined with the label text
label.df_2 <- df %>%
  group_by(cluster) %>%
  summarize(x = min(x), y = max(y)) %>%
  left_join(label.df, by = "cluster")
ggplot(df,aes(x=x,y=y,color=cluster))+geom_point()+theme_minimal()+theme(legend.position="none") +
  ggrepel::geom_label_repel(data = label.df_2, aes(label = label))
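In the code above the label layer only sees the five anchor rows, so as far as I understand ggrepel the labels repel each other and their own anchors but not the rest of the scatter. A possible variant, offered as a sketch rather than as part of the original answer: hand the label layer every point with an empty label (ggrepel does not draw empty labels but still treats those points as obstacles) plus one labelled row per cluster. The names centroids and repel.df, the use of the cluster mean as the anchor, and the max.overlaps / min.segment.length settings are my own assumptions; max.overlaps needs a reasonably recent ggrepel.

```r
library(dplyr)
library(ggplot2)
library(ggrepel)

# Same simulated data as in the question
set.seed(1)
df <- do.call(rbind, lapply(seq(1, 20, 4), function(i)
  data.frame(x = rnorm(50, mean = i, sd = 1),
             y = rnorm(50, mean = i, sd = 1),
             cluster = i)))
df$cluster <- factor(df$cluster)

# One labelled row per cluster, anchored at the cluster mean (assumed choice)
centroids <- df %>%
  group_by(cluster) %>%
  summarize(x = mean(x), y = mean(y), .groups = "drop") %>%
  mutate(label = paste0("cluster: ", cluster))

# Every point gets an empty label: nothing is drawn for it,
# but it still pushes the real labels away
repel.df <- bind_rows(mutate(df, label = ""), centroids)

p <- ggplot(df, aes(x = x, y = y, color = cluster)) +
  geom_point() +
  geom_label_repel(data = repel.df, aes(label = label),
                   max.overlaps = Inf,       # never drop a label for overlapping too much
                   min.segment.length = 0) + # always draw the connecting segment
  theme_minimal() +
  theme(legend.position = "none")
print(p)
```

Because color = cluster is inherited from the main ggplot() call, each label and its connecting segment take the color of their cluster. How far the labels end up from the points depends on ggrepel's force settings, so some tuning may be needed for real tSNE output.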