Adding text annotation to a clustering scatter plot (tSNE)


Question

I have XY data (a 2D tSNE embedding of high dimensional data) which I'd like to scatter plot. The data are assigned to several clusters, so I'd like to color code the points by cluster and then add a single label for each cluster, that has the same color coding as the clusters, and is located outside (as much as possible) from the cluster's points.

Any idea how to do this using R in either ggplot2 and ggrepel or plotly?

Here's the example data (the XY coordinates and cluster assignments are in df and the labels in label.df) and the ggplot2 part of it:

library(dplyr)
library(ggplot2)
set.seed(1)
df <- do.call(rbind,lapply(seq(1,20,4),function(i) data.frame(x=rnorm(50,mean=i,sd=1),y=rnorm(50,mean=i,sd=1),cluster=i)))
df$cluster <- factor(df$cluster)

label.df <- data.frame(cluster=levels(df$cluster),label=paste0("cluster: ",levels(df$cluster)))

ggplot(df,aes(x=x,y=y,color=cluster))+geom_point()+theme_minimal()+theme(legend.position="none")

Recommended answer

The geom_label_repel() function in the ggrepel package lets you add labels to a plot while "repelling" them so that they do not overlap each other or the points they label. The code below is a slight addition to your existing code: it summarizes the data to get the coordinates at which to place each label (here I chose the upper-left region of each cluster, i.e. the minimum of x and the maximum of y) and joins the result with your existing label.df, which contains the cluster labels. Pass this data frame to geom_label_repel() via data, and map the column that holds the label text to the label aesthetic in aes().

library(dplyr)
library(ggplot2)
library(ggrepel)

set.seed(1)
df <- do.call(rbind,lapply(seq(1,20,4),function(i) data.frame(x=rnorm(50,mean=i,sd=1),y=rnorm(50,mean=i,sd=1),cluster=i)))
df$cluster <- factor(df$cluster)

label.df <- data.frame(cluster=levels(df$cluster),label=paste0("cluster: ",levels(df$cluster)))
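# One row per cluster: anchor each label at the cluster's upper-left corner
label.df_2 <- df %>% 
  group_by(cluster) %>% 
  summarize(x = min(x), y = max(y)) %>% 
  left_join(label.df, by = "cluster")  # attach the label text for each cluster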

ggplot(df,aes(x=x,y=y,color=cluster))+geom_point()+theme_minimal()+theme(legend.position="none") +
  ggrepel::geom_label_repel(data = label.df_2, aes(label = label))
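
One thing to keep in mind with the approach above: geom_label_repel() only knows about the rows in the data it is given, so labels built from a five-row summary are not pushed away from the remaining points of each cluster. A possible variation, sketched here as an assumption rather than as part of the original answer, is to give every point an empty label and a real label to only one point per cluster (here, the point nearest the cluster mean). ggrepel does not draw empty labels, but those points still repel the non-empty ones. The label column and the max.overlaps setting (which needs ggrepel 0.9.0 or later) are choices made for this illustration.

```r
library(ggplot2)
library(ggrepel)

# Same example data as in the question
set.seed(1)
df <- do.call(rbind, lapply(seq(1, 20, 4), function(i)
  data.frame(x = rnorm(50, mean = i, sd = 1),
             y = rnorm(50, mean = i, sd = 1),
             cluster = i)))
df$cluster <- factor(df$cluster)

# Every point starts with an empty label (not drawn, but still repels labels)
df$label <- ""

# Label only the point closest to each cluster's mean position
for (cl in levels(df$cluster)) {
  idx <- which(df$cluster == cl)
  d2 <- (df$x[idx] - mean(df$x[idx]))^2 + (df$y[idx] - mean(df$y[idx]))^2
  df$label[idx[which.min(d2)]] <- paste0("cluster: ", cl)
}

p <- ggplot(df, aes(x = x, y = y, color = cluster)) +
  geom_point() +
  # max.overlaps = Inf keeps labels from being dropped in crowded regions;
  # min.segment.length = 0 always draws the connector back to the cluster
  geom_label_repel(aes(label = label), max.overlaps = Inf,
                   min.segment.length = 0) +
  theme_minimal() +
  theme(legend.position = "none")
p
```

Because the label layer inherits color = cluster from the main aes(), each label keeps its cluster's color, and the connecting segment shows which cluster it belongs to when the label ends up some distance away.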
