Effective way of plotting additional points to an existing plot


Question

In my case, there are 100 unique (X, Y) points, each of which has an ID and belongs to a Type. Of these 100 points, 20 have values for three other Types (CT, D, OP).

Here is the data generation process:

df <- data.frame(X=rnorm(100,0,1), Y=rnorm(100,0,1), 
                 ID=paste(rep("ID", 100), 1:100, sep="_"),
                 Type=rep("ID",100),
                 Val=c(rep(c('Type1','Type2'),30),
                       rep(c('Type3','Type4'),20)))

For each extra Type, 20 randomly selected points (sample(1:100,20)) get values that add extra information to those points. Every point that appears in an extra Type also has its original row with Type=="ID".

dat1 <- data.frame(Type=rep('CT',20),
                   Val=paste(rep("CT", 20), 
                             sample(1:6,20,replace=T), sep="_"))
dat1 <- cbind(df[sample(1:100,20),1:3],dat1)

dat2 <- data.frame(Type=rep('D',20),
                   Val=paste(rep("D", 20), 
                             sample(1:6,20,replace=T), sep="_"))
dat2 <- cbind(df[sample(1:100,20),1:3],dat2)

dat3 <- data.frame(Type=rep('OP',20),
                   Val=paste(rep("OP", 20), 
                             sample(1:6,20,replace=T), sep="_"))
dat3 <- cbind(df[sample(1:100,20),1:3],dat3)

df <- rbind(df, dat1, dat2, dat3)

Now, plot the points that have the values D_1 or D_4 for Type=="D":

df %>% filter(Val %in% c('D_1','D_4')) %>% 
  ggplot(aes(X,Y,col=Val)) + geom_point() + geom_text(aes(label=ID))

Note: I have added the IDs with geom_text(aes(label=ID)) only for illustration purposes.

To this existing plot, I have to add the remaining 92 points, which either do not have the above two values or have no values at all. I have tried the approach for adding additional points to an existing plot mentioned by Hadley here:

p <- df %>% filter(Val %in% c('D_1','D_4')) %>% ggplot(aes(X,Y,col=Val)) + geom_point() 

p + geom_point(data=df[(!df$ID %in% df$ID[df$Val %in% c('D_1','D_4')]) & df$Type=="ID",],
               colour="grey")

Questions:

  1. How can I plot the selected points and the additional points in a single command, or in the most elegant way possible?

  2. Is there a dplyr approach that can be used in the above command?

Update: df$Type=="ID" is very important, as it allows the remaining points to be plotted only once. Otherwise, points that also have values in CT, D or OP are plotted more than once.

df %>% count(X,Y) %>% arrange(desc(n))
# # A tibble: 100 x 3
#             X          Y     n
#         <dbl>      <dbl> <int>
# 1 -0.86266147  2.0368626     4
# 2 -0.61770678  0.4428537     4
# 3  1.32441957 -0.9388095     4
# 4 -1.65650319 -0.1551399     3
# 5 -0.99946809  1.1791395     3
# 6 -0.52881072  0.1742483     3
# 7 -0.25892382  0.1380577     3
# 8 -0.19239410  0.5269329     3
# 9 -0.09709764 -0.4855484     3
# 10 -0.05977874  0.1771422     3
# # ... with 90 more rows

It looks like each of the first three points has rows for all four Types (ID, CT, D, OP) at the same X, Y values. But these points need to be plotted only once.

Answer

Updated Answer

To address the first comment: Some points are plotted more than once because there are multiple rows in the data with the same X and Y coordinates. You can remove duplicate points using the code below. We first order the rows by Val so that the rows dropped as duplicates are Other points rather than D_1 or D_4 points (though if your real data contains cases where a D_1 and a D_4 point have the same X and Y coordinates, only the D_1 point will be plotted).

ggplot(df %>% 
         mutate(Val=fct_other(Val,keep=c("D_1","D_4"))) %>% 
         arrange(Val) %>% 
         filter(!duplicated(.[,c("X","Y")])), 
       aes(X,Y,col=Val, size=Val)) + 
  geom_point() +
  scale_colour_manual(values=c(D_1=hcl(15,100,65),D_4=hcl(195,100,65),Other="grey70")) +
  scale_size_manual(values=c(D_1=3, D_4=3, Other=1)) +
  theme_bw() 
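To see what the arrange() plus duplicated() step does, here is a minimal sketch on a made-up five-row data frame (the `toy` data and its values are assumptions for illustration, not part of the question's data). Sorting by the collapsed factor puts D_1 and D_4 rows first, so duplicated() flags only the later Other rows at the same location.

```r
library(dplyr)
library(forcats)

# Hypothetical toy data: (1, 1) occurs twice, (2, 2) occurs twice
toy <- data.frame(
  X   = c(1, 1, 2, 2, 3),
  Y   = c(1, 1, 2, 2, 3),
  Val = c("Type1", "D_1", "CT_2", "Type3", "D_4")
)

deduped <- toy %>%
  # collapse everything except D_1 and D_4 into "Other" (placed last)
  mutate(Val = fct_other(Val, keep = c("D_1", "D_4"))) %>%
  # factor order is D_1, D_4, Other, so highlighted rows come first
  arrange(Val) %>%
  # drop any later row that repeats an (X, Y) pair already seen
  filter(!duplicated(.[, c("X", "Y")]))

deduped
```

Each location survives exactly once, and a location that has a D_1 or D_4 row keeps that row rather than the grey one.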

If you want to plot all D_1 and D_4 points, even if they have the same X and Y coordinates, you could do this:

df %>% 
   mutate(Val=fct_other(Val,keep=c("D_1","D_4"))) %>% 
   arrange(X, Y, Val) %>% 
   # keep the first row at each location (X or Y changed), plus every D_1/D_4 row
   filter(c(1,diff(X)) != 0 | c(1, diff(Y)) != 0 | Val != 'Other')

Then you could use different point marker sizes to ensure that overplotted D_1 and D_4 points are both visible.
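The same rule (keep every D_1 and D_4 row, and keep one grey row only where a location has no highlighted row) can also be written with group_by(), which avoids relying on row order. This is an alternative sketch on made-up data, not the code from the answer; the `toy` data frame is an assumption for illustration.

```r
library(dplyr)
library(forcats)

# Hypothetical toy data: D_1 and D_4 share the location (1, 5)
toy <- data.frame(
  X   = c(1, 1, 1, 2, 2),
  Y   = c(5, 5, 5, 6, 6),
  Val = c("Type1", "D_1", "D_4", "Type2", "OP_3")
)

kept <- toy %>%
  mutate(Val = fct_other(Val, keep = c("D_1", "D_4"))) %>%
  group_by(X, Y) %>%
  # keep all highlighted rows; keep the first "Other" row only when
  # the whole location consists of "Other" rows
  filter(Val != "Other" | (all(Val == "Other") & row_number() == 1)) %>%
  ungroup()

kept
```

Both highlighted rows at (1, 5) are retained, so they can then be told apart with different marker sizes.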

Original Answer

What about collapsing all the other levels of Val like this:

library(tidyverse)
library(forcats)

ggplot(df %>% mutate(Val=fct_other(Val,keep=c("D_1","D_4"))), aes(X,Y,col=Val)) + 
  geom_point() +
  scale_colour_manual(values=c(D_1=hcl(15,100,65),D_4=hcl(195,100,65),Other="grey70")) +
  theme_bw()

You could also use size to make the desired points stand out more. For this particular data set, this approach also ensures that we can see a couple of D_1 and D_4 points that were hidden behind grey points in the previous plot.

ggplot(df %>% mutate(Val=fct_other(Val,keep=c("D_1","D_4"))), aes(X,Y,col=Val, size=Val)) + 
  geom_point() +
  scale_colour_manual(values=c(D_1=hcl(15,100,65),D_4=hcl(195,100,65),Other="grey70")) +
  scale_size_manual(values=c(D_1=3, D_4=3, Other=1)) +
  theme_bw()
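If you would rather keep two separate layers, as in the question, a layer's data argument in ggplot2 can be a function that is applied to the plot data, so the whole plot can still be built in one command with dplyr filters. The following is a sketch under assumptions: the `toy` data frame and the `highlight` vector are invented for illustration.

```r
library(dplyr)
library(ggplot2)

set.seed(1)
# Hypothetical toy data standing in for the question's df
toy <- data.frame(
  X   = rnorm(6),
  Y   = rnorm(6),
  Val = c("D_1", "D_4", "Type1", "Type2", "CT_3", "D_1")
)

highlight <- c("D_1", "D_4")

p <- ggplot(toy, aes(X, Y)) +
  # grey background layer: every row that is not highlighted
  geom_point(data = function(d) filter(d, !Val %in% highlight),
             colour = "grey70") +
  # highlighted layer, drawn last so it sits on top of the grey points
  geom_point(data = function(d) filter(d, Val %in% highlight),
             aes(colour = Val), size = 3) +
  theme_bw()

p
```

Because the highlighted layer is added last, its points are drawn over the grey ones instead of being hidden behind them. Duplicate locations still need one of the de-duplication steps above if each point must be drawn only once.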
