Controlling order of points in ggplot2 in R?
Problem description
Suppose I'm plotting a dense scatter plot in ggplot2 in R where each point might be labeled by a different color:
    library(ggplot2)
    df <- data.frame(x=rnorm(500))
    df$y = rnorm(500)*0.1 + df$x
    df$label <- c("a")
    df$label[50] <- "point"
    df$size <- 2
    ggplot(df) + geom_point(aes(x=x, y=y, color=label, size=size))
When I do this, the scatter point labeled "point" (green) is plotted on top of the red points which have the label "a". What controls this z ordering in ggplot, i.e. what controls which point is on top of which? For example, what if I wanted all the "a" points to be on top of all the points labeled "point" (meaning they would sometimes partially or fully hide that point)? Does this depend on alphanumerical ordering of labels? I'd like to find a solution that can be translated easily to rpy2. thanks
Solution
ggplot2 will create plots layer by layer, and within each layer the plotting order is defined by the geom type. The default is to plot in the order that the rows appear in the data. Where this is different, it is noted. For example:
geom_line: Connect observations, ordered by x value.
and
geom_path: Connect observations in data order.
There are also known issues regarding the ordering of factors, and it is interesting to note the response of the package author, Hadley:
"The display of a plot should be invariant to the order of the data frame - anything else is a bug."
With this quote in mind, layers are drawn in the order in which they are specified, so overplotting can be an issue, especially when creating dense scatter plots. So if you want a consistent plot (and not one that relies on the order in the data frame), you need to think a bit more.
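If relying on row order is acceptable for your case, the default described above can be used directly: rows that come later in the data frame are drawn later by geom_point, so they end up on top. The following is only a minimal sketch of that idea applied to the question (all "a" points over the "point" row); the name df_a_on_top is made up for illustration.

```r
library(ggplot2)

set.seed(1234)
df <- data.frame(x = rnorm(500))
df$y <- rnorm(500) * 0.1 + df$x
df$label <- "a"
df$label[50] <- "point"

# Sort so that the rows that should be on top come LAST.
# order() on a logical puts FALSE ("point") before TRUE ("a"),
# so the "point" row is drawn first and every "a" row is drawn over it.
df_a_on_top <- df[order(df$label == "a"), ]

p <- ggplot(df_a_on_top, aes(x = x, y = y, color = label)) +
  geom_point(size = 2)
print(p)
```

Because this depends on the order of the data frame, it is exactly the kind of plot the quote above warns about; the explicit second layer described next is the more robust option.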
Create a second layer
If you want certain values to appear above other values, you can use the subset argument to create a second layer that is guaranteed to be drawn afterwards. You will need to explicitly load the plyr package so that .() will work.
    library(ggplot2)
    library(plyr)
    set.seed(1234)
    df <- data.frame(x=rnorm(500))
    df$y = rnorm(500)*0.1 + df$x
    df$label <- c("a")
    df$label[50] <- "point"
    df$size <- 2
    ggplot(df) +
      geom_point(aes(x = x, y = y, color = label, size = size)) +
      geom_point(aes(x = x, y = y, color = label, size = size),
                 subset = .(label == 'point'))
Update
In ggplot2_2.0.0, the subset argument is deprecated. Use e.g. base::subset to select the relevant data and pass it in the data argument. There is no need to load plyr:
    ggplot(df) +
      geom_point(aes(x = x, y = y, color = label, size = size)) +
      geom_point(data = subset(df, label == 'point'),
                 aes(x = x, y = y, color = label, size = size))
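The same layering idea also covers the reverse case asked about in the question, where all "a" points should sit on top of the one labeled "point". This is a rough sketch rather than the only way to do it: each group gets its own layer, and the layer added last is drawn last. The names bottom and top are made up for illustration.

```r
library(ggplot2)

set.seed(1234)
df <- data.frame(x = rnorm(500))
df$y <- rnorm(500) * 0.1 + df$x
df$label <- "a"
df$label[50] <- "point"

# One data frame per layer.
bottom <- subset(df, label == "point")  # drawn first, ends up underneath
top    <- subset(df, label == "a")      # drawn last, ends up on top

# Layers are drawn in the order they are added to the plot.
p <- ggplot(mapping = aes(x = x, y = y, color = label)) +
  geom_point(data = bottom, size = 2) +
  geom_point(data = top, size = 2)
print(p)
```

Swapping the two geom_point() calls puts the "point" row back on top, independent of where it sits in df.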
Or use alpha
Another approach to avoid the problem of overplotting would be to set the alpha (transparency) of the points. This will not be as effective as the explicit second layer approach above; however, with judicious use of scale_alpha_manual you should be able to get something to work, e.g.:
    # set alpha = 1 (no transparency) for your point(s) of interest
    # and a low value otherwise
    ggplot(df) +
      geom_point(aes(x=x, y=y, color=label, size=size, alpha = label)) +
      scale_alpha_manual(guide='none', values = c(a = 0.2, point = 1))