Controlling the order of points in ggplot2?

Problem description

I'm plotting a dense scatter plot in ggplot2 where each point may be given a different color according to its label:

library(ggplot2)

# 500 correlated points; row 50 is the single point of interest
df <- data.frame(x = rnorm(500))
df$y <- rnorm(500) * 0.1 + df$x
df$label <- "a"
df$label[50] <- "point"
df$size <- 2

ggplot(df) + geom_point(aes(x = x, y = y, color = label, size = size))

When I do this, the scatter point labeled "point" (green) is plotted on top of the red points which have the label "a". What controls this z-ordering in ggplot, i.e. what controls which point is drawn on top of which?

For example, what if I wanted all the "a" points to be on top of all the points labeled "point" (meaning they would sometimes partially or fully hide that point)? Does this depend on the alphanumerical ordering of the labels?

I'd like to find a solution that can be translated easily to rpy2.

Recommended answer

ggplot2 builds a plot layer by layer, and within each layer the plotting order is defined by the geom type. The default is to draw observations in the order in which they appear in the data, so later rows are drawn over earlier ones.
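Because rows are drawn in data order by default, one way to get the "a" points on top is to reorder the data frame so that the "point" rows come first. The following is a minimal sketch under that assumption; the name df_a_on_top is introduced here purely for illustration and is not part of the original answer.

```r
library(ggplot2)

set.seed(1234)
df <- data.frame(x = rnorm(500))
df$y <- rnorm(500) * 0.1 + df$x
df$label <- "a"
df$label[50] <- "point"

# Sort so that rows labeled "point" come first (FALSE sorts before TRUE).
# geom_point draws rows in data order, so the "a" rows are drawn afterwards
# and therefore end up on top.
df_a_on_top <- df[order(df$label != "point"), ]

p <- ggplot(df_a_on_top, aes(x = x, y = y, color = label)) +
  geom_point(size = 2)
```

Reversing the sort key (order(df$label == "point")) would move the "point" rows to the end instead, so that they are drawn last. Note that this approach relies on row order, which is exactly what the quote from the package author further down argues against.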

Where a geom behaves differently, the documentation says so. For example:

geom_line
Connect observations, ordered by x value.

geom_path
Connect observations in data order.
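The difference between the two geoms can be checked on a tiny data frame whose rows are deliberately not sorted by x. This is only an illustrative sketch: the data frame d is made up, and layer_data() is used to inspect the data each layer actually draws.

```r
library(ggplot2)

# Three observations, intentionally out of x order
d <- data.frame(x = c(3, 1, 2), y = c(1, 2, 3))

p_line <- ggplot(d, aes(x = x, y = y)) + geom_line()
p_path <- ggplot(d, aes(x = x, y = y)) + geom_path()

# Data as seen by each layer after ggplot2 has prepared it for drawing
x_line <- layer_data(p_line)$x  # geom_line reorders by x
x_path <- layer_data(p_path)$x  # geom_path keeps the data order
```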


There are also known issues regarding the ordering of factors, and it is interesting to note the response of the package author, Hadley:

The display of a plot should be invariant to the order of the data frame - anything else is a bug.


With this quote in mind, layers are drawn in the order in which they are specified, so overplotting can be an issue, especially in dense scatter plots. If you want a consistent plot (one that does not rely on the row order of the data frame), you need to think a bit more.

If you want certain values to appear above other values, you can use the subset argument to create a second layer, which is guaranteed to be drawn after the first. You will need to load the plyr package explicitly so that .() works.

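# Note: the subset argument only works in ggplot2 versions before 2.0.0
library(ggplot2)
library(plyr)  # provides .()

set.seed(1234)
df <- data.frame(x = rnorm(500))
df$y <- rnorm(500) * 0.1 + df$x
df$label <- "a"
df$label[50] <- "point"
df$size <- 2

# First layer: all points; second layer: only "point", drawn on top
ggplot(df) + geom_point(aes(x = x, y = y, color = label, size = size)) +
  geom_point(aes(x = x, y = y, color = label, size = size),
             subset = .(label == 'point'))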

As of ggplot2 2.0.0, the subset argument is deprecated. Instead, select the relevant rows yourself, e.g. with base::subset, and pass the result to the layer's data argument. There is then no need to load plyr:

ggplot(df) +
  geom_point(aes(x = x, y = y, color = label,  size = size)) +
  geom_point(data = subset(df, label == 'point'),
             aes(x = x, y = y, color = label, size = size))
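The same layering idea also covers the reverse case raised in the question, where all the "a" points should sit on top of "point". A minimal sketch, assuming the df built above (rebuilt here so the block is self-contained) and moving the constant size out of aes() for brevity:

```r
library(ggplot2)

set.seed(1234)
df <- data.frame(x = rnorm(500))
df$y <- rnorm(500) * 0.1 + df$x
df$label <- "a"
df$label[50] <- "point"

# Layers are drawn in the order they are added:
# first the single "point" row, then every "a" row over it.
p <- ggplot(mapping = aes(x = x, y = y, color = label)) +
  geom_point(data = subset(df, label == "point"), size = 2) +
  geom_point(data = subset(df, label == "a"), size = 2)
```

Since each layer receives its own data, the result does not depend on the row order of df, only on the order in which the layers are added.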


Or use alpha

Another way to mitigate overplotting is to set the alpha (transparency) of the points. This will not be as effective as the explicit second-layer approach above; however, with judicious use of scale_alpha_manual you should be able to get something that works.

For example:

# set alpha = 1 (no transparency) for your point(s) of interest
# and a low value otherwise
ggplot(df) +
  geom_point(aes(x = x, y = y, color = label, size = size, alpha = label)) +
  scale_alpha_manual(guide = 'none', values = c(a = 0.2, point = 1))
