Is behavior of geom_vline inconsistent with behavior of other ggplot geoms?

Problem description

It seems like geom_vline does not behave "properly" with colour aesthetics when compared with other ggplot geoms. I'm trying to figure out whether I'm misunderstanding something about geom_vline or whether this is an oversight in the design of geom_vline.

# Fake data for illustration
library(ggplot2)
dat=data.frame(x=rnorm(60), y=rep(LETTERS[1:3],20))

All of these work as expected:

# Density plot of x with vertical median line
ggplot(data=dat) + 
  geom_density(aes(x=x)) + 
  geom_vline(aes(xintercept=median(x)))

# Density plot of exp(x) with vertical median line
ggplot(data=dat) + 
  geom_density(aes(x=exp(x))) +
  geom_vline(aes(xintercept=median(exp(x))))

# Separate density plots of exp(x) for each level of y
ggplot(data=dat) + 
  geom_density(aes(x=exp(x), colour=y))

However, the plots below work differently. I expected the second geom_vline statement in the plots below to include a separate median line for each level of y. But in fact it just adds one line at the median of all values of x (as shown by the fact that it does the same thing as the first geom_vline statement).

# Separate density plots of x for each level of y
ggplot(data=dat) + 
  geom_density(aes(x=x, colour=y)) + 
  geom_vline(aes(xintercept=median(x)), lwd=4, colour="black") +
  geom_vline(aes(xintercept=median(x), colour=y), lwd=1)

# Density plot of x, faceted by level of y
ggplot(data=dat) + 
  geom_density(aes(x=x, colour=y)) + 
  geom_vline(aes(xintercept=median(x)), lwd=4, colour="black") +
  geom_vline(aes(xintercept=median(x), colour=y), lwd=1) + 
  facet_grid(. ~ y)

It seems like geom_vline is behaving differently than would be expected from the usual ggplot logic. For example, as shown above, I can pass a function of the data, exp(x), to geom_density and it returns separate density plots for each level of y when a colour aesthetic is included. In addition, as long as there's no colour aesthetic, I can pass a function of the data, exp(x) or median(exp(x)), to geom_vline and it also behaves as expected. But when I try to use a colour aesthetic or faceting with geom_vline, it fails to provide separate median lines for each level of the colour variable, instead adding a single line for the median over all of the x values.

I know I can pass pre-summarized data to geom_vline to get the behavior I want (in fact, answering this SO question is what raised the issues discussed here), but I'm trying to understand whether there's actually an inconsistency in the behavior of geom_vline relative to other ggplot geoms.

Am I missing something or is geom_vline really behaving differently than other ggplot geoms?

Solution

"But in fact it just adds one line at the median of all values of x."

Right, you're taking the median of all values of x, which is just one number. In other words, median(x) is evaluated on the whole dataset, not for each group. You can see this same behavior with a simpler plot that uses geom_point rather than geom_vline:

qplot(x, median(x), color=y, data=dat)
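The pre-summarized approach mentioned in the question can be sketched as follows. This is only an illustration, not part of the original answer: the names med and overall_median and the fixed seed are assumptions introduced here, and aggregate() is just one of several ways to compute a median per group.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the illustration is reproducible
dat <- data.frame(x = rnorm(60), y = rep(LETTERS[1:3], 20))

# median(x) written inside aes() is evaluated on the whole data frame,
# so it yields a single number regardless of the colour grouping
overall_median <- median(dat$x)

# One median per level of y, computed before plotting
# (med is a hypothetical name; it has columns y and x, one row per level)
med <- aggregate(x ~ y, data = dat, FUN = median)

# Pass the summarized data to geom_vline so each level gets its own line
p <- ggplot(data = dat) +
  geom_density(aes(x = x, colour = y)) +
  geom_vline(data = med, aes(xintercept = x, colour = y))

print(p)
```

Adding facet_grid(. ~ y) to p should place each median line in its own panel, because med carries the y column that the facets split on.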
