Issue when passing variable with dollar sign notation ($) to aes() in combination with facet_grid() or facet_wrap()


Problem description

I am doing some analysis in ggplot2 at the moment for a project and by chance I stumbled across some (for me) weird behavior that I cannot explain. When I write aes(x = cyl, ...) the plot looks different to what it does if I pass the same variable using aes(x = mtcars$cyl, ...). When I remove facet_grid(am ~ .) both graphs are the same again. The code below is modeled after the code in my project that generates the same behavior:

library(dplyr)
library(ggplot2)

data = mtcars

test.data = data %>%
  select(-hp)


ggplot(test.data, aes(x = test.data$cyl, y = mpg)) +
  geom_point() + 
  facet_grid(am ~ .) +
  labs(title="graph 1 - dollar sign notation")

ggplot(test.data, aes(x = cyl, y = mpg)) +
  geom_point()+ 
  facet_grid(am ~ .) +
  labs(title="graph 2 - no dollar sign notation")

(The original post showed images of graph 1 and graph 2 here.)

I found that I can work around this problem using aes_string instead of aes and passing the variable names as strings, but I would like to understand why ggplot is behaving that way. The problem also occurs in similar attempts with facet_wrap.

Thx a lot for any help in advance! I feel very uncomfortable if I do not understand that properly...

Recommended answer

tl;dr

Never use [ or $ inside aes().

Consider this illustrative example, where the faceting variable f is purposely in a non-obvious order with respect to x:

d <- data.frame(x=1:10, f=rev(letters[gl(2,5)]))

Now contrast what happens with these two plots,

p1 <- ggplot(d) +
  facet_grid(.~f, labeller = label_both) +
  geom_text(aes(x, y=0, label=x, colour=f)) +
  ggtitle("good mapping") 

p2 <- ggplot(d) +
  facet_grid(.~f, labeller = label_both) +
  geom_text(aes(d$x, y=0, label=x, colour=f)) +
  ggtitle("$ corruption") 

We can get a better idea of what's happening by looking at the data.frame created internally by ggplot2 for each panel,

 ggplot_build(p1)[["data"]][[1]][,c("x","PANEL")]

    x PANEL
1   6     1
2   7     1
3   8     1
4   9     1
5  10     1
6   1     2
7   2     2
8   3     2
9   4     2
10  5     2

 ggplot_build(p2)[["data"]][[1]][,c("x", "PANEL")]

    x PANEL
1   1     1
2   2     1
3   3     1
4   4     1
5   5     1
6   6     2
7   7     2
8   8     2
9   9     2
10 10     2

The second plot has the wrong mapping, because when ggplot creates a data.frame for each panel, it picks x values in the "wrong" order.
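One way to check for this kind of mismatch without reading the tables by eye is to use the label aesthetic as a reference: it is mapped from the bare column x in both plots, so each row's x should equal its label when the mapping is intact. The following is only a rough sketch; x_matches_label, p_bare and p_dollar are hypothetical names introduced here, and the result for the $ version may depend on the installed ggplot2 version.

```r
library(ggplot2)

d <- data.frame(x = 1:10, f = rev(letters[gl(2, 5)]))

# Hypothetical helper (not part of ggplot2): TRUE when, in the data built
# for the first layer, every row's x position equals its label.
x_matches_label <- function(p) {
  built <- ggplot_build(p)[["data"]][[1]]
  all(built$x == built$label)
}

# x given as a bare column name
p_bare <- ggplot(d) +
  facet_grid(. ~ f) +
  geom_text(aes(x, y = 0, label = x))

# x given with dollar sign notation
p_dollar <- ggplot(d) +
  facet_grid(. ~ f) +
  geom_text(aes(d$x, y = 0, label = x))

x_matches_label(p_bare)
x_matches_label(p_dollar)
```

If the second call returns FALSE, the x positions and the labels have been taken from different row orders, which is the corruption shown in the tables above.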

This occurs because the use of $ breaks the link between the various variables to be mapped (ggplot must assume it's an independent variable, which for all it knows could come from an arbitrary, disconnected source). Since the data.frame in this example is not ordered according to the factor f, the subset data.frames used internally for each panel assume the wrong order.
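The broken link can be imitated in base R without ggplot2. The sketch below is a simplified model, not ggplot2's actual internals: it assumes the per-panel data is just d with rows regrouped by f (panel_data is a name made up for this illustration) and then evaluates x both ways.

```r
d <- data.frame(x = 1:10, f = rev(letters[gl(2, 5)]))

# Assumed stand-in for the per-panel processing: rows regrouped by the
# faceting variable, so the rows with f == "a" come first.
panel_data <- d[order(d$f), ]

# Bare name: x is looked up inside the regrouped data, so it travels with f.
x_bare <- eval(quote(x), panel_data)

# Dollar notation: d is not a column of panel_data, so it is found in the
# calling environment and the original, un-regrouped column is returned.
x_dollar <- eval(quote(d$x), panel_data)

data.frame(f = panel_data$f, x_bare = x_bare, x_dollar = x_dollar)
```

In this model x_bare follows the regrouping (6 to 10 next to "a"), while x_dollar stays in its original order (1 to 5 next to "a"), which matches the two tables produced by ggplot_build() above.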
