Subset/filter in dplyr chain with ggplot2

Question

I'd like to make a slopegraph, along the lines (no pun intended) of this. Ideally, I'd like to do it all in a dplyr-style chain, but I hit a snag when I try to subset the data to add specific geom_text labels. Here's a toy example:

library(tidyverse)

# make tbl:

df <- tibble(
  area = rep(c("Health", "Education"), 6),
  sub_area = rep(c("Staff", "Projects", "Activities"), 4),
  year = c(rep(2016, 6), rep(2017, 6)),
  value = rep(c(15000, 12000, 18000), 4)
) %>% arrange(area)


# plot: 

df %>% filter(area == "Health") %>% 
  ggplot() + 
  geom_line(aes(x = as.factor(year), y = value, 
            group = sub_area, color = sub_area), size = 2) + 
  geom_point(aes(x = as.factor(year), y = value, 
            group = sub_area, color = sub_area), size = 2) +
  theme_minimal(base_size = 18) + 
  geom_text(data = dplyr::filter(., year == 2016 & sub_area == "Activities"), 
  aes(x = as.factor(year), y = value, 
  color = sub_area, label = area), size = 6, hjust = 1)

But this gives me Error in filter_(.data, .dots = lazyeval::lazy_dots(...)) : object '.' not found. Using subset instead of dplyr::filter gives me a similar error. What I've found on SO/Google is this question, which addresses a slightly different problem.

What is the correct way to subset the data in a chain like this?

Edit: My reprex is a simplified example; in the real work I have one long chain. Mike's comment below works for the first case, but not the second.

Recommended answer

If you wrap the plotting code in {...}, you can use . to specify exactly where the previously calculated results are inserted:

library(tidyverse)

df <- tibble(
  area = rep(c("Health", "Education"), 6),
  sub_area = rep(c("Staff", "Projects", "Activities"), 4),
  year = c(rep(2016, 6), rep(2017, 6)),
  value = rep(c(15000, 12000, 18000), 4)
) %>% arrange(area)

df %>% filter(area == "Health") %>% {
    ggplot(.) +    # add . to specify to insert results here
        geom_line(aes(x = as.factor(year), y = value, 
                      group = sub_area, color = sub_area), size = 2) + 
        geom_point(aes(x = as.factor(year), y = value, 
                       group = sub_area, color = sub_area), size = 2) +
        theme_minimal(base_size = 18) + 
        geom_text(data = dplyr::filter(., year == 2016 & sub_area == "Activities"),    # and here
                  aes(x = as.factor(year), y = value, 
                      color = sub_area, label = area), size = 6, hjust = 1)
}

While that plot is probably not what you really want, at least it runs, so you can edit it.

What's happening: Normally %>% passes the result of the left-hand side (LHS) as the first argument of the right-hand side (RHS). In the original code, %>% binds more tightly than +, so only ggplot() receives the piped data, and the . inside geom_text() is evaluated outside the pipe, where no such object exists. However, if you wrap the RHS in braces, %>% does not insert the result automatically; it is available only where you explicitly write the . placeholder. This formulation is useful for nested sub-pipelines or otherwise complicated calls (like a ggplot chain) that can't be sorted out just by redirecting with a single . argument. See help('%>%', 'magrittr') for more details and options.
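The difference between the two behaviours can be seen without any plotting. The following is a minimal sketch, assuming magrittr is installed; the toy data frame and the variable names (first_arg, in_braces, toy, counts) are made up for illustration and are not part of the original answer.

```r
library(magrittr)

# Default behaviour: the LHS becomes the first argument of the RHS call,
# so this is equivalent to sqrt(c(1, 4, 9)).
first_arg <- c(1, 4, 9) %>% sqrt()

# Braces: the LHS is not inserted automatically. It is only available as `.`,
# which can be used as many times as needed inside the block.
in_braces <- c(1, 4, 9) %>% {
  sum(.) / length(.)   # mean computed by hand, using `.` twice
}

# Same idea with a data frame: `.` is used once for the full data and once
# for a filtered subset, mirroring the plot/label split in the ggplot chain.
# (toy is a hypothetical data frame, not the df from the question.)
toy <- data.frame(year = c(2016, 2016, 2017), value = c(15000, 12000, 18000))
counts <- toy %>% {
  c(all_rows = nrow(.), rows_2016 = nrow(subset(., year == 2016)))
}

print(first_arg)
print(in_braces)
print(counts)
```

Inside the braces the piped value behaves like an ordinary variable named ., which is why it can feed both ggplot(.) and the data argument of geom_text() in the answer above.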
